Getting to know the Solaris iSCSI stack
iSCSI is rapidly gaining traction in the storage networking world. There are several reasons for this. First, the iSCSI protocol was developed in an open forum, which helps to ensure that iSCSI solutions from different vendors will interoperate with each other. Second, end nodes in iSCSI networks operate on block storage, which allows iSCSI storage to be managed by existing storage tools (e.g., storage monitoring tools, volume managers, file systems, etc.). Third, customers can reduce infrastructure costs by deploying iSCSI, since iSCSI can work over Ethernet networks, which are typically less expensive than their fibre channel counterparts. And finally, since iSCSI uses the TCP/IP protocol, customers can use existing management frameworks to monitor and manage their storage infrastructure.
Based on this rapid growth, several of the large storage vendors have extended their storage offerings to include iSCSI support. One vendor, Sun Microsystems, enhanced its Solaris operating system to support iSCSI. Solaris 10 currently ships with an iSCSI software initiator, and recent builds of Nevada (Nevada is the development version of Solaris, and will eventually become Solaris 11) contain an iSCSI target implementation.
This article will provide an introduction to iSCSI, and will describe how to set up a Solaris 10 host to act as an iSCSI initiator (the endpoint responsible for initiating iSCSI requests -- i.e., the "client"), and a Nevada host to act as an iSCSI target (the endpoint responsible for receiving and processing iSCSI requests from one or more initiators -- i.e., the "server"). The article will also discuss some of the new iSCSI functionality that is available in recent builds of Nevada, to give storage administrators an idea of what is coming in future Solaris 10 updates.
iSCSI uses unique names to identify each target or initiator in a network element (a system capable of acting as an initiator or target). iSCSI names are unique to each node, and come in two formats: enterprise unique identifiers (EUI) and iSCSI qualified names (IQN). EUI names consist of 16 hexadecimal digits, and are prefixed by the string "eui." Here is an example of an iSCSI name in EUI format (an illustrative value):
eui.02004567A425678D
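IQN-formatted names are prefixed with the string "iqn." and contain a date string, the domain of a naming authority (written in reverse order), and a unique string to identify the node. Here is an example of an iSCSI name in IQN format (the name of the initiator used later in this article):
iqn.1986-03.com.sun:01:0003ba0e0795.4455571f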
The Solaris initiator can access targets that use EUI names, but defaults to assigning IQN names to each initiator and target (IQN names are assigned automatically when the iSCSI initiator and target are initialized). Since Solaris uses IQN names by default, the rest of this article will use this naming method. If you're interested in learning more about iSCSI naming conventions, please see the references for additional details.
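The two name formats described above can be told apart mechanically. The following Python sketch is a simplified illustration of that idea: the regular expressions are assumptions that approximate the formats as described in this article, not the complete grammar from the iSCSI RFC, and the function name is hypothetical.

```python
import re

# Simplified patterns for the two formats described in the text.
# They are an approximation, not the full grammar from the iSCSI RFC.
EUI_RE = re.compile(r"^eui\.[0-9A-Fa-f]{16}$")
IQN_RE = re.compile(r"^iqn\.(\d{4})-(\d{2})\.([a-z0-9][a-z0-9.-]*)(?::(.+))?$")


def classify_iscsi_name(name):
    """Return a dict describing the name, or None if it fits neither format."""
    if EUI_RE.match(name):
        # Everything after the "eui." prefix is the 16-digit identifier.
        return {"format": "eui", "identifier": name[4:]}

    match = IQN_RE.match(name)
    if match:
        year, month, authority, unique = match.groups()
        if not 1 <= int(month) <= 12:
            return None  # the date string must contain a real month
        return {
            "format": "iqn",
            "date": f"{year}-{month}",
            "authority": authority,  # naming authority's domain, reversed
            "unique": unique,        # node-specific string (may be None)
        }
    return None


if __name__ == "__main__":
    # The IQN below is the initiator name shown later in this article.
    print(classify_iscsi_name("iqn.1986-03.com.sun:01:0003ba0e0795.4455571f"))
```

A check like this can be handy for catching typos in names before they are pasted into a configuration command.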
The initiator and target in iSCSI contain one or more "portals." Each portal contains an IP address and port number, which initiators and targets use to determine the set of interfaces they can use to initiate or accept iSCSI connections. All connections between an iSCSI initiator portal and target portal are associated with a specific "session." iSCSI uses sessions to link logical connections together, and to ensure the ordered delivery of commands between initiators and targets. An initiator can create one or more sessions to a target, and each session can have one or more TCP connections associated with it. If a session contains more than one TCP connection, the session is referred to as a multiple connection session, or MC/S for short.
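The relationship between portals, connections, and sessions can be pictured as a small data model. The Python sketch below is only an illustration of the concepts in the paragraph above: the class and attribute names are hypothetical and do not correspond to any Solaris interface, and the addresses are the ones that appear in this article's examples.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Portal:
    """An IP address and port that can initiate or accept connections."""
    ip: str
    port: int = 3260  # port used by the target portal in this article


@dataclass
class Connection:
    """One TCP connection between an initiator portal and a target portal."""
    cid: int
    initiator_portal: Portal
    target_portal: Portal


@dataclass
class Session:
    """Groups the connections between one initiator and one target."""
    isid: str
    target_name: str
    connections: List[Connection] = field(default_factory=list)

    def add_connection(self, initiator_portal, target_portal):
        # Connection IDs are numbered in the order connections are added.
        conn = Connection(len(self.connections), initiator_portal, target_portal)
        self.connections.append(conn)
        return conn

    @property
    def is_mcs(self):
        """True when the session is a multiple connection session (MC/S)."""
        return len(self.connections) > 1


if __name__ == "__main__":
    session = Session(
        "4000002a0000",
        "iqn.1986-03.com.sun:02:21947caf-20ca-c035-c95c-dbb96a87cf89.tigger",
    )
    session.add_connection(Portal("192.168.1.3", 32772), Portal("192.168.1.13"))
    print(session.is_mcs)
```

With a single connection the session is an ordinary session; adding a second connection to the same session is what makes it MC/S.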
Solaris iSCSI software initiator configuration
Now that we have provided a brief introduction to the important iSCSI concepts, let's hop right into configuring a Solaris 10 host to act as an initiator. The Solaris 10 initiator is controlled by the Service Management Facility (SMF) iscsi_initiator service. This service is disabled by default, and will need to be enabled prior to configuring the initiator. To enable the iSCSI initiator service, the svcadm utility can be run with the "enable" command and the service name:
$ svcadm enable iscsi_initiator
After the iscsi_initiator service is enabled, the iscsiadm utility can be used to manage the configuration of the initiator. The iscsiadm utility takes a command as its first argument, a subcommand to indicate what the command applies to as the second argument, and allows one or more options to be passed to the subcommand. The list of available commands can be viewed by running iscsiadm without any arguments:
$ iscsiadm
Usage: iscsiadm -?,-V,--help
Usage: iscsiadm add [-?] <OBJECT> [-?] [<OPERAND>]
Usage: iscsiadm list [-?] <OBJECT> [-?] [<OPERAND>]
Usage: iscsiadm modify [-?] <OBJECT> [-?] [<OPERAND>]
Usage: iscsiadm remove [-?] <OBJECT> [-?] [<OPERAND>]
For more information, please see iscsiadm(1M)
One of the most useful commands is "list," which can be used to list the configured discovery methods, as well as the configuration of an initiator or target. To use the list command to view the qualified name of an initiator, iscsiadm can be run with the list command and initiator-node subcommand:
$ iscsiadm list initiator-node
Initiator node name: iqn.1986-03.com.sun:01:0003ba0e0795.4455571f
Initiator node alias: -
    Login Parameters (Default/Configured):
        Header Digest: NONE/-
        Data Digest: NONE/-
    Authentication Type: NONE
    RADIUS Server: NONE
    RADIUS access: unknown
    Configured Sessions: 1
The list initiator-node output displays the initiator's IQN, the parameters to use during session establishment, and the number of sessions that will be used between the initiator and the target. We will see how this information is used a little bit later in the article.
In order for the Solaris initiator to use a target, it needs to be configured with a discovery method -- a way of identifying targets on the network. Solaris supports three discovery methods: static discovery, SendTargets, and iSNS.
With static discovery, the initiator is manually configured with a list of targets, and the portals the targets are presented through. Static discovery can be enabled by running iscsiadm with the "modify" command, the "discovery" subcommand, the "--static" option, and the keyword "enable":
$ iscsiadm modify discovery --static enable
Once static discovery is enabled, iscsiadm can be used to add targets to the host. To add a target with the IQN iqn.1986-03.com.sun:02:21947caf-20ca-c035-c95c-dbb96a87cf89.tigger that is presented through the network portal 192.168.1.13:3260, the iscsiadm utility can be run with the "add" command, the "static-config" subcommand, the IQN of the target to add, and the IP address and optional port number of the portal that is presenting the target:
$ iscsiadm add static-config iqn.1986-03.com.sun:02:21947caf-20ca-c035-c95c-dbb96a87cf89.tigger,192.168.1.13:3260
The second discovery method Solaris supports is SendTargets. SendTargets allows one or more network portals to be configured on an initiator, and the initiator will query these portals during a discovery session to locate targets that have been presented to it. To enable the SendTargets discovery method, iscsiadm can be run with the "modify" command, the "discovery" subcommand, the "--sendtargets" option, and the keyword "enable":
$ iscsiadm modify discovery --sendtargets enable
After SendTargets discovery is enabled, iscsiadm can be run with the "add" command, the "discovery-address" subcommand, and the IP addresses and optional port number of the portal(s) to query:
$ iscsiadm add discovery-address 192.168.1.13:3260
The final discovery method supported by Solaris is iSNS. iSNS allows the initiator to be configured with the IP address and port of an iSNS server. During the discovery phase, the iSCSI initiator will query the configured iSNS server for the list of portals and targets that have been allocated to the initiator. To enable iSNS discovery, iscsiadm can be run with the "modify" command, the "discovery" subcommand, the "--isns" option, and the keyword "enable":
$ iscsiadm modify discovery --isns enable
After iSNS discovery is enabled, iscsiadm can be run with the "add" command, the "isns-server" subcommand, and the IP address and optional port of an iSNS server to use:
$ iscsiadm add isns-server 192.168.1.13:3205
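The three discovery methods follow the same two-step pattern: enable the method, then add the address (or target and address) to use. As a rough summary, the Python sketch below builds the command lines shown above for a chosen method. The function name and its parameters are hypothetical, and it only assembles strings; it does not run iscsiadm.

```python
def discovery_commands(method, address, target_iqn=None):
    """Return the iscsiadm command lines for one discovery method.

    method     -- "static", "sendtargets" or "isns"
    address    -- portal or iSNS server address, e.g. "192.168.1.13:3260"
    target_iqn -- target name, required only for static discovery
    """
    flags = {"static": "--static", "sendtargets": "--sendtargets", "isns": "--isns"}
    if method not in flags:
        raise ValueError(f"unknown discovery method: {method}")

    # Step 1: enable the discovery method.
    commands = [f"iscsiadm modify discovery {flags[method]} enable"]

    # Step 2: tell the initiator where to look.
    if method == "static":
        if not target_iqn:
            raise ValueError("static discovery needs the target IQN")
        commands.append(f"iscsiadm add static-config {target_iqn},{address}")
    elif method == "sendtargets":
        commands.append(f"iscsiadm add discovery-address {address}")
    else:
        commands.append(f"iscsiadm add isns-server {address}")
    return commands


if __name__ == "__main__":
    for line in discovery_commands("sendtargets", "192.168.1.13:3260"):
        print(line)
```

Generating the commands this way makes it easy to review them before running them by hand on the initiator.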
Once the initiator is configured with a valid discovery method, the initiator should see one or more targets (assuming targets have been made available to the initiator) when the iscsiadm utility is run with the "list" command, the "target" subcommand, and optionally the "-v" (verbose) flag and the "-S" flag, which lists the LUNs attached to each target:
$ iscsiadm list target -vS
Target: iqn.1986-03.com.sun:02:21947caf-20ca-c035-c95c-dbb96a87cf89.tigger
    Alias: tigger
    TPGT: 1
    ISID: 4000002a0000
    Connections: 1
        CID: 0
            IP address (Local): 192.168.1.3:32772
            IP address (Peer): 192.168.1.13:3260
            Discovery Method: SendTargets
            Login Parameters (Negotiated):
                Data Sequence In Order: yes
                Data PDU In Order: yes
                Default Time To Retain: 20
                Default Time To Wait: 2
                Error Recovery Level: 0
                First Burst Length: 65536
                Immediate Data: yes
                Initial Ready To Transfer (R2T): yes
                Max Burst Length: 262144
                Max Outstanding R2T: 1
                Max Receive Data Segment Length: 8192
                Max Connections: 1
                Header Digest: NONE
                Data Digest: NONE
    LUN: 1
        Vendor: SUN
        Product: SOLARIS
        OS Device Name: /dev/rdsk/c1t010000CBC18475E900002A00457C908Dd0s2
    LUN: 0
        Vendor: SUN
        Product: SOLARIS
        OS Device Name: /dev/rdsk/c1t010000CBC18475E900002A00457C908Ad0s2
In the list output, we can see that two LUNs, LUN 0 and LUN 1, are presented through the target iqn.1986-03.com.sun:02:21947caf-20ca-c035-c95c-dbb96a87cf89.tigger. We can also see the list of parameters that were negotiated between the initiator and target, and the session ID (ISID) that is associated with the session. Each iSCSI device can be managed identically to local disk devices. The format utility can be used to identify and partition iSCSI devices, newfs or mkfs can be used to create a file system on a partition, and mount can be used to mount the file system for general purpose use. For additional information on using iscsiadm(1M), please see the Solaris manual page.
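When many LUNs are presented, it can be convenient to pull the LUN-to-device mapping out of the listing with a script. The Python sketch below shows one way to do that. It assumes the output has been captured as text with one field per line, as the utility prints it; the function name is hypothetical and the sample text is abbreviated from the listing above.

```python
import re

# Abbreviated sample of "iscsiadm list target -vS" output, one field per line.
SAMPLE = """\
Target: iqn.1986-03.com.sun:02:21947caf-20ca-c035-c95c-dbb96a87cf89.tigger
    Alias: tigger
    LUN: 1
        Vendor: SUN
        Product: SOLARIS
        OS Device Name: /dev/rdsk/c1t010000CBC18475E900002A00457C908Dd0s2
    LUN: 0
        Vendor: SUN
        Product: SOLARIS
        OS Device Name: /dev/rdsk/c1t010000CBC18475E900002A00457C908Ad0s2
"""


def parse_luns(output):
    """Map each LUN number to its OS device name (None if not listed)."""
    luns = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        match = re.match(r"LUN:\s*(\d+)$", line)
        if match:
            # A new LUN stanza starts; remember which LUN we are inside.
            current = int(match.group(1))
            luns[current] = None
        elif current is not None and line.startswith("OS Device Name:"):
            luns[current] = line.split(":", 1)[1].strip()
    return luns


if __name__ == "__main__":
    for lun, device in sorted(parse_luns(SAMPLE).items()):
        print(lun, device)
```

The resulting device names are the same raw device paths that format, newfs, and mount operate on.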
Solaris iSCSI target configuration
An iSCSI target implementation was integrated into recent builds of Nevada. The iSCSI target is managed by the Service Management Facility, and like the iSCSI initiator, is not enabled by default. To enable the iSCSI target, the svcadm utility can be run with the "enable" command and the target's SMF service name:
$ svcadm enable iscsitgt
The target is administered with the iscsitadm utility, which uses an expression syntax similar to iscsiadm. Commands are used to indicate the action to perform, subcommands control what the action is applied to, and one or more options can be passed to the subcommand. To view the list of commands, iscsitadm can be run without any options:
$ iscsitadm
Usage: iscsitadm -?,-V,--help
Usage: iscsitadm create [-?] <OBJECT> [-?] [<OPERAND>]
Usage: iscsitadm list [-?] <OBJECT> [-?] [<OPERAND>]
Usage: iscsitadm modify [-?] <OBJECT> [-?] [<OPERAND>]
Usage: iscsitadm delete [-?] <OBJECT> [-?] [<OPERAND>]
Usage: iscsitadm show [-?] <OBJECT> [-?] [<OPERAND>]
For more information, please see iscsitadm(1M)
To begin using the iSCSI target, a base directory needs to be created. This directory is used to persistently store the target and initiator configuration that is added through the iscsitadm utility. To create the base directory, iscsitadm can be run with the "modify" command, the "admin" subcommand, the "-d" option, and the directory to store the configuration:
$ iscsitadm modify admin -d /etc/iscsi
Each target will present one or more block devices to initiators, which will require the system acting as the target to have one or more free block devices available, or enough free space to store one or more files that act as the backing store. If block devices are used, they can take three forms:
- A single device, which is made available to the iSCSI target through an entry in /dev/dsk/ (e.g., /dev/dsk/c2t0d0)
- A Solaris Volume Manager meta device, which is made available to the iSCSI target through an entry in /dev/md/dsk/ (e.g., /dev/md/dsk/d100)
- A pseudo-volume in a ZFS pool, which is made available to the iSCSI target through an entry in /dev/zvol/dsk/<dataset name>/ (e.g., /dev/zvol/dsk/stripedpool/iscsivol000)
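The three forms of block device listed above can be recognized from the directory the device entry lives in. The short Python sketch below illustrates that mapping; the function name and the labels it returns are hypothetical, and anything outside the three directories is simply assumed to be a regular file used as a backing store.

```python
def backing_store_kind(path):
    """Classify a backing store by the path prefixes listed in the text."""
    # Check the more specific prefixes first.
    if path.startswith("/dev/zvol/dsk/"):
        return "zfs volume"
    if path.startswith("/dev/md/dsk/"):
        return "svm metadevice"
    if path.startswith("/dev/dsk/"):
        return "single device"
    # Assumption: anything else is a plain file acting as the backing store.
    return "file"


if __name__ == "__main__":
    for example in ("/dev/dsk/c2t0d0",
                    "/dev/md/dsk/d100",
                    "/dev/zvol/dsk/stripedpool/iscsivol000"):
        print(example, "->", backing_store_kind(example))
```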
ZFS provides end-to-end data protection, data compression, and the ability to automatically share out ZFS volumes as iSCSI targets, so we will use ZFS block devices (zvols for short) in our examples. To create a ZFS volume for use with iSCSI, a ZFS pool will need to be identified to back the volume. If a ZFS pool is not available, one can be created by first choosing a RAID protection level (ZFS supports RAID0, RAID1, RAIDZ, and RAIDZ2), and then invoking the zpool utility with the "create" option, the name of the pool to create, and the devices to add to the pool:
$ zpool create stripedpool c0d1 c1d0 c1d1
After the pool is created, the zfs utility can be used to create ZFS volumes. To create two zvols each 1GB in size, the zfs utility can be run with the "create" subcommand, the "-V" option to indicate that a volume should be created, the size of the volume, and the location in the pool to store the volume:
$ zfs create -V 1g stripedpool/iscsivol000
$ zfs create -V 1g stripedpool/iscsivol001
I could have also included the "-s" option to create a sparse volume. Sparse volumes will not be allocated any storage up front, but will grow to the specified size as data blocks in the volume are written. This allows storage to be oversubscribed, or in storage jargon, the storage can be "thinly provisioned."
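To make the idea of oversubscription concrete, the Python sketch below computes how far a set of sparse volumes overcommits a pool. The function name and the numbers in the example are hypothetical and are not taken from the systems used in this article.

```python
def oversubscription_ratio(volume_sizes_gb, pool_capacity_gb):
    """Return the sum of the advertised volume sizes divided by pool capacity.

    A result above 1.0 means the volumes promise more space than the pool
    can actually hold, which is only possible with sparse volumes.
    """
    if pool_capacity_gb <= 0:
        raise ValueError("pool capacity must be positive")
    return sum(volume_sizes_gb) / pool_capacity_gb


if __name__ == "__main__":
    # Hypothetical example: ten 1GB sparse volumes in a 4GB pool.
    ratio = oversubscription_ratio([1] * 10, 4)
    print(f"oversubscribed {ratio:.1f}x")
```

A thinly provisioned pool needs to be monitored, since writes will start to fail once the volumes collectively consume all of the real space in the pool.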
Once the volumes are created, they need to be exported to an initiator. This can be done with the ZFS shareiscsi property, or through the iscsitadm utility. To use iscsitadm, the command can be run with the "create" command, the "target" subcommand, the "-b" option followed by the block device to use, and a name to assign to the target (if the target specified already exists, the device is presented as a LUN behind that target). The following example creates two targets, and associates the zvols we created above with the new targets:
$ iscsitadm create target -b /dev/zvol/dsk/stripedpool/iscsivol000 tigger-tgt0
$ iscsitadm create target -b /dev/zvol/dsk/stripedpool/iscsivol001 tigger-tgt1
After the targets are created, iscsitadm's "list" command and "target" subcommand can be used to display the targets and their properties:
$ iscsitadm list target -v
Target: tigger
    iSCSI Name: iqn.1986-03.com.sun:02:21947caf-20ca-c035-c95c-dbb96a87cf89.tigger
    Connections: 0
    ACL list:
    TPGT list:
    LUN information:
        LUN: 0
            GUID: 0
            VID: SUN
            PID: SOLARIS
            Type: disk
            Size: 1.0G
            Backing store: /dev/zvol/dsk/stripedpool/iscsivol000
            Status: online
        LUN: 1
            GUID: 0
            VID: SUN
            PID: SOLARIS
            Type: disk
            Size: 1.0G
            Backing store: /dev/zvol/dsk/stripedpool/iscsivol001
            Status: online
To ensure that storage resources are accessed by authorized initiators, an ACL can be created on the target to limit which initiator IQNs can access the target, and the CHAP protocol can be configured to authenticate the initiator and target. To simplify the management of ACLs, each IQN can be assigned an alias. This allows a descriptive name to be assigned to each IQN, which makes ACLs easier to interpret and manage. To assign the alias "tigger" to the IQN "iqn.1986-03.com.sun:01:0003ba0e0795.4455571f", the iscsitadm utility can be run with the "create" command, the "initiator" subcommand, the "-n" option followed by the IQN, and the alias to associate with that IQN:
$ iscsitadm create initiator -n iqn.1986-03.com.sun:01:0003ba0e0795.4455571f tigger
After the alias is created, it can be added to a target's ACL list by running iscsitadm with the "modify" command, the target subcommand, the "-l" option, the alias to add, and the name of the target to modify:
$ iscsitadm modify target -l tigger tigger
To display the list of ACLs assigned to a target, iscsitadm can be run with the "list" command and target subcommand:
$ iscsitadm list target -v | egrep '(Target|ACL|Init)'
Target: tigger
    ACL list:
        Initiator: tigger
Once the targets and ACLs are set up, an initiator can be configured to use the targets and LUNs that have been allocated to it. The section on configuring the Solaris initiator describes how to set up a Solaris 10 initiator to use the LUNs we presented above.
iSCSI performance considerations
The Solaris iSCSI stack is built to perform, and can easily saturate multiple gigabit Ethernet links if configured correctly. When deploying high performance iSCSI solutions, there are several items that should be considered prior to choosing a network and storage architecture:
- Network infrastructure considerations:
  - Jumbo frames: Using Ethernet jumbo frames can improve throughput between initiators and targets, and can reduce CPU utilization since fewer Ethernet frames need to be transmitted.
  - Link aggregations: Link aggregation technologies such as 802.3ad and Cisco's EtherChannel can be used to aggregate multiple physical interfaces into one or more logical interfaces. This can often improve performance, since multiple links can be used to send and receive data.
  - Gigabit Ethernet: Gigabit Ethernet or comparable high speed network interconnect technologies should be used to improve throughput.
  - Dedicated storage networks: Isolating storage traffic onto its own network can improve performance and security, and will lessen the potential issues that come with using jumbo frames on public networks.
- Hardware considerations:
  - iSCSI TCP/IP offload engines (TOEs) and iSCSI HBAs: TOEs and iSCSI HBAs allow iSCSI, TCP, IP, and physical and data link processing to be offloaded to hardware specifically designed for this purpose.
  - Use enterprise class Ethernet adaptors: Enterprise grade adaptors typically have larger TX and RX ring buffers, and support hardware checksumming, segmentation offload, hardware packet classification, scatter gather, jumbo frames, advanced interrupt processing (e.g., MSI and MSI-X interrupts), and the latest high performance bus technologies (e.g., PCI-X and PCI Express).
- Operating system considerations:
  - Ensure that TCP/IP send and receive buffers are tuned for the workload, and the sliding window algorithms have been adjusted to optimize data throughput. The ttcp, netperf and iperf utilities can assist with this, and links to each tool are provided in the reference section.
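Two of the items above lend themselves to quick back-of-the-envelope arithmetic. The Python sketch below estimates how many frames are needed to move a payload at two MTU sizes, and computes the bandwidth-delay product, a common rule of thumb for sizing TCP buffers. Both functions are illustrations with hypothetical names; the 40-byte header figure assumes IPv4 and TCP headers without options, and the link speed and round-trip time are made-up example values.

```python
import math


def frames_needed(payload_bytes, mtu, header_bytes=40):
    """Estimate the frames needed to carry a payload at a given MTU.

    Assumes 20 bytes of IPv4 header and 20 bytes of TCP header per frame,
    with no IP or TCP options.
    """
    per_frame = mtu - header_bytes
    return math.ceil(payload_bytes / per_frame)


def bandwidth_delay_product(bits_per_second, rtt_ms):
    """Bytes that can be in flight on a link: bandwidth times round-trip time."""
    return bits_per_second * rtt_ms // 8000


if __name__ == "__main__":
    one_mib = 1024 * 1024
    print("1500-byte MTU:", frames_needed(one_mib, 1500), "frames")
    print("9000-byte MTU:", frames_needed(one_mib, 9000), "frames")
    # Hypothetical gigabit link with a 1 ms round-trip time.
    print("BDP:", bandwidth_delay_product(1_000_000_000, 1), "bytes")
```

Send and receive buffers that are smaller than the bandwidth-delay product can keep a connection from filling the link, which is why measuring with a tool such as iperf before and after tuning is worthwhile.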
For latency sensitive and throughput intensive workloads, the list of considerations above may not be enough to get your applications to perform adequately. Understanding the I/O patterns of your applications should be the first step taken in planning iSCSI storage infrastructure, and in some cases alternative storage interconnects (e.g., fibre channel) may be required. For additional information on performance and determining application I/O patterns, please see the references.
As iSCSI solutions penetrate further into the enterprise, the need to deploy highly available iSCSI solutions will become a necessity. Highly available iSCSI solutions can be deployed through the use of IP multipathing and link aggregation software (e.g., Solaris IPMP, 802.3ad link aggregation, etc.), storage multipathing software (e.g., Solaris Traffic Manager), as well as through the use of multiple sessions and multiple connections per session. The Sun blueprint "Using iSCSI multipathing in the Solaris 10 operating system" describes these topics in detail, and a reference to the blueprint is provided in the reference section.
When issues arise on iSCSI networks, it is important to have tools available to quickly troubleshoot and isolate the source of the problem. Currently the best noncommercial tool for debugging iSCSI problems is the open source protocol analysis tool Wireshark (formerly called Ethereal). Wireshark contains protocol dissectors for the iSCSI protocol, which can be useful for debugging network and performance problems. For debugging server side issues on Solaris hosts, DTrace and truss can be valuable for locating contention points, and the source of that contention. Another great outlet for debugging problems is the OpenSolaris storage list. The individuals who wrote the iSCSI software are members of this list, and are quick to answer questions pertaining to the Solaris iSCSI stack.
Future iSCSI work
The OpenSolaris storage community has been extremely busy over the past year, and the Solaris kernel developers are working to integrate more iSCSI functionality into Nevada, and eventually a Solaris update. The following features are being worked on to enhance debugging, simplify administration, and to increase availability:
- A DTrace provider for the iSCSI target
- A standalone iSNS server
- iSNS support in the iSCSI target
- Sun Cluster support for the iSCSI target
- Multiple connections per session support for the iSCSI target
This article touched on iSCSI, and showed how to use the Solaris initiator and target. Four areas that I touched on in less detail are iSCSI security, scalability, high availability, and performance tuning. These topics have received a fair amount of coverage in various storage communities, and the references section contains pointers to presentations and links on these topics. The initiator examples were tested on a Solaris 10 host running the 11/06 release, and the target examples were tested on a host running build 53 of Nevada. If you have questions or comments on the article, please feel free to e-mail the author.
The following references were used while writing this article:
- Observing I/O behavior with the DTraceToolkit
- iPerf network bandwidth tester
- iSCSI multipathing
- iSCSI RFC
- iSCSI security
- SCSI protocol specifications
Ryan would like to thank Adam Leventhal for taking the time to review this article. Ryan would also like to thank the Solaris kernel developers and the OpenSolaris storage community for their contributions to the Solaris storage stack.
* Originally published in the August '07 issue of SysAdmin Magazine