Booting a machine from a SAN isn’t exactly a new concept. Instead of having local hard drives in thousands of machines, each machine logs into the fabric and boots the O/S from a LUN exported over Fibre Channel on the SAN. This requires a little bit of configuration on the Fibre Channel HBA, but it has the advantage that you no longer have to deal with local disk failures.
iSCSI boot was incorporated into OpenSolaris Nevada build 104 on x86 platforms.
If you have a capable NIC, you can achieve the same “boot from SAN” results as with Fibre Channel, but without the additional cost of an expensive Fibre Channel SAN. Think of the possibilities here:
Implement a new AmberRoad Sun Storage 7000 series NAS device like the 7410, export hundreds of iSCSI targets (one for each of your machines), implement ZFS volumes on the backend, and leverage ZFS snapshots, clones, etc. for the iSCSI root file system volumes of your machines. Even if your “client” machine mounts a UFS root file system over iSCSI, the backend would be a ZFS volume.
Want to provision 1000 machines in a day? Build one box, ZFS snapshot/clone the volume, and create 1000 iSCSI targets. Now the only work comes in configuring the OpenSolaris iSNS server with initiator/target pairings. Instant O/S provisioning from a centrally managed location.
Implement two clustered Sun Storage 7410s, and now you have an HA solution for all of the O/Ses running in your datacenter.
This is some pretty cool technology. Now you only have to replace failed disks in one place instead of in thousands of machines, at a fraction of what it would cost to implement this on a Fibre Channel fabric! Once the kinks are worked out and this technology becomes stable, it could be the future of server provisioning and management.
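The snapshot/clone provisioning idea above can be sketched as a small dry-run generator. This is only an illustration under stated assumptions: the volume name tank/boot/golden, the snapshot name, and the host naming scheme are all hypothetical, and the script only prints standard zfs snapshot and zfs clone commands rather than running them. Exporting each clone as an iSCSI target is storage-platform specific and is deliberately left out.

```python
def clone_commands(golden_volume, snapshot_name, hosts):
    """Build zfs commands that snapshot one golden volume and clone it per host.

    All dataset and host names passed in are hypothetical examples.
    Nothing is executed; the commands are returned as strings.
    """
    snapshot = f"{golden_volume}@{snapshot_name}"
    # Place each clone next to the golden volume, e.g. tank/boot/<host>-root.
    parent = golden_volume.rsplit("/", 1)[0]
    commands = [f"zfs snapshot {snapshot}"]
    for host in hosts:
        commands.append(f"zfs clone {snapshot} {parent}/{host}-root")
    return commands


if __name__ == "__main__":
    hosts = [f"host{n:04d}" for n in range(1, 1001)]
    commands = clone_commands("tank/boot/golden", "v1", hosts)
    # Show only the first few commands; review before running any of them.
    for command in commands[:3]:
        print(command)
    print(f"... {len(commands)} commands in total")
```

Because clones share blocks with the snapshot, a run like this would mostly cost metadata on the storage side, which is what makes the "1000 machines in a day" scenario plausible.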
I manage a number of Brocade switches, and periodically they encounter hardware problems (bad SFPs, faulty cables, etc.). To ensure that I am notified when these problems occur, I like to configure the switches to log errors to a centralized syslog server. Configuring a Brocade switch to use syslog is as simple as running the syslogdIpAdd command with the IP address of the syslog server:
switch1:admin> syslogdIpAdd "192.168.23.138"
switch1:admin> syslogdIpAdd "192.168.23.139"
Once one or more syslog servers are configured, the syslogdIpShow command can be used to verify that the servers were added.
If everything went smoothly, you should see entries similar to the following on the syslog server:
Jul 15 11:16:14 switch1 0 0x102d7800 (tShell): Jul 15 10:29:57
Jul 15 11:16:14 switch1 INFO SYS-LOGCLRD, 4, Error log cleared
Jul 15 11:16:14 switch1
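Once the switches log to a central server, the entries can be filtered with a short script. The sketch below is a rough illustration that assumes lines shaped like the INFO entry above (timestamp, switch name, severity, message ID, a numeric field, and free text). The field names, especially level, are my own labels rather than Brocade's, and lines in any other shape are simply skipped.

```python
import re

# Pattern modeled on the sample line:
#   Jul 15 11:16:14 switch1 INFO SYS-LOGCLRD, 4, Error log cleared
# The group names are illustrative labels, not official field names.
LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<severity>[A-Z]+)\s(?P<msg_id>[\w-]+),\s*(?P<level>\d+),\s*(?P<text>.*)$"
)


def parse_switch_line(line):
    """Return a dict of fields for a matching syslog line, or None otherwise."""
    match = LINE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = "Jul 15 11:16:14 switch1 INFO SYS-LOGCLRD, 4, Error log cleared"
    entry = parse_switch_line(sample)
    print(entry["host"], entry["severity"], entry["msg_id"], entry["text"])
```

A wrapper could read the syslog file line by line and alert on any entry whose severity is not INFO.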
The Brocade Silkworm family of switches maintains a number of important performance and reliability counters for each switch port. Performance and reliability counters for specific switch ports can be viewed with the portErrShow, portShow and portStatsShow commands. To view system-wide throughput statistics, the portPerfShow command can be run with an optional interval:
switch1:admin> portPerfShow 5
     0     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15  Total
-------------------------------------------------------------------------------------------------------
     0     0   21m   28m   31m     0  8.4m     0   28m   21m   31m     0  8.4m     0     0     0   178m
     0     0   20m   29m   31m     0   10m     0   29m   20m   31m     0   10m     0     0     0   182m
     0     0   18m   36m   31m     0   14m     0   36m   18m   31m     0   14m     0     0     0   201m
     0     0   17m   34m   30m     0  7.0m     0   34m   17m   31m     0  7.0m     0     0     0   179m
The individual switch ports are listed above the dashes, and the numbers below the dashes contain the number of bytes transmitted and received per second. This is a super useful tool for getting a high-level overview of what your switch is doing.
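If you capture this output, picking out the busiest ports is easy to script. The following is a minimal sketch that assumes each sample line holds one value per port followed by a total, and that the k/m/g suffixes are decimal multipliers (an assumption on my part; the switch may use a different convention). The function names are made up for this example.

```python
# Assumed decimal multipliers for the rate suffixes.
MULTIPLIERS = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}


def parse_rate(field):
    """Convert a field such as '8.4m' or '0' into bytes per second."""
    field = field.strip().lower()
    if field and field[-1] in MULTIPLIERS:
        return int(round(float(field[:-1]) * MULTIPLIERS[field[-1]]))
    return int(round(float(field)))


def busiest_ports(sample_line, top=3):
    """Return up to `top` (port, rate) pairs with nonzero rates, busiest first."""
    rates = [parse_rate(field) for field in sample_line.split()]
    per_port = rates[:-1]  # the last column is the total
    ranked = sorted(enumerate(per_port), key=lambda item: item[1], reverse=True)
    return [(port, rate) for port, rate in ranked[:top] if rate > 0]


if __name__ == "__main__":
    line = "0 0 21m 28m 31m 0 8.4m 0 28m 21m 31m 0 8.4m 0 0 0 178m"
    for port, rate in busiest_ports(line):
        print(f"port {port}: {rate} bytes/sec")
```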
Brocade switches have become one of the most widely deployed components in Storage Area Networks (SANs). One thing that has led to Brocade’s success is their robust CLI, which allows you to view and modify almost every aspect of their switches. This includes zoning configurations, SNMP attributes, domain IDs, switch names, network addresses, etc. All of this configuration information is necessary for the switch to function properly, and should be periodically backed up to allow speedy recovery when disaster hits.
Each Brocade switch comes with the “configUpload” and “configDownload” commands to back up a switch configuration to a remote system, or to restore a configuration from a remote system. configUpload has two modes of operation: interactive mode and automatic mode. To use the interactive mode to upload a config from a switch named switch1 to an FTP server with the IP address 22.214.171.124, configUpload can be run to walk you through backing up the configuration:
Server Name or IP Address [host]: 126.96.36.199
User Name [user]: matty
File Name [config.txt]: switch1_config.txt
Protocol (RSHD or FTP) [rshd]: ftp
After the configuration is uploaded, you will have a text file with your switch’s configuration on the remote server:
$ ls -l sw*
-rw-r--r-- 1 matty other 7342 Jul 7 09:15 switch1_config.txt
To restore a configuration, you can use the configDownload command. Both of these commands allow the parameters to be passed as arguments, so they are ideal for automation (there is a backup script on the Brocade support site that can be used to automate configuration backups). In case others find it useful, I placed my Brocade cheat sheet on my website.
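Automated backups are only useful if someone notices when they stop arriving. As a rough sketch, the check below runs on the server that receives the uploads and flags switches whose config file is missing, empty, or older than a threshold. It assumes the files follow the switch1_config.txt naming used above. The directory path, the switch list, and the seven-day threshold are hypothetical.

```python
import os
import time


def stale_backups(backup_dir, switches, max_age_days=7, now=None):
    """Return the switches whose backup file is missing, empty, or too old."""
    now = time.time() if now is None else now
    stale = []
    for switch in switches:
        path = os.path.join(backup_dir, f"{switch}_config.txt")
        try:
            info = os.stat(path)
        except FileNotFoundError:
            stale.append(switch)
            continue
        too_old = now - info.st_mtime > max_age_days * 86400
        if info.st_size == 0 or too_old:
            stale.append(switch)
    return stale


if __name__ == "__main__":
    # Hypothetical upload directory and switch names.
    for switch in stale_backups("/export/home/matty", ["switch1", "switch2"]):
        print(f"WARNING: no recent config backup for {switch}")
```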
While perusing the Blackhat media archives, I came across Himanshu Dwivedi’s iSCSI security presentation. It is not only a great introduction to the concepts behind iSCSI, but it also discusses some of the weaknesses in the iSCSI protocol. This is a great read!
This past week I wanted to use the Solaris cfgadm utility to unconfigure a few LUNs. When I ran ‘cfgadm -al’, I noticed that the FC adaptors were not visible in the cfgadm output:
$ cfgadm -al
Ap_Id            Type       Receptacle   Occupant       Condition
c2               scsi-bus   connected    configured     unknown
c2::dsk/c2t0d0   CD-ROM     connected    configured     unknown
usb0/1           unknown    empty        unconfigured   ok
usb0/2           unknown    empty        unconfigured   ok
This seemed odd, since I could see the controllers in the vxdmpadm output:
$ vxdmpadm listctlr all
CTLR-NAME   ENCLR-TYPE     STATE      ENCLR-NAME
c1          Disk           ENABLED    Disk
c0          Disk           ENABLED    Disk
c4          EMC            ENABLED    EMC0
c3          EMC            ENABLED    EMC0
c4          EMC_CLARiiON   DISABLED   EMC_CLARiiON0
c3          EMC_CLARiiON   DISABLED   EMC_CLARiiON0
Since the controllers in question were Emulex adaptors, I read through the Emulex admin guide and found that the platform and FC adaptor had to be DR aware to support configure/unconfigure operations. Since I couldn’t locate a “DR aware” label in our vendor’s documentation, I decided to open a ticket to see if the servers supported cfgadm. After a week or two of chatting with support, our vendor indicated that we would need to use the Sun Leadville drivers to configure and unconfigure LUNs with Emulex adaptors in Solaris systems. This was awesome news, and I am super happy that the lpfc driver will now be installed in /kernel/drv by default! Niiiiiiice.
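The "are my FC adaptors visible to cfgadm?" check from earlier can be scripted against saved ‘cfgadm -al’ output. This is a minimal sketch, assuming the five-column layout shown above and assuming that FC attachment points report a Type of fc, fc-fabric, or fc-private. That set is my assumption, so adjust it to whatever your system prints. The function name is made up for this example.

```python
# Assumed Type values for Fibre Channel attachment points.
FC_TYPES = {"fc", "fc-fabric", "fc-private"}


def fc_attachment_points(cfgadm_output):
    """Return the Ap_Ids whose Type column looks like a Fibre Channel port."""
    points = []
    for line in cfgadm_output.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and the header row.
        if len(fields) < 5 or fields[0] == "Ap_Id":
            continue
        ap_id, ap_type = fields[0], fields[1]
        if ap_type in FC_TYPES:
            points.append(ap_id)
    return points


if __name__ == "__main__":
    output = """Ap_Id Type Receptacle Occupant Condition
c2 scsi-bus connected configured unknown
c2::dsk/c2t0d0 CD-ROM connected configured unknown
usb0/1 unknown empty unconfigured ok
usb0/2 unknown empty unconfigured ok"""
    found = fc_attachment_points(output)
    print(found if found else "no FC attachment points visible to cfgadm")
```

An empty result on a host that vxdmpadm shows with FC controllers would point at the same driver situation described above.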