Zoning Brocade switches: Putting it all together

I wanted to conclude my Brocade zoning posts by discussing a couple of best practices. Two issues I have seen in the real world are inconsistent and non-descriptive names, and a lack of configuration backups. Using descriptive names such as “Fabric1Switch1Port8” or “AppServer1Port1” makes the output quite a bit more readable, which is extremely helpful when you are trying to gauge the impact of a faulty initiator or SFP at 3am.

Backing up the configuration on a switch is super easy to do, and there are a number of tools available to automate this process (I have written pexpect scripts to do this). To perform a manual backup of a switch configuration, you can run the “configupload” utility:

Fabric1Switch1:admin> configupload
Server Name or IP Address [host]: 192.168.1.125
User Name [user]: matty
File Name [config.txt]: switch1config.txt
Protocol (RSHD or FTP) [rshd]: ftp
Password:
upload complete

This will prompt you for the IP of the server to write the configuration to, the user name to log in with, the name of the file to create, the transfer protocol (RSHD or FTP) and the password. I have thoroughly enjoyed working with Brocade switches over the past few years, and really enjoy the simplicity and power that they provide in their CLI. Nice!
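
Since the prompts are fixed, the manual session above is easy to script. Here is a minimal sketch of what a pexpect-based backup could look like. It is not the script I mentioned earlier, just an illustration under a few assumptions: the function names and the settings dictionary are made up for this example, the switch accepts key-based ssh logins for the admin account, and the prompt strings match the session shown above (other firmware releases may word them differently).

```python
import re

# Prompt patterns copied from the configupload session shown above.
# The keys on the right are hypothetical names for the answers we supply.
PROMPTS = [
    (r"Server Name or IP Address \[host\]: ", "host"),
    (r"User Name \[user\]: ", "user"),
    (r"File Name \[config\.txt\]: ", "filename"),
    (r"Protocol \(RSHD or FTP\) \[rshd\]: ", "protocol"),
    (r"Password:", "password"),
]


def build_dialogue(settings):
    """Return (prompt pattern, answer) pairs in the order configupload asks."""
    return [(pattern, settings[key]) for pattern, key in PROMPTS]


def backup_switch(switch, settings, login="admin"):
    """Run configupload on one switch over ssh.

    Needs the third-party pexpect module and assumes key-based ssh login,
    so no ssh password prompt is handled here.
    """
    import pexpect  # imported here so build_dialogue works without pexpect

    child = pexpect.spawn("ssh %s@%s" % (login, switch), timeout=60)
    child.expect("> ")  # wait for the switch prompt, e.g. Fabric1Switch1:admin>
    child.sendline("configupload")
    for pattern, answer in build_dialogue(settings):
        child.expect(pattern)
        child.sendline(answer)
    child.expect("upload complete")
    child.sendline("exit")
    child.close()
```

Calling backup_switch("Fabric1Switch1", settings) with a dictionary holding the host, user, filename, protocol and password values would then replay the same answers typed by hand above. Keeping the prompt table separate from the ssh code makes it easy to adjust if a prompt changes.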

Adjusting how often the Linux kernel checks for MCEs

I wrote about the Linux mcelog utility a few weeks back, and described how it can be used to monitor the /dev/mcelog device for machine check exceptions (MCEs). By default, the Linux kernel will check for MCEs every five minutes. The polling interval is defined in the sysfs check_interval entry, which holds the interval in seconds (printed here in hexadecimal) and which you can view with cat:

$ cat /sys/devices/system/machinecheck/machinecheck0/check_interval
12c

$ python
>>> print("%d" % 0x12c)
300

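To configure the host to use a shorter check interval, you can echo the desired value in seconds (as root) to the sysfs entry for processor 0. The value is written in decimal and read back in hexadecimal, and the machinecheck1 output shows that the other processors pick up the new interval as well: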

$ echo 60 > /sys/devices/system/machinecheck/machinecheck0/check_interval

$ cat /sys/devices/system/machinecheck/machinecheck0/check_interval
3c

$ cat /sys/devices/system/machinecheck/machinecheck1/check_interval
3c
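
If you would rather not convert the hex values by hand, a few lines of Python can walk every machinecheck entry and print the interval in seconds. This is a rough sketch: the function names are my own, and it assumes the kernel prints check_interval in hexadecimal as in the output above (other kernel versions may format the value differently, so check yours first).

```python
import glob
import os

SYSFS_BASE = "/sys/devices/system/machinecheck"


def parse_interval(text):
    """Convert the sysfs value (assumed to be hex, e.g. '12c') to seconds."""
    return int(text.strip(), 16)


def read_intervals(base=SYSFS_BASE):
    """Return {'machinecheck0': seconds, ...} for every entry under base."""
    intervals = {}
    pattern = os.path.join(base, "machinecheck*", "check_interval")
    for path in sorted(glob.glob(pattern)):
        cpu = os.path.basename(os.path.dirname(path))
        with open(path) as handle:
            intervals[cpu] = parse_interval(handle.read())
    return intervals
```

Calling read_intervals() on the host above should report 60 for both machinecheck0 and machinecheck1 after the echo.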

If you want additional information on check_interval, check out the machinecheck text file in the kernel documentation directory. If you are curious how the code actually detects an MCE, you can look through the source code in <KERNEL_SOURCE_ROOT>/arch/x86/kernel/cpu/mcheck.

Understanding the Linux /boot directory

When I first began using Linux quite some time ago, I remember thinking to myself "WTF is all this stuff in /boot?" There were files related to grub, a file called vmlinuz, and several ASCII text files with cool-sounding names. After reading through the Linux kernel HOWTO, the /boot directory layout all came together. Understanding the purpose of each file has helped me better understand how things work, and has allowed me to solve numerous issues more quickly. Given a typical CentOS or Fedora host, you will probably see something similar to the following in /boot:

$ cd /boot

$ tree

.
|-- System.map-2.6.29.5-191.fc11.x86_64
|-- System.map-2.6.30
|-- config-2.6.29.5-191.fc11.x86_64
|-- config-2.6.30
|-- efi
|   `-- EFI
|       `-- redhat
|           `-- grub.efi
|-- grub
|   |-- device.map
|   |-- e2fs_stage1_5
|   |-- fat_stage1_5
|   |-- ffs_stage1_5
|   |-- grub.conf
|   |-- iso9660_stage1_5
|   |-- jfs_stage1_5
|   |-- menu.lst -> ./grub.conf
|   |-- minix_stage1_5
|   |-- reiserfs_stage1_5
|   |-- splash.xpm.gz
|   |-- stage1
|   |-- stage2
|   |-- ufs2_stage1_5
|   |-- vstafs_stage1_5
|   `-- xfs_stage1_5
|-- initrd-2.6.29.5-191.fc11.x86_64.img
|-- initrd-2.6.30.img
|-- vmlinuz-2.6.29.5-191.fc11.x86_64
`-- vmlinuz-2.6.30



For each kernel release, you will typically see a vmlinuz, System.map, initrd and config file. The vmlinuz file contains the actual Linux kernel, which is loaded and executed by grub. The System.map file contains a list of kernel symbols and the addresses these symbols are located at. The initrd file is the initial ramdisk used to preload modules, and contains the drivers and supporting infrastructure (keyboard mappings, etc.) needed to manage your keyboard, serial devices and block storage early on in the boot process. The config file contains a list of kernel configuration options, which is useful for understanding which features were compiled into the kernel, and which features were built as modules. I am going to type up a separate post with my notes on grub, especially those related to solving boot-related issues.
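
To make the built-in versus module distinction concrete, here is a small sketch that sorts the options in a config file into the two groups. In these files an option set to "y" is compiled into the kernel and an option set to "m" is built as a module. The function name is my own invention, and the sample option names in the comment are only there for illustration.

```python
def classify_config(lines):
    """Split kernel config lines into (built-in options, module options)."""
    builtin, modules = [], []
    for line in lines:
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_JFS_FS is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if value == "y":
            builtin.append(name)
        elif value == "m":
            modules.append(name)
        # Anything else (numbers, strings) is a tunable, not a y/m choice.
    return builtin, modules


# Illustrative input: ["CONFIG_EXT3_FS=y", "CONFIG_XFS_FS=m"]
# gives (["CONFIG_EXT3_FS"], ["CONFIG_XFS_FS"]).
```

To try it against a real file, open one of the config files from the listing above (for example /boot/config-2.6.30) and pass the file object to classify_config.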

Listing file system lock files on Linux hosts

I mentioned in a previous post that I was using the Linux flock utility to ensure that only one copy of yum would run at any given point in time (well, theoretically someone could call yum from outside of the script, but there are only so many use cases you can protect against). The lock files that are created by flock reside on a file system, and can be viewed with the lslk utility:

$ lslk

SRC        PID DEV     INUM SZ TY M ST WH END LEN NAME
(unknown) 1536 8,1   927544     w 0  0  0   0   0 / (rootfs)
atd       1785 8,1   927573  5  w 0  0  0   0   0 /var/run/atd.pid
(unknown) 2034 8,1 14963655     w 0  0  0   0   0 / (rootfs)



If the file name doesn’t appear in the lslk output, you can use the find utility’s “-inum” option (find a file by its inode number) to locate the file using the inode number listed in the fourth column:

$ find . -inum 14963655
./lock
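
If you want to do the same lookup from a script, the find invocation above can be approximated in Python by walking a directory tree and comparing inode numbers. This is only a sketch with a made-up function name. Keep in mind that inode numbers are unique per file system, so the walk should start on the file system that the DEV column points at.

```python
import os


def find_by_inode(root, inode):
    """Return paths under root whose inode number matches inode."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames + dirnames:
            path = os.path.join(dirpath, name)
            try:
                # lstat avoids following symlinks, like find without -L.
                if os.lstat(path).st_ino == inode:
                    matches.append(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return matches
```

Running find_by_inode("/tmp", 14963655) on the host above would be expected to turn up the same lock file that find reported.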

If the process name doesn’t show up, you can use the lsof utility along with the name of the lock to see which process has the lock file open:

$ lsof | awk '$9 ~ /lock/ { print }'

test      2033      root  200w      REG                8,1        0   14963655 /tmp/lock
sleep     2035      root  200w      REG                8,1        0   14963655 /tmp/lock
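
If you want something to experiment with, a lock like the one on /tmp/lock can be created from Python with the fcntl module, which wraps the same flock system call that the flock utility uses. This is a minimal sketch and the function name is hypothetical. It takes an exclusive, non-blocking lock, so a second caller is turned away instead of waiting.

```python
import fcntl


def try_lock(path):
    """Return an open file holding an exclusive flock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail right away instead of blocking.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle
```

The lock lasts for as long as the returned file stays open, so hold on to the handle while the protected work runs and close it when you are done. While it is held, the lock should be visible to the lslk and lsof commands shown above.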



I have been using lslk off and on for years, and it’s a SUPER useful tool for debugging issues with file system lock files. Nice!