Prefetch Technologies // Keeping your cache lines cozy

Archive

Posts in Storage

Viewing SCSI mode page data

storage | Oct 24, 2007 | 1 min read

I came across the sdparm utility while surfing the web last weekend. This super useful utility can display and modify SCSI device parameters, and it is the best tool I've found for dumping SCSI mode and VPD pages. I wish sdparm had been around when I was reading through the SBC documentation on the T10 website. That would have been swell!


Monitoring the ZFS ARC cache

storage | Oct 21, 2007 | 1 min

The ZFS file system uses the adaptive replacement cache (ARC) to cache data in the kernel. Measuring ARC utilization is pretty straightforward, since ZFS populates a number of kstat values with usage data. Neelakanth Nadgir wrote a cool Perl script to summarize the ARC kstats, and a sample run is included below:

Since numerous discussions have come up on zfs-discuss regarding ARC sizing (the size of the ARC is controlled by the zfs:zfs_arc_max tunable), folks will find the "arcsz" column extremely useful. Nice!
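For a rough idea of what such a summary involves: `kstat -p zfs:0:arcstats` prints one `module:instance:name:statistic` key and its value per line, and a few lines of Python can boil that down. This is only a minimal sketch, not Neelakanth's script. The sample values are made up for illustration, and the statistic names used (`size`, `c_max`, `hits`, `misses`) are assumed to be present on your release.

```python
def parse_kstat(text):
    """Parse `kstat -p` style output into {statistic_name: int_value}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, value = parts
        name = key.rsplit(":", 1)[-1]  # keep only the statistic name
        try:
            stats[name] = int(value)
        except ValueError:
            pass  # ignore non-integer values such as class or crtime
    return stats


def summarize_arc(stats):
    """Return the ARC size and ceiling in MB plus the hit ratio in percent."""
    lookups = stats["hits"] + stats["misses"]
    return {
        "arcsz_mb": stats["size"] // (1024 * 1024),
        "max_mb": stats["c_max"] // (1024 * 1024),
        "hit_pct": 100.0 * stats["hits"] / lookups if lookups else 0.0,
    }


# Hypothetical sample, NOT real output from the host in this post.
SAMPLE = """\
zfs:0:arcstats:class\tmisc
zfs:0:arcstats:c_max\t1073741824
zfs:0:arcstats:hits\t900
zfs:0:arcstats:misses\t100
zfs:0:arcstats:size\t536870912
"""

summary = summarize_arc(parse_kstat(SAMPLE))
print(summary)
```

Sampling the output at an interval and subtracting consecutive `hits` and `misses` values would give a per-interval hit ratio rather than one since boot.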


Debugging fibre channel errors on Solaris hosts

storage | Sep 1, 2007 | 2 min

While reviewing the system logs on one of my SAN-attached servers last week, I noticed hundreds of entries similar to the following:

    Aug 28 13:10:14 foo scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
    Aug 28 13:10:14 foo /scsi_vhci/ssd@g600a0b80001fcb370000010646d3d207 (ssd21): Command Timeout on path /pci@9,600000/lpfc@1/fp@0,0 (fp3)
    Aug 28 13:10:14 foo scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g600a0b80001fcb370000010646d3d207 (ssd21):
    Aug 28 13:10:14 foo SCSI transport failed: reason 'timeout': retrying command

Since the errors were retryable, it looked like MPxIO was doing its job and retrying requests on one of the other paths. To see if all of the paths were up and operational (the host has four paths to disk), I ran the mpathadm utility with the "list" command and the "LU" option:

    /dev/rdsk/c2t600A0B80001FCB370000010446D3D0C1d0s2
        Total Path Count: 4
        Operational Path Count: 4
    /dev/rdsk/c2t600A0B8000216462000000C746D3DA48d0s2
        Total Path Count: 4
        Operational Path Count: 4
    /dev/rdsk/c2t600A0B8000216462000000CD46D3DC1Cd0s2
        Total Path Count: 4
        Operational Path Count: 4
    /dev/rdsk/c2t600A0B80001FCB370000010646D3D207d0s2
        Total Path Count: 4
        Operational Path Count: 4
    /dev/rdsk/c2t600A0B8000216462000000CA46D3DB88d0s2
        Total Path Count: 4
        Operational Path Count: 4

Since all of the paths were available, I started to wonder if a cable was faulty. After running fcinfo on each of the four HBA ports, I came across the following:

    HBA Port WWN: 10000000c94708f2
        OS Device Name: /dev/cfg/c6
        Manufacturer: Emulex
        Model: LP9002L
        Firmware Version: 3.93a0
        FCode/BIOS Version: 1.41a4
        Type: N-port
        State: online
        Supported Speeds: 1Gb 2Gb
        Current Speed: 2Gb
        Node WWN: 20000000c94708f2
        Link Error Statistics:
            Link Failure Count: 0
            Loss of Sync Count: 14
            Loss of Signal Count: 0
            Primitive Seq Protocol Error Count: 0
            Invalid Tx Word Count: 198724
            Invalid CRC Count: 63412

Bingo! The CRC errors were continuously increasing, so I knew that either the HBA or the fibre channel cable was faulty (as a side note, I can't wait for the FMA project to harden the emlxs and qlc drivers!)…
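Spotting counters that keep climbing means comparing two samples of the link error statistics. Here is a minimal Python sketch of that comparison. It assumes the fcinfo output has already been captured as text; the first sample uses the counters shown above, while the second sample's values are hypothetical and exist only to show a delta.

```python
import re

# Matches lines such as "Invalid CRC Count: 63412".
COUNTER_RE = re.compile(r"^\s*(.+? Count):\s*(\d+)\s*$")


def link_error_counters(fcinfo_text):
    """Extract the link error counters from captured fcinfo output."""
    counters = {}
    for line in fcinfo_text.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def increasing(before, after):
    """Return the counters that grew between two samples, with deltas."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


# Counters taken from the fcinfo output in this post.
FIRST = """\
Link Error Statistics:
        Link Failure Count: 0
        Loss of Sync Count: 14
        Loss of Signal Count: 0
        Primitive Seq Protocol Error Count: 0
        Invalid Tx Word Count: 198724
        Invalid CRC Count: 63412
"""

# Hypothetical later sample: these two values are invented for illustration.
SECOND = FIRST.replace("198724", "198900").replace("63412", "63500")

grew = increasing(link_error_counters(FIRST), link_error_counters(SECOND))
for name, delta in grew.items():
    print(f"{name}: +{delta}")
```

Counters that stay flat between samples (such as the loss of sync count here) are left out, which makes a port with a steadily growing CRC count stand out from ports that only logged a few historical errors.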


Repairing the Solaris /dev and /devices directories

storage | Aug 18, 2007 | 1 min

The /devices and /dev directories on one of my Solaris 9 hosts got majorly borked a few weeks back, and the trusty old `devfsadm -Cv' command wasn't able to fix the problem. To clean up the device tree, I booted from CDROM into single user mode and manually cleaned up the device hierarchy. Here is what I did to fix my problems (WARNING: This fixed my problem, but there is no guarantee that this will work for you. Please test changes similar to this on non-production systems prior to adjusting production systems.):

Step 1: Boot from CDROM into single user mode

Step 2: Mount the "/" partition to your favorite place (if your boot devices are mirrored, you will need to perform the following operations on each half of the mirror):

Step 3: Move the existing path_to_inst aside:

Step 4: Clean out the /devices and /dev directories:

Step 5: Replicate the /devices and /dev directories that were created during boot:

Step 6: Adjust the vfstab to reflect any device changes

Step 7: Boot with the "-a", "-s" and "-r" options to create a new path_to_inst (you can optionally use `devfsadm -C -r /a -p /a/etc/path_to_inst -v' to create the path_to_inst from single user mode), and to add device entries that weren't found while booted from single user mode

Step 8: Grab a soda and enjoy the fruits of your labor…


Expanding storage the ZFS way

storage | Aug 18, 2007 | 1 min

I had a mirrored ZFS pool fill up on me this week, which required me to add storage to ensure that my application kept functioning correctly. Since expanding storage is a trivial process with ZFS, I decided to increase the available pool storage by replacing the 36GB disks in the pool with 72GB disks. Here is the original configuration:

    Filesystem   size   used  avail capacity  Mounted on
    netbackup     33G    32G     1G    96%    /opt/openv

      pool: netbackup
     state: ONLINE
     scrub: none requested
    config:

            NAME        STATE     READ WRITE CKSUM
            netbackup   ONLINE       0     0     0
              mirror    ONLINE       0     0     0
                c1t2d0  ONLINE       0     0     0
                c1t3d0  ONLINE       0     0     0

    errors: No known data errors

To expand the available storage, I replaced the disk c1t2d0 with a 72GB disk, and then used the zpool "replace" option to replace the old disk with the new one:

Once the pool finished resilvering (you can run `zpool status -v' to monitor the progress), I replaced the disk c1t3d0 with a 72GB disk, and used the zpool "replace" option to replace the old disk with the new one:

Once the pool finished resilvering, I had an extra 36GB of disk space available:

    Filesystem   size   used  avail capacity  Mounted on
    netbackup     67G    32G    35G    47%    /opt/openv

This is pretty powerful, and it's nice not to have to run another utility to extend volumes and file systems once new storage is available. There is also the added benefit that ZFS resilvers at the object level, and not at the block level…
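The reason the extra space only shows up after the second replacement is that a mirror can never hold more than its smallest member. A minimal Python sketch of that arithmetic follows. It uses the nominal disk sizes from this post, so the figures will not match the df output exactly, and the function name is made up for illustration.

```python
def mirror_capacity_gb(disk_sizes_gb):
    """Usable size of a mirror is bounded by its smallest member disk."""
    return min(disk_sizes_gb)


# Nominal sizes of the two mirror members at each stage of the swap.
steps = [
    ("original", [36, 36]),
    ("after replacing c1t2d0", [72, 36]),
    ("after replacing c1t3d0", [72, 72]),
]

capacities = [(label, mirror_capacity_gb(sizes)) for label, sizes in steps]
for label, capacity in capacities:
    print(f"{label}: {capacity}GB usable")
```

The middle step is the interesting one: with one 72GB disk and one 36GB disk, the mirror is still limited to 36GB, so nothing changes until both halves have been swapped.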
