Monitoring pipe activity with pv

While perusing the catonmat blog, I came across a reference to the pv utility. pv lets you monitor the amount of data written to a pipe, which is incredibly useful for seeing how much data is moving between pipe endpoints. The following example shows this super useful tool in action:

$ dd if=/dev/zero | pv > foo
522MB 0:00:06 [ 109MB/s] [ <=> ]

When pv is added to the pipeline, you get a continuous display of the amount of data that is being transferred between two pipe endpoints. I really dig this utility, and I am stoked that I found the catonmat website! Niiiiiiiiice!

UPDATE:

Try using pv when sending stuff over the network using dd. Neato.

[root@machine2 ~]# ssh machine1 "dd if=/dev/VolGroup00/domU2migrate" | pv -s 8G -petr | dd of=/dev/xen02vg/domU2migrate
0:00:30 [11.2MB/s] [====> ] 4% ETA :10:13

Want to rate limit the transfer so you don’t flood the pipe?

-L RATE, --rate-limit RATE
Limit the transfer to a maximum of RATE bytes per second. A suffix of “k”, “m”, “g”, or “t” can be added to denote kilobytes (*1024), megabytes, and so on.

-B BYTES, --buffer-size BYTES
Use a transfer buffer size of BYTES bytes. A suffix of “k”, “m”, “g”, or “t” can be added to denote kilobytes (*1024), megabytes, and so on. The default buffer size is the block size of the input file’s filesystem multiplied by 32 (512kb max), or 400kb if the block size cannot be determined.
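
To get a feel for -L without involving a second machine, here is a minimal sketch that pushes 1 MiB of zeros through a rate-limited pv. It assumes pv is installed; the 1 MiB size and the 512k limit are arbitrary values picked for illustration, and -q just silences the progress display so the byte count is easy to read.

```shell
# Assumption: pv is installed. The sizes below are illustration values only.
if command -v pv >/dev/null 2>&1; then
    # -q suppresses the progress display, -L caps throughput at 512 KiB/s,
    # so moving 1 MiB should take roughly two seconds
    copied=$(head -c 1048576 /dev/zero | pv -q -L 512k | wc -c | tr -d ' ')
    echo "copied $copied bytes"
else
    echo "pv is not installed, skipping" >&2
fi
```

Drop the -q and redirect to a file instead of wc to watch the throttled rate in the normal progress display.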

Already have a transfer in progress and want to rate limit it without restarting?

-R PID, --remote PID
If PID is an instance of pv that is already running, -R PID will cause that instance to act as though it had been given this instance’s command line instead. For example, if pv -L 123k is running with process ID 9876, then running pv -R 9876 -L 321k will cause it to start using a rate limit of 321k instead of 123k. Note that some options cannot be changed while running, such as -c, -l, and -f.

Bash tips

I read through the bash tips on the hacktux website, which brought to light the fact that you can do basic integer math in your bash scripts. This is easily accomplished by using double parentheses, similar to this:

four=$(( 2 + 2 ))
echo $four

This is good stuff, and I need to replace some old `bc …` and `expr ..` statements with this.
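
As a rough before-and-after sketch of that cleanup (the variable names are made up for illustration), here is the same increment done with expr and with arithmetic expansion. Keep in mind that $(( )) is integer-only, so division truncates and bc is still the tool for anything with a decimal point.

```shell
count=7

# Old style: forks an external expr process for a single addition
next_old=$(expr "$count" + 1)

# Arithmetic expansion: handled by the shell itself, no fork
next_new=$(( count + 1 ))

# Integer math only: 7 / 2 truncates to 3
half=$(( count / 2 ))

echo "$next_old $next_new $half"
```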

Tracing block I/O operations on Linux hosts with blktrace

I’ve been spending a bunch of time with Linux lately, and have found some really nifty tools to help me better manage the systems I support. One of these tools is blktrace, which allows you to view block I/O operations to the devices connected to your system. To see just how useful this utility is, I present the following example:

disarm:~# blktrace -d /dev/sda -o - | blkparse -i -

  8,0    0        1     0.000000000   783  A   W 36744519 + 8 <- (8,1) 36744456
  8,0    0        2     0.000000472   783  Q   W 36744519 + 8 [kjournald]
  8,0    0        3     0.000012413   783  G   W 36744519 + 8 [kjournald]
  8,0    0        4     0.000016614   783  P   N [kjournald]
  8,0    0        5     0.000018272   783  I   W 36744519 + 8 [kjournald]
  8,0    0        6     0.000026492   783  A   W 36744527 + 8 <- (8,1) 36744464
  8,0    0        7     0.000026784   783  Q   W 36744527 + 8 [kjournald]
  8,0    0        8     0.000028877   783  M   W 36744527 + 8 [kjournald]
  8,0    0        9     0.000031539   783  A   W 36744535 + 8 <- (8,1) 36744472
  8,0    0       10     0.000031752   783  Q   W 36744535 + 8 [kjournald]
  8,0    0       11     0.000032234   783  M   W 36744535 + 8 [kjournald]
  8,0    0       12     0.000033151   783  A   W 36744543 + 8 <- (8,1) 36744480
  8,0    0       13     0.000033343   783  Q   W 36744543 + 8 [kjournald]
  8,0    0       14     0.000033758   783  M   W 36744543 + 8 [kjournald]
  8,0    0       15     0.000034477   783  A   W 36744551 + 8 <- (8,1) 36744488
  8,0    0       16     0.000034670   783  Q   W 36744551 + 8 [kjournald]
  8,0    0       17     0.000035082   783  M   W 36744551 + 8 [kjournald]
  8,0    0       18     0.000035786   783  A   W 36744559 + 8 <- (8,1) 36744496
  8,0    0       19     0.000035972   783  Q   W 36744559 + 8 [kjournald]
  8,0    0       20     0.000036381   783  M   W 36744559 + 8 [kjournald]
  8,0    0       21     0.000037124   783  A   W 36744567 + 8 <- (8,1) 36744504

  < ..... >

^CCPU0 (8,0):
 Reads Queued:           0,        0KiB  Writes Queued:          13,       52KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        2,       52KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        2,       52KiB
 Read Merges:            0,        0KiB  Write Merges:           11,       44KiB
 Read depth:             0               Write depth:             1
 IO unplugs:             2               Timer unplugs:           0

Throughput (R/W): 0KiB/s / 52000KiB/s
Events (8,0): 49 entries
Skips: 0 forward (0 -   0.0%)



As you can see from the above output, each I/O is printed along with a summary of the operations and how they were processed by the I/O scheduler. This is super information, which can be used to figure out I/O patterns (random reads, random writes, etc.), the size of the I/O operations hitting physical devices in a system and the type of workload on a system. I used to think that Solaris (w/ DTrace) had a serious leg up on Linux, but I’m starting to realize that you can understand Linux performance just as well if you install one or more third party tools. I’ve come across a slew of nifty utilities over the past month, and will start posting regularly with example output and links to each utility.
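
One quick way to get a feel for a trace is to tally the events by action code, which is the sixth column of the blkparse output. The sketch below runs awk over a few lines copied from the trace above rather than a live capture, so it works without root or blktrace; on a real system you would pipe blkparse into the same awk command. For reference, A is a remap, Q is a queued request, G is a request allocation, P is a plug, I is an insert into the scheduler queue, and M is a back merge.

```shell
# Sample lines copied from the blkparse output above (not a live capture)
sample='  8,0    0        1     0.000000000   783  A   W 36744519 + 8 <- (8,1) 36744456
  8,0    0        2     0.000000472   783  Q   W 36744519 + 8 [kjournald]
  8,0    0        3     0.000012413   783  G   W 36744519 + 8 [kjournald]
  8,0    0        4     0.000016614   783  P   N [kjournald]
  8,0    0        5     0.000018272   783  I   W 36744519 + 8 [kjournald]
  8,0    0        6     0.000026492   783  A   W 36744527 + 8 <- (8,1) 36744464
  8,0    0        7     0.000026784   783  Q   W 36744527 + 8 [kjournald]
  8,0    0        8     0.000028877   783  M   W 36744527 + 8 [kjournald]'

# Count events per action code (field 6), sorted by code for stable output
summary=$(printf '%s\n' "$sample" | awk '{ n[$6]++ } END { for (a in n) print a, n[a] }' | sort)
echo "$summary"
```

A trace dominated by M events, like the one above, points at sequential writes that the scheduler is merging into larger requests.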

pbcopy / pbpaste in OS X

I came across a nifty pair of utilities in OS X that let you copy and paste data to and from the clipboard without having to select text and hit command+c / command+v.

Michael-MacBook-Pro:~
(michael)> echo foo | pbcopy

Michael-MacBook-Pro:~
(michael)> pbpaste
foo

That’s kind of neat. What about connecting to a remote machine, executing some command, and then having that output in the clipboard of your local machine?

Michael-MacBook-Pro:~
(michael)> ssh somehost.com uptime | pbcopy

Michael-MacBook-Pro:~
(michael)> pbpaste
5:26pm  up 64 day(s),  8:30,  4 users,  load average: 0.04, 0.04, 0.04

Easily encoding documents prior to publishing them to the web

While reviewing the command lines on commandlinefu, I came across this nifty little gem:

$ perl -MHTML::Entities -ne 'print encode_entities($_)' FILETOENCODE

This command line snippet takes a file as an argument and escapes all of the characters that can’t be published directly on the web (e.g., it will convert a right angle bracket to &gt;). This is super useful, and is a welcome addition to the nifty tidy utility!
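
To see what the encoding looks like without creating a file first, here is a small sketch that feeds a line in on stdin. It assumes the HTML::Entities Perl module is installed (it is not part of core Perl), and the input string is made up for illustration.

```shell
# Assumption: the HTML::Entities module is installed; skip quietly if it is not
if perl -MHTML::Entities -e 1 2>/dev/null; then
    # Encode the characters that are unsafe in HTML: <, > and &
    encoded=$(printf '%s\n' '1 < 2 && 3 > 2' | perl -MHTML::Entities -ne 'print encode_entities($_)')
    echo "$encoded"
else
    echo "HTML::Entities is not installed, skipping" >&2
fi
```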

ZFS dataset and volume properties

A  few chapters of the upcoming OpenSolaris Bible have been released.  Specifically, looking through chapter 8 on ZFS, I came across this handy list of properties and their descriptions.

File System Properties
aclinherit– Inheritance of ACL entries
aclmode– Modification of ACLs in a chmod(2) operation
atime– Whether access times of files are updated when read
available– Space available to the file system
canmount– Whether the file system is mountable
casesensitivity– Case sensitivity of filename matching
checksum– Checksum algorithm for data integrity
compression– Compression algorithm
compressratio– Compression ratio achieved
copies– Number of data copies stored
creation– Time the file system was created
devices– Whether device nodes can be opened
exec– Whether processes can be executed
mounted– Whether the file system is mounted
mountpoint– Mount point for the file system
nbmand– Use of nonblocking mandatory locks with CIFS
normalization– Use Unicode-normalized filenames in name comparisons
origin– Snapshot on which a clone is based
primarycache– Controls whether ZFS data and metadata are cached in the primary cache
quota– Limit on space that the file system can consume
readonly– Whether the file system can be modified
recordsize– Suggested block size for files
referenced– Amount of data accessible within the file system
refquota– Space limit for this file system
refreservation– Minimum space guaranteed to the file system
reservation– Minimum space guaranteed to the file system and descendants
secondarycache– Controls whether ZFS data and metadata are cached in the secondary cache
setuid– Allow setuid file execution
shareiscsi– Export volumes within the file system as iSCSI targets
sharenfs– Share the file system via NFS
sharesmb– Share the file system via CIFS
snapdir– Whether the .zfs directory is visible
type– Type of dataset
used– Space consumed by the file system and descendants
usedbychildren– Space freed if children of the file system were destroyed
usedbydataset– Space freed if snapshots and refreservation were destroyed, and contents of the file system were deleted
usedbyrefreservation– Space freed if the refreservation was removed
usedbysnapshots– Space freed if all snapshots of the file system were destroyed
utf8only– Use only UTF-8 character set for filenames
version– On-disk version of the file system
vscan– Whether to scan regular files for viruses
xattr– Whether extended attributes are enabled
zoned– Whether the file system is managed from a nonglobal zone

Volume Properties
available– Space available to the volume
checksum– Checksum algorithm for data integrity
compression– Compression algorithm
compressratio– Compression ratio achieved
copies– Number of data copies stored
creation– Time the volume was created
origin– Snapshot on which the clone is based
primarycache– Controls whether ZFS data and metadata are cached in the primary cache
readonly– Whether the volume can be modified
referenced– Amount of data accessible within the volume
refreservation– Minimum space guaranteed to the volume
reservation– Minimum space guaranteed to the volume and descendants
secondarycache– Controls whether ZFS data and metadata are cached in the secondary cache
shareiscsi– Export the volume as an iSCSI target
type– Type of dataset
used– Space consumed by the volume and descendants
usedbychildren– Space freed if children of the volume were destroyed
usedbydataset– Space freed if snapshots and refreservation were destroyed, and contents of the volume were deleted
usedbyrefreservation– Space freed if the refreservation was removed
usedbysnapshots– Space freed if all snapshots of the volume were destroyed
volblocksize– Block size of the volume
volsize– Logical size of the volume

Some of these properties were new to me, as they probably only exist in later versions of ZFS in OpenSolaris. Specifically, vscan, which scans files for viruses, is interesting. I’m wondering where virus definitions are stored and updated. This is actually a pretty nifty feature if you plan on using Solaris’ new in-kernel SMB server to share data with Microsoft Windows based clients.

UPDATE: Richard provided, via a comment, an awesome link that shows how to administer the new CIFS server within OpenSolaris, as well as how to execute the virus scan using the vscanadm utility. Take a look at the slides on that link for an in-depth administrative tour of these features.

I’d like to learn more about the primarycache and secondarycache settings, and exactly what gets tuned when fiddling around with these.

There is also a property called “copies”, which allows you to specify how many copies of the data should be kept on disk. I’m not sure exactly why you would want to increase the number of copies of data instead of using raidz, raidz2, mirroring, hot spares, etc., but it’s neat that the option is there.
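
For anyone who wants to poke at these properties, here is a minimal sketch. The dataset name tank/home is hypothetical, and the guard simply skips everything on a machine without ZFS or without that dataset. As far as I know, copies only affects blocks written after the property is set, and it adds redundancy within a dataset even on a pool that has a single disk, which is the usual argument for using it alongside (not instead of) raidz or mirroring.

```shell
dataset="tank/home"   # hypothetical dataset name, substitute your own

if command -v zfs >/dev/null 2>&1 && zfs list "$dataset" >/dev/null 2>&1; then
    # Show the current values of the three properties discussed above
    zfs get copies,primarycache,secondarycache "$dataset"

    # Keep two copies of every block written from now on
    zfs set copies=2 "$dataset"
else
    echo "zfs or $dataset not available, skipping" >&2
fi
```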