Blog O' Matty


Listing packages that were added or updated after an initial Fedora or CentOS installation

This article was posted by Matty on 2009-09-13 10:14:00 -0400

I was reviewing the configuration of a system last week, and needed to find out which packages were added after the initial installation. The rpm utility has a slew of query tags (you can view the list by running rpm --querytags | more) for querying the package database, including the extremely handy INSTALLTIME tag. Using this tag along with my pkgdiff script, I was able to generate a list of packages that were installed (or updated) after the initial install:

$ pkgdiff

lsscsi-0.22-2.fc11.x86_64 was most likely added after the initial install
xmms-1.2.11-5.20071117cvs.fc11.x86_64 was most likely added after the initial install
gtk+-1.2.10-68.fc11.x86_64 was most likely added after the initial install
rlog-1.4-5.fc11.x86_64 was most likely added after the initial install
nx-3.3.0-35.fc11.x86_64 was most likely added after the initial install
xmms-libs-1.2.11-5.20071117cvs.fc11.x86_64 was most likely added after the initial install
tcl-8.5.6-6.fc11.x86_64 was most likely added after the initial install
glib-1.2.10-32.fc11.x86_64 was most likely added after the initial install
freenx-server-0.7.3-15.fc11.x86_64 was most likely added after the initial install
xorg-x11-apps-7.3-8.fc11.x86_64 was most likely added after the initial install
libmikmod-3.2.0-5.beta2.fc11.x86_64 was most likely added after the initial install
fuse-encfs-1.5-6.fc11.x86_64 was most likely added after the initial install
xorg-x11-fonts-misc-7.2-8.fc11.noarch was most likely added after the initial install
expect-5.43.0-17.fc11.x86_64 was most likely added after the initial install

This output doesn’t distinguish newly added packages from updated ones, but it should be pretty easy to separate the two with a couple more lines of shell script (you could cross reference the package list above with /root/install.log if you need to get super specific).
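The pkgdiff script itself isn't shown here, so the following is only a rough sketch of the idea and not the actual script. It assumes the earliest INSTALLTIME in the rpm database marks the start of the initial install, and treats anything installed more than a grace period later as a later addition. The function name added_after_install and the one hour default grace period are my own assumptions for illustration.

```shell
#!/bin/sh
# Read "epoch package" lines on stdin and print every package whose install
# time is more than GRACE seconds after the earliest install time seen.
# GRACE (default 3600) is an assumed upper bound on how long the initial
# install took; adjust it for slow installs.
added_after_install() {
    grace=${1:-3600}
    sort -n | awk -v grace="$grace" '
        NR == 1 { first = $1 }   # earliest install time after sorting
        $1 > first + grace {
            print $2 " was most likely added after the initial install"
        }
    '
}

# On a live system the input would come from the rpm database, e.g.:
# rpm -qa --queryformat '%{INSTALLTIME} %{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' \
#     | added_after_install 3600
```

Since an update rewrites a package's INSTALLTIME, updated packages show up in this list right alongside newly added ones.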

Why “partition X does not end on cylinder boundary” warnings don't matter

This article was posted by Matty on 2009-09-12 13:12:00 -0400

While reviewing the partition layout on one of my hard drives, I noticed a number of “Partition X does not end on cylinder boundary” messages in the fdisk output:

$ fdisk /dev/sda

The number of cylinders for this disk is set to 9726.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/sda: 80.0 GB, 80000000000 bytes
255 heads, 63 sectors/track, 9726 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xac42ac42

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          26      204800   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              26         287     2097152   83  Linux
Partition 2 does not end on cylinder boundary.
/dev/sda3             287        9726    75822111+  8e  Linux LVM

This was a bit disconcerting at first, but after a few minutes of thinking it dawned on me that modern systems use LBA (Logical Block Addressing) instead of CHS (Cylinder/Head/Sector) to address disk drives. If we view the partition table using sectors instead of cylinders:

$ sfdisk -uS -l /dev/sda

Disk /dev/sda: 9726 cylinders, 255 heads, 63 sectors/track
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sda1            63    409662     409600  83  Linux
/dev/sda2        409663   4603966    4194304  83  Linux
/dev/sda3       4603967 156248189  151644223  8e  Linux LVM
/dev/sda4             0         -          0   0  Empty

We can see that we end at a specific sector number, and start the next partition at that number plus one. I must say that I have grown quite fond of sfdisk and parted, and they sure make digging through DOS and GPT labels super easy.
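To see where the warnings come from, here is a small sketch that redoes the arithmetic fdisk is doing. With 255 heads and 63 sectors per track, a cylinder holds 16065 sectors, so a partition ends on a cylinder boundary only when the sector after its last one is a multiple of 16065. The check_boundary function is a helper I made up for illustration; the end sectors are taken from the sfdisk output above.

```shell
#!/bin/sh
# Sectors per cylinder for the reported geometry: 255 heads * 63 sectors/track.
SECTORS_PER_CYL=$((255 * 63))

# Report whether a partition whose last sector is $2 (0-based, inclusive)
# finishes exactly on a cylinder boundary.
check_boundary() {
    name=$1
    end=$2
    remainder=$(( (end + 1) % SECTORS_PER_CYL ))
    if [ "$remainder" -eq 0 ]; then
        echo "$name ends on a cylinder boundary"
    else
        echo "$name ends $remainder sectors into a cylinder"
    fi
}

check_boundary /dev/sda1 409662
check_boundary /dev/sda2 4603966
check_boundary /dev/sda3 156248189
```

The first two partitions stop partway through a cylinder and the third lands exactly on one, which lines up with the two warnings fdisk printed. None of this matters for LBA addressing, since only the start and end sector numbers are used.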

Better ZFS pool fault handling coming to an OpenSolaris release near you!

This article was posted by Matty on 2009-09-11 18:46:00 -0400

I just saw the following ARC case fly by, and this will be a welcome addition to the ZFS file system:

OVERVIEW:

Uncooperative or deceptive hardware, combined with power failures or sudden lack of access to devices, can result in zpools without redundancy being non-importable. ZFS’ copy-on-write and Merkle tree properties will sometimes allow us to recover from these problems. Only ad-hoc means currently exist to take advantage of this recoverability. This proposal aims to rectify that short-coming.

PROPOSED SOLUTION:

This fast-track proposes two new command line flags each for the ‘zpool clear’ and ‘zpool import’ sub-commands.

Both sub-commands will now accept a ‘-F’ recovery mode flag. When specified, a determination is made if discarding the last few transactions performed in an unopenable or non-importable pool will return the pool to an usable state. If so, the transactions are irreversibly discarded, and the pool imported. If the pool is usable or already imported and this flag is specified, the flag is ignored and no transactions are discarded.

Both sub-commands will now also accept a ‘-n’ flag. This flag is only meaningful in conjunction with the ‘-F’ flag. When specified, an attempt is made to see if discarding transactions will return the pool to a usable state, but no transactions are actually discarded.

I have encountered errors where this feature would have been handy, and will be stoked when this feature is available in Solaris 10 / Solaris next.
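Assuming the flags ship as the case describes them, a cautious recovery could run the dry run first and only discard transactions after a human has looked at the result. This is a sketch built purely from the proposal text above; the recover_pool function and the pool name tank are hypothetical, and I haven't assumed anything about exit codes.

```shell
#!/bin/sh
# Sketch only: assumes 'zpool import' accepts -F and -n as proposed above.
recover_pool() {
    pool=$1

    # Dry run: check whether discarding recent transactions would make the
    # pool importable, without actually discarding anything.
    zpool import -F -n "$pool"

    printf 'Irreversibly discard the last few transactions in %s? [y/N] ' "$pool"
    read -r answer
    if [ "$answer" = "y" ]; then
        # Real run: discard the transactions and import the pool.
        zpool import -F "$pool"
    else
        echo "Leaving $pool untouched"
    fi
}
```

Running recover_pool tank would show the dry run result for the tank pool and then wait for confirmation. The prompt is there because, per the proposal, the discard can't be undone.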

Dealing with cron bad user messages on Solaris hosts

This article was posted by Matty on 2009-09-11 18:40:00 -0400

While reviewing the cron logs on one of my Solaris hosts, I noticed a number of entries similar to the following:

CMD: /opt/software/bin/arrecord -backup
> arr 20359 c Thu Sep 3 23:45:00 2009
! bad user (arr) Thu Sep 3 23:45:00 2009
< arr 20359 c Thu Sep 3 23:45:00 2009 rc=1

These errors are typically generated when the account the job runs as doesn’t exist, or when the user’s shadow entry is locked (locked accounts have *LK* in the /etc/shadow password field). In this specific case neither a password nor an NP entry (the account doesn’t have a password, and logins are denied) had been assigned to the arr user, so the account was still listed in the locked state. Setting a strong password fixed the issue, and everything is working swimmingly!
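If you want to check a bunch of cron users at once, something like the following sketch can classify the password field of a shadow-format file. The shadow_status function and its labels are my own invention, and it assumes the Solaris convention that locked entries begin with *LK*. It uses POSIX shell syntax, so on older Solaris hosts it may need to run under ksh or bash instead of /bin/sh.

```shell
#!/bin/sh
# Classify a user's password field in a shadow-format file.
# Usage: shadow_status USER [FILE]   (FILE defaults to /etc/shadow)
shadow_status() {
    user=$1
    file=${2:-/etc/shadow}

    # No matching line at all means the account does not exist in this file.
    entry=$(grep "^${user}:" "$file") || { echo "missing"; return; }

    # The password field is the second colon-separated field.
    field=$(printf '%s\n' "$entry" | cut -d: -f2)
    case $field in
        '*LK*'*) echo "locked" ;;        # cron reports "bad user" for these
        NP)      echo "no-login" ;;      # no password, logins denied
        '')      echo "empty" ;;
        *)       echo "password-set" ;;
    esac
}
```

Run as root, shadow_status arr would have printed locked on the host above. Anything that comes back as locked or missing is a candidate for the bad user message.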

Getting tape drive throughput and performance statistics on Linux hosts

This article was posted by Matty on 2009-09-05 13:44:00 -0400

I manage a number of Linux NetBackup media servers, and just recently learned that Linux doesn’t provide a tool to view tape statistics (it appears there are no /proc interfaces to retrieve SCSI tape drive performance data). Fortunately the SystemTap developers saw this glaring deficiency, and created the iostat-scsi.stp script to display statistics for each SCSI tape and disk device in a server. To use SystemTap on a Red Hat, CentOS or Fedora Linux host, you will first need to install the kernel debuginfo files. Here are the commands I used to install the debuginfo RPMs on a Red Hat Enterprise Linux machine (you can download the RHEL debuginfo files from the Red Hat FTP server, and you can get the debuginfo files for CentOS and Fedora from one of the various mirrors):

$ ls -l

total 179240
-rw-r--r-- 1 matty matty 155787274 Sep 2 10:39 kernel-debuginfo-2.6.18-128.el5.x86_64.rpm
-rw-r--r-- 1 matty matty  27557888 Sep 2 10:39 kernel-debuginfo-common-2.6.18-128.el5.x86_64.rpm

$ rpm -ivh kernel*

warning: kernel-debuginfo-2.6.18-128.el5.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:kernel-debuginfo-common########################################### [ 50%]
   2:kernel-debuginfo       ########################################### [100%]

Once the debuginfo files are installed, you can download the iostat-scsi.stp script from the SystemTap website. To use the script to monitor just tape devices, you can use the following command line (the script will print statistics for all block devices by default):

$ stap iostat-scsi.stp 5 | egrep '(Device|st)'

Device:       tps  blk_read/s  blk_wrtn/s  blk_read  blk_wrtn
st1        199.20        0.00   407961.60         0   2039808
st0        103.60        0.00   212172.80         0   1060864
st0        141.00        0.00   288768.00         0   1443840
st1        221.00        0.00   452608.00         0   2263040
st0        162.80        0.00   333414.40         0   1667072
st1        182.00        0.00   372736.00         0   1863680
st1        197.60        0.00   404684.80         0   2023424

This will print the tape drive instance (st0 -> SCSI tape instance 0, st1 -> SCSI tape instance 1, etc.), the number of transactions per second, the blocks read and written per second, as well as the total number of blocks read and written. SystemTap is pretty cool, and I hope to publish a few scripts I wrote in the near future.
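Blocks per second aren't the friendliest unit when you are sizing backup windows, so here is a small sketch that turns the per-second columns into MB/s. It assumes the script reports 512-byte blocks the way iostat does, which I haven't verified against the script source, so the block size is a parameter. The tape_mbps function name is made up for this example.

```shell
#!/bin/sh
# Convert the blk_read/s and blk_wrtn/s columns for st devices into MB/s.
# Usage: stap iostat-scsi.stp 5 | tape_mbps [BLOCK_SIZE]
# BLOCK_SIZE defaults to 512 bytes, which is an assumption about the script.
tape_mbps() {
    block_size=${1:-512}
    awk -v bs="$block_size" '
        # Only lines whose first field is a tape device (st0, st1, ...).
        $1 ~ /^st[0-9]+$/ {
            printf "%s read %.1f MB/s write %.1f MB/s\n",
                   $1, $3 * bs / 1048576, $4 * bs / 1048576
        }
    '
}
```

Matching on the first field also sidesteps a quirk of the egrep filter above, which keeps any line that contains the letters st anywhere.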