Creating ZFS file systems during the jumpstart process

I use jumpstart at home to update the hosts in my lab as new Nevada builds and Solaris updates are released. As part of the unattended installation / upgrade process, I create a couple of ZFS file systems on each system. Since jumpstart doesn't have built-in support for creating ZFS file systems, I had to add the zpool and zfs commands to my finish script. After a bit of tinkering, here is what I came up with:

# Locate the first local device on the system
DISK1=`echo quit | /usr/sbin/format 2>/dev/null | /usr/bin/awk '$1 == "0." { print $2 }'`

# Create a ZFS pool using the disk device above
/usr/sbin/zpool create -f -R /a p0 ${DISK1}

# Create ZFS file systems
/usr/sbin/zfs create p0/home
/usr/sbin/zfs set mountpoint=/home p0/home
.....
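
For reference, the awk pipeline keys off the device index in format's disk selection menu. The relevant portion of the format output typically looks something like this (the device name, drive description and device path will obviously vary from box to box):

AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <drive type and geometry>
          /pci@.../scsi@.../sd@0,0

With output like that, DISK1 ends up set to c1t0d0, which the zpool command then uses to build the pool.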

This appears to work pretty well, and my boxes are now built and operational once the jumpstart process completes. Niiiiiiiice!

Speeding up the initial SMF manifest import

If you are an avid Solaris 10 user, you may have noticed that it takes a bit of time for global and non-global zones to initialize after they are installed. One of the issues that slows down the initialization process is the initial manifest import, the series of steps that populates the repository with one or more service manifests. The enhanced profiles project is currently tasked with permanently addressing this issue, but Steve Peng just putback a fix for CR #6351623 (initial manifest-import is slow) to provide some temporary relief. This is good stuff, and I can't wait to get my hands on Nevada build 84 to see just how well his tmpfs solution works! If anyone has tested Steve's solution, or copied over their own repository during system initialization, please leave a comment or shoot me an email. I am curious to see just how speedy these solutions are!!
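If you want to experiment with the repository-copy approach yourself, the idea boils down to dropping a pre-populated SMF repository into place before the zone boots for the first time. Something along these lines should work from the global zone, though the paths below are purely illustrative and the repository would need to come from a system running the identical Solaris build:

# Seed a freshly installed zone with a pre-populated SMF repository
# (illustrative only -- the repository.db must match the zone's build)
cp /var/tmp/prebuilt-repository.db /zones/myzone/root/etc/svc/repository.db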

Oh how I love my iRobot roomba

One of my friends recently purchased an iRobot Roomba, and he let me test it out while he was out of town. I put his Roomba through its paces, and was amazed that it did as good a job as my existing vacuum cleaner! The Roomba also has a key advantage over my upright vacuum cleaner in that it can wander under couches, beds and dressers to get the dust and debris that makes its way there. I was also overjoyed to find that once I pushed the CLEAN button on the Roomba, it would begin vacuuming the room with no manual intervention.

These things made me realize that a Roomba was in my future, and I ordered one once my friend returned. My Roomba is now operational, and I have it programmed to vacuum my carpets every other day. The price tag for the unit was a bit high, but the following benefits quickly made me realize that this was the right thing for me:

  • Vacuuming an entire room is one button push away
  • Cleaning the Roomba is extremely easy
  • I have seen a dramatic decrease in the amount of dust
  • It can get under couches, beds and dressers where normal vacuums can't
  • The anti-tangle technology works extremely well

Having now owned my Roomba for two months, I have found only two downsides: the replacement parts (brushes, filters, etc.) are not exactly cheap, and the time it takes to get them is somewhat lengthy. But even when I factor in the upkeep costs, this has to be one of the best purchases I have EVER made! Long live the Roomba (and Chris, you sparked my interest in purchasing a Scooba to mop my floors)!

Measuring the time an application was stopped due to garbage collection

I recently spent some of my spare time helping a friend debug some Java performance problems his company was experiencing. When I originally looked into the problem several weeks back, I used the mpstat and jstat utilities to observe CPU utilization and object allocations, and based on some jstat anomalies, I used the techniques described in my object allocation post to get a better understanding of how their Java application was allocating objects. After a bit of analysis and a couple of email exchanges with my friend and one of the developers he worked with, we were able to locate two application problems, which the developer has since fixed.

But even with these changes (which resulted in some significant speedups!!), my friend noticed that request times would periodically shoot up to unacceptable levels. After a bit more analysis with jstat, I noticed that the rate of object allocation in the new generation was still relatively high, and started to wonder if the current size of the new generation was limiting throughput. To see if this was the case, I had my friend add the “PrintGCApplicationConcurrentTime” and “PrintGCApplicationStoppedTime” options to the Java command line:

$ java -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime …

These options will instruct the Java process to print the time an application is actually running, as well as the time the application was stopped due to GC events. Here are a few sample entries that were produced:

$ egrep '(Application|Total time)' gc.log | more
Application time: 3.3319318 seconds
Total time for which application threads were stopped: 0.7876304 seconds
Application time: 2.1039898 seconds
Total time for which application threads were stopped: 0.4100732 seconds
…..

To get a better idea of how much time the application was running vs. stopped, I created a script (gctime) to summarize the log data. Gctime takes a GC log as input, and prints the total execution time, the time the application ran, the time the application was stopped, and the percentage of time the application spent in the running and stopped states (a rough sketch of the idea follows the sample output below). Here is some sample output from a short run:

$ gctime gc.log

Total execution time               : 66.30secs
Time application ran               : 55.47secs
Time application was stopped       : 10.84secs
% of time application ran          : 83.65%
% of time application was stopped  : 16.35%
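
For the curious, here is a rough sketch of how a summarizer like gctime can be put together. This is not the actual gctime source, just an illustrative shell wrapper around awk that assumes the log lines look exactly like the samples above:

#!/bin/sh
# Summarize GC application run / stop times (illustrative sketch)
# Usage: gctime gc.log

/usr/bin/awk '
    # "Application time: 3.3319318 seconds" -- running time is field 3
    /^Application time:/ { ran += $3 }

    # "Total time for which application threads were stopped: 0.7876304 seconds"
    # -- stopped time is field 9
    /^Total time for which application threads were stopped:/ { stopped += $9 }

    END {
        total = ran + stopped
        printf "Total execution time               : %.2fsecs\n", total
        printf "Time application ran               : %.2fsecs\n", ran
        printf "Time application was stopped       : %.2fsecs\n", stopped
        printf "%% of time application ran          : %.2f%%\n", (ran / total) * 100
        printf "%% of time application was stopped  : %.2f%%\n", (stopped / total) * 100
    }' "$1"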



Based on the results above, the fact that objects were “spilling” into the old generation, and the observation that the tenuring threshold for most objects was extremely low, it appeared that increasing the size of the new generation (they were using the default) would help decrease the time the application was paused. I asked my friend to double the “NewSize” and “MaxNewSize” runtime options (the size of each Java generation should be chosen carefully, based on the results of empirical testing), and that appears to have fixed their latency problem. As I research the area of Java performance more and more, I am starting to realize that a myriad of factors can lead to poor performance. I hope to share some additional Java performance monitoring tools I have written in future posts.
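
If you want to experiment with something similar, the new generation bounds are controlled with the “NewSize” and “MaxNewSize” options on the java command line. The 512m below is purely illustrative; the right value for your application should come out of your own testing:

$ java -XX:NewSize=512m -XX:MaxNewSize=512m -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime …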

Locating files on Solaris servers with pkgchk

Most Linux and BSD distributions ship with the locate utility, which allows you to quickly find files on a system:

$ locate pvcreate
/usr/sbin/pvcreate
/usr/share/man/man8/pvcreate.8.gz

While not quite as thorough as locate, the Solaris pkgchk utility has a “-P” option that provides similar capabilities:

$ pkgchk -l -P metastat | grep Pathname
Pathname: /sbin/metastat
Pathname: /usr/sbin/metastat
Pathname: /usr/share/man/man1m/metastat.1m
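
Going the other direction also works. If you already know the full pathname to a file and want to see which package delivered it, the lowercase “-p” option accepts exact pathnames, and the “-l” listing will show (among other things) the packages that reference the file:

$ pkgchk -l -p /usr/sbin/metastat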

Nice!

Cleaning up HTML files with tidy

I have read a number of documents on correctly using CSS and XHTML over the past month, and have learned about a number of common mistakes people make when creating content that uses these technologies. Most of the articles discussed ways to structure web content to avoid these pitfalls, which got me wondering if anyone had taken these recommendations and created a tool to analyze content for errors. After a bit of googling, I came across the W3C content validation site, as well as the tidy utility.

The W3C website is super easy to use, and it provides extremely useful feedback that you can use to improve your content. The tidy utility provides similar capabilities, but also has options to correct the errors it finds in the files it analyzes. Tidy can be downloaded from sourceforge, or installed with your favorite package utility (the CentOS repositories contain tidy, so it’s a yum install away). Once tidy is installed, you can pass the name of one or more files to analyze as arguments:

$ tidy -indent index.html
line 8 column 1 - Warning: <link> isn't allowed in <html> elements
line 3 column 1 - Info: <html> previously mentioned
line 74 column 28 - Warning: unescaped & which should be written as &amp;
line 74 column 29 - Warning: unescaped & which should be written as &amp;
line 191 column 15 - Warning: discarding unexpected </h2>
line 181 column 9 - Warning: <a> escaping malformed URI reference
Info: Doctype given is "-//W3C//DTD XHTML 1.0 Strict//EN"
Info: Document content looks like XHTML 1.0 Transitional
5 warnings, 0 errors were found!

<HTML FILE CONTENTS WITH FIXES APPLIED>

The tidy output contains the list of issues it detected, followed by the corrected HTML. This is amazingly cool, and it has tipped me off to a few problems with some of the XHTML files that I am using to support my website. Tidy and the W3C validation site are incredibly useful tools, and will hopefully enhance the experience for individuals who access W3C validated content.
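
One related tip: tidy writes the corrected markup to standard output by default. If you would rather have the fixes saved, the “-o” option writes the result to a named file, and “-m” modifies the original file in place (the file names below are just examples):

$ tidy -indent -o index-fixed.html index.html
$ tidy -indent -m index.html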