Blog O' Matty


Oh how I love my iRobot roomba

This article was posted by Matty on 2008-02-18 20:09:00 -0400

One of my friends recently purchased an iRobot Roomba, and he let me test it out while he was out of town. I thoroughly tested out his Roomba, and was amazed that it was able to do as good a job as my existing vacuum cleaner! The Roomba also has a key advantage over my upright vacuum cleaner in that it can wander under couches, beds and dressers to get dust and debris that has made its way there. I was also overjoyed when I found out that once I pushed the CLEAN button on the Roomba, it would begin vacuuming the room with no manual intervention.

These things made me realize that a Roomba was in my future, and I ordered one once my friend returned. My Roomba is now operational, and I have it programmed to vacuum my carpets every other day. The price tag for the unit was a bit high, but the benefits quickly made me realize that this was the right thing for me.

Having now owned my Roomba for two months, I have only found two downsides: the replacement parts (brushes, filters, etc.) are not exactly cheap, and the time it takes to get them is somewhat lengthy. But even when I factor in the upkeep costs, this has to be one of the best purchases I have EVER made! Long live the Roomba (and Chris, you sparked my interest in purchasing a Scooba to mop my floors)!

Measuring the time an application was stopped due to garbage collection

This article was posted by Matty on 2008-02-18 19:37:00 -0400

I recently spent some of my spare time assisting a friend with debugging some Java performance problems his company was experiencing. When I originally looked into the performance problem several weeks back, I used the mpstat and jstat utilities to observe CPU utilization and object allocations, and based on some jstat anomalies, I used the techniques described in my object allocation post to get a better understanding of how their Java application was allocating objects. After a bit of analysis and a couple of email exchanges with my friend and one of the developers he worked with, we were able to locate two application problems that the developer has since fixed.

But even with these changes (which resulted in some significant speedups!!), my friend noticed that request times would periodically shoot up to unacceptable levels. After a bit more analysis with jstat, I noticed that the rate of object allocation in the new generation was still relatively high, and started to wonder if the current size of the new generation was limiting throughput. To see if this was the case, I had my friend add the “PrintGCApplicationConcurrentTime” and “PrintGCApplicationStoppedTime” options to the Java command line:

$ java -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime ...

These options will instruct the Java process to print the time an application is actually running, as well as the time the application was stopped due to GC events. Here are a few sample entries that were produced:

$ egrep '(Application|Total time)' gc.log |more

Application time: 3.3319318 seconds
Total time for which application threads were stopped: 0.7876304 seconds
Application time: 2.1039898 seconds
Total time for which application threads were stopped: 0.4100732 seconds
.....

To get a better idea of how much time the application was running vs. stopped, I created a script (gctimes) to summarize the log data. Gctimes takes a GC log as input, and prints the total execution time, the time the application ran, the time the application was stopped, as well as the percentage of time the application spent in the running and stopped states. Here is some sample output from a short run:

$ gctimes gc.log

Total execution time : 66.30secs
Time application ran : 55.47secs
Time application was stopped : 10.84secs
% of time application ran : 83.65%
% of time application was stopped : 16.35%
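
In case it's useful, here is a minimal sketch of what a summarizer along these lines could look like. It is written in Python, assumes the two log line formats shown in the sample output above, and is NOT the actual gctimes script (the script name gc_summary.py is just a placeholder):

#!/usr/bin/env python
# Minimal GC log summarizer sketch (not the real gctimes script). It sums
# the "Application time" and "Total time for which application threads were
# stopped" entries produced by the PrintGCApplication* JVM options.
import re
import sys

run_re = re.compile(r'Application time:\s+([\d.]+)')
stop_re = re.compile(r'Total time for which application threads were stopped:\s+([\d.]+)')

running = 0.0
stopped = 0.0

with open(sys.argv[1]) as log:
    for line in log:
        match = run_re.search(line)
        if match:
            running += float(match.group(1))
            continue
        match = stop_re.search(line)
        if match:
            stopped += float(match.group(1))

total = running + stopped
if total == 0:
    sys.exit("No application time entries found in %s" % sys.argv[1])

print("Total execution time : %.2fsecs" % total)
print("Time application ran : %.2fsecs" % running)
print("Time application was stopped : %.2fsecs" % stopped)
print("%% of time application ran : %.2f%%" % (running / total * 100))
print("%% of time application was stopped : %.2f%%" % (stopped / total * 100))

Running something like “python gc_summary.py gc.log” against a GC log collected with the options above should produce output similar to the gctimes run shown here.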

Based on the results above, the fact that objects were “spilling” into the old generation, as well as an observation that the tenuring threshold for most objects was extremely low, it appeared that increasing the size of the new generation (they were using the default size) would help decrease the time the application was paused. I asked my friend to double the values of the “NewSize” and “MaxNewSize” runtime options (the size of each Java generation should be chosen carefully, based on the results of empirical testing), and that appears to have fixed their latency problem. As I research the area of Java performance more and more, I am starting to realize that a myriad of factors can lead to poor performance. I hope to share some additional Java performance monitoring tools I have written in future posts.
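
For illustration, setting those options looks something like the following (the 128m values here are made-up placeholders, not the sizes we actually used; the right values depend on your heap size and on empirical testing against your own workload):

$ java -XX:NewSize=128m -XX:MaxNewSize=128m -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime ...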

Locating files on Solaris servers with pkgchk

This article was posted by Matty on 2008-02-18 19:20:00 -0400

Most Linux and BSD distributions ship with the locate utility, which allows you to quickly find files on a system:

$ locate pvcreate
/usr/sbin/pvcreate
/usr/share/man/man8/pvcreate.8.gz

While not quite as thorough as locate, the Solaris pkgchk utility has a “-P” option that provides similar capabilities:

$ pkgchk -l -P metastat | grep Pathname
Pathname: /sbin/metastat
Pathname: /usr/sbin/metastat
Pathname: /usr/share/man/man1m/metastat.1m

Nice!

Cleaning up HTML files with tidy

This article was posted by Matty on 2008-02-16 16:23:00 -0400

I have read a number of documents on correctly using CSS and XHTML over the past month, and have learned about a number of common mistakes people make when creating content that uses these technologies. Most of the articles discussed ways to structure web content to avoid these pitfalls, which got me wondering if anyone had taken these recommendations and created a tool to analyze content for errors. After a bit of googling, I came across the W3C content validation site, as well as the tidy utility.

The W3C website is super easy to use, and it provides extremely useful feedback that you can use to improve your content. The tidy utility provides similar capabilities, but has options to actually correct the errors it finds in the files it analyzes. Tidy can be downloaded from SourceForge, or installed with your favorite package utility (the CentOS repositories contain tidy, so it’s a yum install away). Once tidy is installed, you can pass the name of one or more files to analyze as arguments:

$ tidy --indent index.html

line 8 column 1 - Warning: <link> isn't allowed in <body> elements
line 3 column 1 - Info: <html> previously mentioned
line 74 column 28 - Warning: unescaped & which should be written as &amp;
line 74 column 29 - Warning: unescaped & which should be written as &amp;
line 191 column 15 - Warning: discarding unexpected </h2>
line 181 column 9 - Warning: <a> escaping malformed URI reference
Info: Doctype given is "-//W3C//DTD XHTML 1.0 Strict//EN"
Info: Document content looks like XHTML 1.0 Transitional
5 warnings, 0 errors were found!

<HTML FILE CONTENTS WITH FIXES APPLIED>
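
If you would rather have tidy write the corrected markup somewhere other than standard output, its manual page also documents a “-m” option to modify the input file in place and a “-o” option to write the result to a separate file (double-check the options on your platform’s tidy build before relying on them):

$ tidy -m index.html

$ tidy --indent -o index.fixed.html index.html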

The tidy output will contain the list of errors it detected, as well as the corrected HTML code. This is amazingly cool, and it has tipped me off to a few issues with some of the XHTML files that I am using to support my website. Tidy and the W3C validation site are incredibly useful tools, and will hopefully enhance the experience for individuals who access W3C-validated content.

Determining the capabilities of a NIC on a Solaris host

This article was posted by Matty on 2008-02-13 23:45:00 -0400

There are a myriad of NIC chipsets in use by the major server vendors (Broadcom, Intel, NVidia, etc.), and each chipset typically contains a unique set of capabilities (e.g., hardware offload support, some amount of on-board cache devoted to RX / TX rings, hardware flow classification, etc.). To see which capabilities a given NIC chipset supports, you can usually read through the technical white papers and engineering documents that were published when the chipset shipped. To find out which NIC chipset is in use on a Solaris host, the kstat utility can be run with the name of a network driver, the instance of the driver, and the “chipid” name:

$ kstat -m bge -i 0 -n chipid | head -10

module: bge                            instance: 0
name:   chipid                         class:    net
        asic_rev                       2416115712
        bus_size                       64 bit
        bus_speed                      fast
        bus_type                       PCI-X
        businfo                        4746
        cache_line_size                16
        chip_type                      5715
        command                        342

This will display a number of pieces of information, including the type of bus in use, whether it is 32- or 64-bit, and the chipset version. In the output above, we can see that a 64-bit PCI-X Broadcom model 5715 adapter is in use by the server. Kstat rocks!