There are a myriad of NIC chipsets in use by the major server vendors (Broadcom, Intel, NVIDIA, etc.), and each chipset typically contains a unique set of capabilities (e.g., hardware offload support, some amount of onboard cache devoted to RX / TX rings, hardware flow classification, etc.). To see which capabilities a given NIC chipset supports, you can usually read through the technical white papers and engineering documents that were published when the chipset shipped. To find the NIC chipsets that are in use on a Solaris host, the kstat utility can be run with the name of a network driver, the instance of the driver, and the “chipid” name:

$ kstat -m bge -i 0 -n chipid | head -10

module: bge                             instance: 0
name:   chipid                          class:    net
        asic_rev                        2416115712
        bus_size                        64 bit
        bus_speed                       fast
        bus_type                        PCI-X
        businfo                         4746
        cache_line_size                 16
        chip_type                       5715
        command                         342

This will display a number of pieces of information, including the type of bus in use, whether it is 32- or 64-bit, and the chipset version. In the output above, we can see that a 64-bit PCI-X Broadcom model 5715 adapter is in use by the server. Kstat rocks!
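If you only need one or two of these values in a script, the listing can be filtered with awk. Here is a minimal sketch that assumes the two-column layout shown above; the chipid_field function name is made up for this example, and the sample data is copied from the listing in this post:

```shell
# chipid_field NAME: read "kstat -n chipid" style output on stdin and print
# the value recorded for statistic NAME (hypothetical helper for this post).
chipid_field() {
    awk -v key="$1" '
        $1 == key {
            # Drop the statistic name and the padding that follows it,
            # keeping multi-word values such as "64 bit" intact.
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")
            print
            exit
        }'
}

# Sample data copied from the listing above. On a live Solaris host you
# would pipe "kstat -m bge -i 0 -n chipid" into the function instead.
sample='module: bge                             instance: 0
name:   chipid                          class:    net
        asic_rev                        2416115712
        bus_size                        64 bit
        bus_type                        PCI-X
        chip_type                       5715'

printf '%s\n' "$sample" | chipid_field bus_type    # PCI-X
printf '%s\n' "$sample" | chipid_field bus_size    # 64 bit
```

kstat also has a “-p” option that prints statistics in a parseable module:instance:name:statistic form, which may be a better fit if you want to avoid text munging altogether.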

Posted by matty, filed under Solaris Networking. Date: February 13, 2008, 11:45 pm | No Comments »

A number of OpenSolaris communities have asked their members for feedback on the technologies they would like to see added in the future. The storage community received a ton of feedback when they asked for a list of features people would like to have added to OpenSolaris, and this feedback was recently posted to the genunix wiki. If there are features you are interested in that don’t appear on the list, I would highly recommend adding them. There are so many cool things underway (COMSTAR, CIFS client and server, NPIV support, pNFS, better remote replication, etc.) in the storage community, and it’s awesome to see the storage project leaders reaching out to the community to get their ideas!

Posted by matty, filed under Solaris Storage. Date: February 13, 2008, 12:16 am | No Comments »

While doing a bit of research tonight I came across a reference to mod_substitute. This nifty module allows you to substitute text in the HTTP response body, which provides an easy way to do things similar to the following:

<Location /private>
    AddOutputFilterByType SUBSTITUTE text/html
    Substitute s/SECRET/XXXXX/ni
</Location>
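
To preview what a rule like the one above would do to a page before touching the server config, you can approximate it from the shell. This is only a rough sketch: it uses sed rather than Apache, it assumes every case-insensitive occurrence of SECRET on a line gets replaced, and the mask_secret function name and sample HTML are made up for illustration:

```shell
# mask_secret: approximate "Substitute s/SECRET/XXXXX/ni" with POSIX sed.
# Bracket expressions stand in for the case-insensitive "i" flag, and the
# pattern has no regex metacharacters, so it behaves like a fixed string.
mask_secret() {
    sed 's/[Ss][Ee][Cc][Rr][Ee][Tt]/XXXXX/g'
}

printf '%s\n' '<p>The secret code is SECRET.</p>' | mask_secret
# <p>The XXXXX code is XXXXX.</p>
```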

I digs me some Apache!

Posted by matty, filed under Apache. Date: February 13, 2008, 12:05 am | No Comments »

I gave a presentation last night on debugging Java performance problems. I got a couple of requests to post links to the performance analysis tools I discussed, so here you go:

DTraceToolkit

Garbage collection log visualization utility

Garbage collection visualization utility

Java hotspot DTrace provider

Java heap profiling agent

Monitoring garbage collection with jstat

Observing object allocation with DTrace

Trending Java performance

I would like to thank everyone for attending, and hope to see y’all at a future meeting!

Posted by matty, filed under Articles, Presentations and Certifications. Date: February 5, 2008, 10:33 pm | No Comments »

While reading through some old notes this weekend, I came across a page I created eons ago about managing Solaris packages. If you want to find out the file modes, the user and group ownership and the package a file belongs to, you can run the pkgchk utility with the “-l” and “-p” options and the name of a file to check:

$ pkgchk -l -p /usr/sfw/bin/snmpget

Pathname: /usr/sfw/bin/snmpget
Type: regular file
Expected mode: 0755
Expected owner: root
Expected group: bin
Expected file size (bytes): 20372
Expected sum(1) of contents: 48443
Expected last modification: Sep 20 17:41:36 2007
Referenced by the following packages:
        SUNWsmcmd
Current status: installed
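
If all you want is the name of the owning package (to feed into another command, for example), the output can be trimmed down with awk. This is a small sketch that assumes the layout shown above; the owning_packages function name is made up, and the sample data is taken from the listing in this post:

```shell
# owning_packages: read "pkgchk -l -p FILE" output on stdin and print each
# package listed under "Referenced by the following packages:".
owning_packages() {
    awk '
        /^Referenced by the following packages:/ { in_list = 1; next }
        /^Current status:/                        { in_list = 0 }
        in_list { for (i = 1; i <= NF; i++) print $i }'
}

# Sample data from the listing above. On a Solaris host you would pipe
# "pkgchk -l -p /usr/sfw/bin/snmpget" into the function instead.
sample='Pathname: /usr/sfw/bin/snmpget
Type: regular file
Expected mode: 0755
Referenced by the following packages:
        SUNWsmcmd
Current status: installed'

printf '%s\n' "$sample" | owning_packages    # SUNWsmcmd
```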

Pkgchk is a nifty utility!

Posted by matty, filed under Solaris Patching. Date: February 4, 2008, 12:06 am | 3 Comments »

The Java SDK comes with a number of tools and JVM options that can be used to analyze the performance of the Java runtime. One extremely useful tool is the heap profiler agent, which provides facilities to profile memory usage, CPU utilization and lock contention. To load the profiler agent to profile CPU utilization, you can add the “-agentlib:hprof=cpu=times” option to your java command line:

$ java -Xms256m -Xmx256m -verbose:gc -agentlib:hprof=cpu=times App

Once loaded, the agent will use byte code injection (BCI) to instrument each method’s entry and return points. This allows the agent to measure the number of times each method was called, the time spent in each method, and the call chain that led to the method being invoked. To utilize the agent to its full potential, you will need to exercise the application while the agent is active. Once the runtime has been exercised, you can hit Ctrl+C to stop the process. This will cause the agent to write the data it has collected to the file java.hprof.txt, which can be viewed with your favorite pager or editor:

$ more java.hprof.txt

CPU TIME (ms) BEGIN (total = 40712194) Mon Jan 21 19:23:12 2008
rank   self  accum   count trace method
   1 38.52% 38.52% 1036273 301143 java.util.Random.next
   2 19.57% 58.09%  518136 301144 java.util.Random.nextDouble
   3 11.87% 69.96%  518136 301145 java.lang.Math.random
   4 11.05% 81.01% 1036274 301141 java.util.concurrent.atomic.AtomicLong.get
   5 10.53% 91.54% 1036273 301142 java.util.concurrent.atomic.AtomicLong.compareAndSet
   6  8.14% 99.68%  259068 301146 TestMain.foo
   7  0.05% 99.73%       1 300969 java.security.Permissions.add
   8  0.04% 99.77%       1 301106 java.lang.Class.privateGetDeclaredFields
   9  0.03% 99.79%       1 301138 java.util.Random.<init>
  10  0.02% 99.81%       2 300908 java.io.FilePermission$1.run
  11  0.01% 99.82%       1 301283 java.lang.ThreadGroup.remove
  12  0.01% 99.83%       1 300820 sun.net.www.protocol.file.Handler.createFileURLConnection
……



The java.hprof.txt file contains the number of times each method was invoked, as well as the amount of CPU time (as a percentage) that was spent in each method. To see how a method was called, you can search the profiler output for the trace identifier that is listed alongside the profiling data. This will produce a Java stack trace similar to the following that shows how a given method was invoked:

TRACE 301143:
        java.util.Random.next(Random.java:Unknown line)
        java.util.Random.nextDouble(Random.java:Unknown line)
        java.lang.Math.random(Math.java:Unknown line)
        TestMain.foo(TestMain.java:Unknown line)
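
Hunting for trace identifiers by hand gets old quickly, so the lookup can be scripted. The sketch below assumes the java.hprof.txt layout shown in this post (indented stack frames under a "TRACE id:" header, and a six column CPU table); the trace_for_rank and show_trace function names are made up for this example:

```shell
# trace_for_rank RANK: read hprof CPU output on stdin and print the trace
# ID listed for the method at the given rank.
trace_for_rank() {
    awk -v rank="$1" '$1 == rank && $2 ~ /%$/ { print $5; exit }'
}

# show_trace ID: print the "TRACE ID:" header and the stack frames under it.
show_trace() {
    awk -v id="$1" '
        $1 == "TRACE" && $2 == (id ":") { found = 1; print; next }
        found && /^[ \t]/                { print; next }
        found                            { exit }'
}

# Sample data copied from the output in this post. With a real profile you
# would read java.hprof.txt instead.
sample='TRACE 301143:
        java.util.Random.next(Random.java:Unknown line)
        java.util.Random.nextDouble(Random.java:Unknown line)
        java.lang.Math.random(Math.java:Unknown line)
        TestMain.foo(TestMain.java:Unknown line)

CPU TIME (ms) BEGIN (total = 40712194) Mon Jan 21 19:23:12 2008
rank   self  accum   count trace method
   1 38.52% 38.52% 1036273 301143 java.util.Random.next
   2 19.57% 58.09%  518136 301144 java.util.Random.nextDouble'

# Look up the trace for the most expensive method and print its call chain.
id=$(printf '%s\n' "$sample" | trace_for_rank 1)
printf '%s\n' "$sample" | show_trace "$id"
```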



The BCI approach introduces a fair amount of overhead to the Java runtime. In cases where the overhead is hindering testing, you can use the agent’s “cpu=samples” option instead. This will cause the agent to sample the runtime environment at periodic intervals to see which methods are executing. While this approach is not as accurate as the BCI approach, it provides a good set of results with less runtime overhead. The Java profiling agent is incredibly useful, and just one of the vast number of tools that are available to profile Java applications.

Posted by matty, filed under Java. Date: February 2, 2008, 3:05 pm | No Comments »

When bugs occur in the Java runtime environment, most administrators want to get notified so they can take corrective action. These actions can range from restarting a Java process, collecting postmortem data or calling in application support personnel to debug the situation further. The Java runtime has a number of useful options that can be used for this purpose. The first option is “-XX:OnOutOfMemoryError”, which allows a command to be run when the runtime environment incurs an out of memory condition. When this option is combined with the logger command line utility:

$ java -XX:OnOutOfMemoryError="logger Java process %p encountered an OOM condition" …

Syslog entries similar to the following will be generated each time an OOM event occurs:

Jan 21 19:59:17 nevadadev root: [ID 702911 daemon.notice] Java process 19001 encountered an OOM condition

Another super useful option is “-XX:OnError”, which allows a command to be run when the runtime environment incurs a fatal error (i.e., a hard crash). When this option is combined with the logger utility:

$ java -XX:OnError="logger Java process %p encountered a fatal condition" …

Syslog entries similar to the following will be generated when a fatal event occurs:

Jan 21 19:52:17 nevadadev root: [ID 702911 daemon.notice] Java process 19004 encountered a fatal condition

The options above allow you to run one or more commands when these errors are encountered, so you could chain together a postmortem debugging tool, a utility (logger or mail) to generate alerts, and a restarter script to start a new Java process (this assumes you aren’t using SMF). Nice!
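
Once you want to do more than log a single line, it gets easier to point both options at a small handler script. The following is just a sketch under a few assumptions: the /opt/app/bin/jvm_event.sh path, the script name, and the "oom" and "fatal" keywords are all made up for illustration, and the messages mirror the ones used above:

```shell
# Hypothetical handler, meant to be wired in along these lines:
#   java -XX:OnOutOfMemoryError="/opt/app/bin/jvm_event.sh oom %p" ...
#   java -XX:OnError="/opt/app/bin/jvm_event.sh fatal %p" ...

# alert_text EVENT PID: build the message that gets sent to syslog.
alert_text() {
    case "$1" in
        oom)   printf 'Java process %s encountered an OOM condition\n' "$2" ;;
        fatal) printf 'Java process %s encountered a fatal condition\n' "$2" ;;
        *)     printf 'Java process %s reported unknown event %s\n' "$2" "$1" ;;
    esac
}

# Only act when called with an event and a PID, so the file can also be
# sourced to reuse alert_text without sending anything to syslog.
if [ "$#" -eq 2 ]; then
    logger -p daemon.notice "$(alert_text "$1" "$2")"
    # A postmortem collector or a restart command could follow here.
fi
```

Keeping the message formatting in its own function makes it easy to check the text without generating real syslog entries.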

Posted by matty, filed under Java. Date: January 29, 2008, 10:52 pm | 2 Comments »

Java memory management revolves around the garbage collector, which is the entity responsible for traversing the heap and freeing space that is being taken up by unreferenced objects. Garbage collection makes life easier for Java programmers, since it frees them from having to explicitly manage memory resources (this isn’t 100% true, but close enough). In the Java runtime environment, there are two types of collections that can occur. The first type of collection is referred to as a minor collection. Minor collections are responsible for locating live objects in the young generation (eden), copying these objects to the inactive survivor space, and moving tenured objects from the active survivor space to the old (tenured) generation (this assumes that a generational collector is being used). The second form of collection is the major collection. This type of collection frees unreferenced objects in the tenured generation, and optionally compacts the heap to reduce fragmentation.

When debugging performance problems, it is extremely useful to be able to monitor object allocations and frees in the new and old generations. The Java development kit comes with the jstat utility, which provides a ton of visibility into what the garbage collector is doing, as well as a slew of information on how each generation is being utilized. To display garbage collection statistics for the new, old and permanent generations, jstat can be invoked with the “-gc” (print garbage collection heap statistics) option, the “-t” (print the number of seconds since the JVM started) option, the process id to retrieve statistics from, and an optional interval (in milliseconds) to control how often statistics are printed:

$ jstat -gc -t `pgrep java` 5000

Timestamp        S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT
        98772.0 1600.0 1600.0  0.0   1599.8 13184.0   5561.6   245760.0   201671.9  16384.0 6443.0 166683 2402.690 32411  110.564 2513.255
        98777.0 1600.0 1600.0 1599.4  0.0   13184.0   9533.7   245760.0   156797.1  16384.0 6443.0 166690 2402.785 32414  110.573 2513.359
        98782.0 1600.0 1600.0 1599.7  0.0   13184.0  10328.6   245760.0   166402.2  16384.0 6443.0 166698 2402.889 32416  110.580 2513.469
        98787.0 1600.0 1600.0  0.0   1599.9 13184.0   2383.5   245760.0   195366.0  16384.0 6443.0 166707 2403.016 32416  110.580 2513.595


The output above contains the size of each survivor space (S0C and S1C), the utilization of each survivor space (S0U and S1U), the capacity of eden (EC), the utilization of eden (EU), the capacity of the old generation (OC), the utilization of the old generation (OU), the permanent generation capacity (PC), the permanent generation utilization (PU), the total number of young generation garbage collection events (YGC), the total amount of time spent collecting objects in the new generation (YGCT), the total number of old generation garbage collection events that have occurred (FGC), the total amount of time spent collecting objects in the old generation (FGCT), and the total time spent performing garbage collection (GCT). Capacities and utilizations are reported in KB, and times are reported in seconds.
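
Since the “-gc” output reports raw sizes, a little awk is handy for turning the OC and OU columns into a percentage over time. This is a minimal sketch that assumes the column names shown above; the old_gen_pct function name is made up, and the sample rows are copied from the output in this post:

```shell
# old_gen_pct: read "jstat -gc -t" output on stdin and print, for each
# sample, the timestamp and the old generation utilization (OU / OC) as a
# percentage.
old_gen_pct() {
    awk '
        $1 == "Timestamp" {
            # Map column names to positions so the script does not depend
            # on a fixed column order.
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("OC" in col) && NF && $(col["OC"]) > 0 {
            printf "%s %.1f\n", $(col["Timestamp"]), 100 * $(col["OU"]) / $(col["OC"])
        }'
}

# Sample rows copied from the output above. On a live system you would
# pipe the jstat command itself into the function.
sample='Timestamp S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
98772.0 1600.0 1600.0 0.0 1599.8 13184.0 5561.6 245760.0 201671.9 16384.0 6443.0 166683 2402.690 32411 110.564 2513.255
98777.0 1600.0 1600.0 1599.4 0.0 13184.0 9533.7 245760.0 156797.1 16384.0 6443.0 166690 2402.785 32414 110.573 2513.359'

printf '%s\n' "$sample" | old_gen_pct
# 98772.0 82.1
# 98777.0 63.8
```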

If you prefer to view heap utilization as percentages, you can use the “-gcutil” option (the “-h5” option used below reprints the column header after every five samples):

$ jstat -gcutil -t -h5 `pgrep java` 5000

Timestamp         S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
        99814.1   0.00  99.99  18.08  63.77  39.32 168551 2427.512 32800  111.800 2539.313
        99819.1  99.96   0.00  66.29  78.18  39.32 168562 2427.649 32800  111.800 2539.449
        99824.1 100.00   0.00  94.40  62.46  39.32 168572 2427.795 32803  111.815 2539.610
        99829.2 100.00   0.00  60.25  65.08  39.32 168580 2427.888 32806  111.824 2539.713



The output above contains the utilization of each survivor space as a percentage of the total survivor space capacity (S0 and S1), the utilization of eden as a percentage of the total eden capacity (E), the utilization of the tenured generation as a percentage of the total tenured generation capacity (O), the utilization of the permanent generation as a percentage of the total permanent generation capacity (P), the total number of young generation garbage collection events (YGC), the total time spent collecting objects in the young generation (YGCT), the total number of old generation garbage collection events (FGC), the total amount of time spent collecting objects in the old generation (FGCT), and the total garbage collection time (GCT).
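
Because the Timestamp and GCT columns are both cumulative, subtracting the first sample from the last tells you how much of that window went to garbage collection. Here is a rough sketch, assuming the column names shown above; the gc_overhead function name is made up, and the sample rows are copied from the output in this post:

```shell
# gc_overhead: read "jstat -gcutil -t" output on stdin and print the share
# of wall-clock time spent in garbage collection between the first and last
# sample, based on the Timestamp and GCT columns.
gc_overhead() {
    awk '
        $1 == "Timestamp" {            # header (repeated when -h is used)
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("GCT" in col) && NF {
            t = $(col["Timestamp"]) + 0
            g = $(col["GCT"]) + 0
            if (!seen) { t0 = t; g0 = g; seen = 1 }
            t1 = t; g1 = g
        }
        END {
            if (seen && t1 > t0)
                printf "%.1f%% of %.1f seconds spent in GC\n", 100 * (g1 - g0) / (t1 - t0), t1 - t0
        }'
}

# Sample rows copied from the output above.
sample='Timestamp S0 S1 E O P YGC YGCT FGC FGCT GCT
99814.1 0.00 99.99 18.08 63.77 39.32 168551 2427.512 32800 111.800 2539.313
99819.1 99.96 0.00 66.29 78.18 39.32 168562 2427.649 32800 111.800 2539.449
99824.1 100.00 0.00 94.40 62.46 39.32 168572 2427.795 32803 111.815 2539.610
99829.2 100.00 0.00 60.25 65.08 39.32 168580 2427.888 32806 111.824 2539.713'

printf '%s\n' "$sample" | gc_overhead
# 2.6% of 15.1 seconds spent in GC
```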

To get the time spent in garbage collection along with the reason the collection occurred, jstat can be run with the “-gccause” option:

$ jstat -gccause -t `pgrep java` 1000

Timestamp         S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT    LGCC                 GCC
       100157.3  99.96   0.00  66.27  63.82  39.32 169160 2435.394 32925  112.202 2547.595 CMS Initial Mark     No GC
       100158.3   0.00  99.99  32.14  67.72  39.32 169163 2435.430 32925  112.202 2547.631 unknown GCCause      No GC
       100159.3   0.00  99.97  50.22  65.10  39.32 169165 2435.454 32927  112.208 2547.662 CMS Initial Mark     No GC
       100160.3  99.97   0.00   6.02  62.46  39.32 169168 2435.493 32928  112.211 2547.704 unknown GCCause      No GC
       100161.3  99.97   0.00  32.14  62.46  39.32 169168 2435.493 32928  112.211 2547.704 unknown GCCause      No GC



There are also options to print class loader activity and HotSpot compiler statistics, and to break down utilization by generation (this is extremely useful when you’re trying to profile a specific memory pool). There are a number of incredibly useful open source tools for visualizing garbage collection data, and I hope to talk about these in the near future.

Posted by matty, filed under Java. Date: January 16, 2008, 9:18 pm | 1 Comment »
