Using the DTrace hotspot provider to observe java object allocations

In my previous post, I discussed how jmap can be used to view the contents of the Java heap. Jmap is a wonderful utility for viewing heap utilization at a specific point in time, but it’s not the best utiltiy for answer questions like “which call stack is allocating objects of type foo?”, or “how many objects are being allocated per second?”, or “what is the average size of allocated objects?”. To answer these types of questions, we need to look to a more thorough instrumentation framework.

One such framework is the DTrace hotspot provider which was introduced in Java 6, and provides a number of probes to dynamically instrument Java applications. These probes allow you to observe thread behavior, garbage collection, method calls, class loader activity, monitor contention and object allocation. The probe to observe object allocation is rightfully named “object-alloc,” and contains several arguments (since the probe arguments are documented extremely well in the Java community, I won’t describe them here) that can be used to understand how the heap is being used.

To use the object-alloc probe, you will need to start the JVM with the “-XX:+DTraceAllocProbes” option, or run the jinfo utility to enable it at runtime (enabling it at runtime is preferred):

$ jinfo -XX:+DTraceAllocProbes `pgrep java`

Once the object-alloc probe is enabled, you can begin prodding your application to see what it’s doing. If you are seeing lots of garbage collection events in jconsole, jstat or your garbage collection logs, the object allocation probe can be used to display object allocations by object type, the number of objects being allocated, and the size of these allocations. The newobjects.d DTrace script provides the necessary probe definitions to display this information:

$ newobjects.d

Class                           Objects created  Bytes Allocated
[I                              112              16064     
Wasteful                        17943            287088    
java/lang/StringBuilder         17943            287088    
java/lang/String                17943            430632    
[C                              107658           52393560  

Class                           Objects created  Bytes Allocated
[I                              121              18520     
Wasteful                        19541            312656    
java/lang/StringBuilder         19541            312656    
java/lang/String                19542            469008    
[C                              117252           57062640  

If you see tons of object allocations, you can use the DTrace quantize() action to create a histogram with object counts or object sizes. The objectsize.d DTrace script shows how to create a distribution that is keyed on object size:

$ objectsize.d

         value  ------------- Distribution ------------- count    
             8 |                                         0        
            16 |@@@@@@@@@@@@@                            95897    
            32 |@@@@                                     31965    
            64 |                                         1        
           128 |@@@@                                     32157    
           256 |@@@@                                     31965    
           512 |@@@@@@@@@                                63932    
          1024 |@@@@                                     31967    
          2048 |                                         0        

Once we know how many objects are being allocated, it would useful to see which Java call stacks are responsible for allocating these objects. To generate a list of Java calls stacks with the number of objects allocated per call stack, we can run the whoallocatecount.d DTrace script:

$ whoallocatecount.d

     < ..... >

     libjvm.so`__1cNSharedRuntimeTdtrace_object_alloc6FpnHoopDesc__i_+0x4f
     
     Wasteful.main([Ljava/lang/String;)V
     StubRoutines (1)
     libjvm.so`_pnGThread__v_+0x1a3
     libjvm.so`jni_CallStaticVoidMethod+0x15d
     java`JavaMain+0xd30
     libc.so.1`_thr_setup+0x52
     libc.so.1`_lwp_start
          5678

In the output above (which was cleaned up to make it easy to view), we can see that 5678 objects where allocated by the main() method in the Wasteful class. If you would prefer to see call stacks by bytes allocated, you can run the whoallocatebybytes.d script:

$ whoallocatebytes.d

     < ..... >

     libjvm.so`__1cNSharedRuntimeTdtrace_object_alloc6FpnHoopDesc__i_+0x4f
     
     Wasteful.main([Ljava/lang/String;)V
     StubRoutines (1)
     libjvm.so`_pnGThread__v_+0x1a3
     libjvm.so`jni_CallStaticVoidMethod+0x15d
     java`JavaMain+0xd30
     libc.so.1`_thr_setup+0x52
     libc.so.1`_lwp_start
          817632

The hotspot provider is an incredible debugging tool, and it’s pretty amazing what you can do with just one probe!! If you decide to use these probes in production, be somewhat cautious. Selectively enable probes, and do so only for the duration of your troubleshooting session. Be safe, be careful, and most of all, have fun! Niiiiiiice!

Getting the Solaris format utility to work with an expanded LUN

A while back I wrote an article titled dynamically growing a Clariion LUN with Solaris. In the article I described how to update the VTOC on a UN that was resized on the storage array. One of my colleagues came to me a few weeks back and told me the procedure was not working, and he couldn’t assign a new size (200GB in this case) to the device in format.

This made me curious, so I started poking around. To see what Solaris thought the actual size of the LUN was, I ran luxadm with the “display” option:

$ luxadm display /dev/rdsk/c5t600601602B9318008CD071B0FF7BDB11d0s2

DEVICE PROPERTIES for disk: /dev/rdsk/c5t600601602B9318008CD071B0FF7BDB11d0s2
  Vendor:               DGC     
  Product ID:           RAID 5          
  Revision:             0322
  Serial Num:           APM000612076
  Unformatted capacity: 204800.000 MBytes
  Read Cache:           Enabled
    Minimum prefetch:   0x0
    Maximum prefetch:   0x0
  Device Type:          Disk device
  Path(s):

  /dev/rdsk/c5t600601602B9318008CD071B0FF7BDB11d0s2
  /devices/scsi_vhci/disk@g600601602b9318008cd071b0ff7bdb11:c,raw
   Controller           /dev/cfg/c3
    Device Address              5006016039a00e91,1
    Host controller port WWN    10000000c9563354
    Class                       secondary
    State                       STANDBY
   Controller           /dev/cfg/c3
    Device Address              5006016939a00e91,1
    Host controller port WWN    10000000c9563354
    Class                       primary
    State                       ONLINE
   Controller           /dev/cfg/c4
    Device Address              5006016139a00e91,1
    Host controller port WWN    10000000c9563355
    Class                       secondary
    State                       STANDBY
   Controller           /dev/cfg/c4
    Device Address              5006016839a00e91,1
    Host controller port WWN    10000000c9563355
    Class                       primary
    State                       ONLINE

On the host we were having issues with, Solaris was able to see all 200GB, but for some reason the format “Auto Configure” option still thought the LUN was 50GB. When I chose the format type “DEFAULT,” the LUN showed up with a capacity of 200GB:

$ format -e

format> type
AVAILABLE DRIVE TYPES:
        0. Auto configure
        1. DEFAULT
        2. DEFAULT
        3. DEFAULT
        4. other
Specify disk type (enter its number)[2]: 1
selecting c5t600601602B9318008CD071B0FF7BDB11d0
[disk formatted]

partition> p
Current partition table (original):
Total disk cylinders available: 6524 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders         Size            Blocks
  0 unassigned    wm       1 - 26104      199.97GB    (26104/0/0) 419360760
  1 unassigned    wm       0                0         (0/0/0)             0
  2     backup    wu       0 - 26104      199.97GB    (26105/0/0) 419376825
  3 unassigned    wm       0                0         (0/0/0)             0
  4 unassigned    wm       0                0         (0/0/0)             0
  5 unassigned    wm       0                0         (0/0/0)             0
  6 unassigned    wm       0                0         (0/0/0)             0
  7 unassigned    wm       0                0         (0/0/0)             0
  8       boot    wu       0 -     0        7.84MB    (1/0/0)         16065
  9 unassigned    wm       0                0         (0/0/0)             0

After chatting with some qualified Sun folks, this madness started to make sense. When you select “Auto Configure,” format will choose an entry from the format.dat file instead of determing the drive’s capacity from the SCSI READ CAPACITY command. The DEFAULT option causes this command to be sent, and therefore the new capacity is available. For future reference, running format with the “-e” option and choosing “DEFAULT” seems to work.

Locating the device that contains an EXT3 label

On most Linux hosts, the first field of the /etc/fstab file contains labels instead of disk partions. This simplifies file system management, since you don’t have to update the fstab file if you move a drive to a new controller, or add additional drives to a system. If you want to locate the partition that is associated with a label, you can use the findfs utility:

$ /sbin/findfs LABEL=/
/dev/hda1

You can also use the findfs utility to locate a partition by UUID

$ /sbin/findfs UUID=b4ce6d24-000c-45a3-8258-cbf9f826c0ce
/dev/hda1

The findfs utility is extremely useful, and is just one of a number of cool programs (others include blkid, e2label, partinfo and findsuper) in the e2fsprogs package! Giddie up!

Viewing the contents of an ISO image from the command line

I had several ISO images lying around on my desktop, and decided to go through each image and purge the ones I no longer need. Several image files weren’t labeled, so I decided to fire up the isoinfo utility to figure out what the mystery ISO images were. The first image I checked was a Solaris 10 update 4 DVD, as you can see from the “Volume id” field in the isoinfo “-d” (print primary volume descriptor) output:

$ isoinfo -d -i 1.iso |more

CD-ROM is in ISO 9660 format
System id: Solaris
Volume id: SOL_10_807_X86
Volume set id: 
Publisher id: 
Data preparer id: 
Application id: MKISOFS ISO 9660/HFS FILESYSTEM BUILDER & CDRECORD CD-R/DVD CRE
ATOR (C) 1993 E.YOUNGDALE (C) 1997 J.PEARSON/J.SCHILLING
Copyright File id: 
Abstract File id: 
Bibliographic File id: 
Volume set size is: 1
Volume set sequence number is: 1
Logical block size is: 2048
Volume size is: 1327235
El Torito VD version 1 found, boot catalog is in sector 42171
NO Joliet present
Rock Ridge signatures version 1 found
Eltorito validation header:
    Hid 1
    Arch 0 (x86)
    ID ''
    Key 55 AA
    Eltorito defaultboot header:
        Bootid 88 (bootable)
        Boot media 0 (No Emulation Boot)
        Load segment 0
        Sys type 0
        Nsect 4
        Bootoff A4BC 42172

To double check that the volume descriptor was correct, I also ran isoinfo with the “-l” (list files) option to generate a file listing of each ISO image:

$ isoinfo -l -i 1.iso |more

Directory listing of /
d---------   0    0    0            4096 Aug 16 2007 [    632 02]  . 
d---------   0    0    0            4096 Aug 16 2007 [    632 02]  .. 
----------   0    0    0            2048 Aug 16 2007 [  42171 00]  .CATALOG 
----------   0    0    0              92 Jun 12 2007 [  42231 00]  .CDTOC 
d---------   0    0    0            2048 Aug 16 2007 [  42045 02]  .INSTALL 
----------   0    0    0               0 Aug 16 2007 [  42173 00]  .INSTALL_CONFIG 
----------   0    0    0             419 Jun 12 2007 [  42232 00]  .SLICEMAPFILE 
d---------   0    0    0            2048 Aug 16 2007 [  42041 02]  BOOT 
----------   0    0    0            6128 Jun 12 2007 [  42233 00]  COPYRIGHT 
----------   0    0    0             257 Aug 16 2007 [  42236 00]  INSTALLER 
----------   0    0    0          459760 Jun 12 2007 [  42237 00]  JDS-THIRDPARTYLICENSEREADME 
d---------   0    0    0            2048 Aug 16 2007 [    634 02]  LICENSE 
d---------   0    0    0            2048 Aug 16 2007 [    635 02]  SOLARIS_10 
----------   0    0    0              19 Jun 12 2007 [  42462 00]  _VOLUME.INF 
----------   0    0    0              21 Aug 16 2007 [  42463 00]  _VOLUME_INF.2 
----------   0    0    0              21 Aug 16 2007 [  42464 00]  _VOLUME_INF.3 
----------   0    0    0              21 Aug 16 2007 [  42465 00]  _VOLUME_INF.4 
----------   0    0    0              21 Aug 16 2007 [  42466 00]  _VOLUME_INF.5 

With this information in hand, I added useful labels to the mystery images. Nice!

Concert review: Alice In Chains acoustic hour

Roughly ten years ago one of my friends loaned me his copy of the Alice In Chain’s MTV unplugged CD. Since that time, I’ve listened to that CD thousands of times, and have become one of the biggest Alice In Chains fans on the planet. I have had the opportunity to see Jerry Cantrell play a number of shows across the country, and attended an Alice In Chains show earlier this year. During that show, the guys from AIC played an awesome acoustic set, which left me wanting more. A few months back I got notified that AIC was doing a number of acoustic shows across the country, so I decided to head down to New Orleans to check one out.

New Orleans is an incredible town, so I decided to wander down a couple of days early to relax and have a little fun. While relaxing on the top of the Omni hotel the night before the show, I looked over at the bar and saw Mike Inez (the Alice In Chain’s bassist) drinking it up with his friends. I was completely taken by this, and couldn’t believe that a member of one of my favorite bands was sitting twenty feet away from me. When Mike got up to leave, I stopped him for a second to tell him we ventured down to see him, and thanked him for putting on the acoustic show (I totally respect peoples privacy, so I kept the conversation to two sentences). He smiled, thanked us for coming, and then headed out to have some fun with his friends.

On the day of the show, I was still flying high from meeting one of the band members (I saw Jerry Cantrell in the hotel bar, but he was less than friendly for some reason), so I was hoping the acoustic show would elevate things to the next level. The band took the stage around 9:30pm, and opened the show with the song “Don’t Follow.” The sound system at the House of Blues really made the acoustic instruments stand out, and the opening song sounded absolutely incredible! Jerry, William, Mike and Sean followed up the opening song with a slew of classic AIC songs, including “Heaven Beside You,” “Brother,” “Nutshell,” “Killer is Me,” “Sludge Factory,” “Love Hate Love,” “No Excuses,” “Down in a Hole,” “Angry Chair,” “Got me Wrong,” “Would?” and “Rooster.” This show has to go down as one of the best concerts I have ever attended, and hearing these guys play acoustic made my year! Viva la AIC!!!

Summarizing Java heap utilization with jmap

Java has become one of the most successful languages to hit the IT industry. One of the reasons behind it’s high rate of adoption is that fact that Java manages memory resources for the programmer. This makes programming significantly easier, but introduces additional complexity, since engineers need to size the Java heap and pick a garbage collection algorithm (or in the case of generational collectors, more than one) that best matches an application’s workload. Fortunately, two extremely useful tools allow you to observe the Java heap: the DTrace hotspot provider and the Java jmap utility. In this blog post, I will discuss the jmap utility.

The Java jmap utility provides a number of useful options to summarize heap usage, and get a break down of objects in the new and old generations. To summarize the new and old generations, the jmap utility can be run with the “-heap” option, and the PID of the JVM to summarize (the PID can be acquired by running the jps utility, or by reviewing the ps output):

$ jmap -heap `pgrep java`

Attaching to process ID 5365, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 1.6.0_01-b06

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 1073741824 (1024.0MB)
   NewSize          = 268435456 (256.0MB)
   MaxNewSize       = 268435456 (256.0MB)
   OldSize          = 805306368 (768.0MB)
   NewRatio         = 7
   SurvivorRatio    = 6
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 88080384 (84.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 234881024 (224.0MB)
   used     = 102062424 (97.33431243896484MB)
   free     = 132818600 (126.66568756103516MB)
   43.452818053109304% used
Eden Space:
   capacity = 201326592 (192.0MB)
   used     = 94318424 (89.94905853271484MB)
   free     = 107008168 (102.05094146728516MB)
   46.84846798578898% used
From Space:
   capacity = 33554432 (32.0MB)
   used     = 7744000 (7.38525390625MB)
   free     = 25810432 (24.61474609375MB)
   23.07891845703125% used
To Space:
   capacity = 33554432 (32.0MB)
   used     = 0 (0.0MB)
   free     = 33554432 (32.0MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 805306368 (768.0MB)
   used     = 15032688 (14.336288452148438MB)
   free     = 790273680 (753.6637115478516MB)
   1.8667042255401611% used
Perm Generation:
   capacity = 50577408 (48.234375MB)
   used     = 30285240 (28.88225555419922MB)
   free     = 20292168 (19.35211944580078MB)
   59.87898786746842% used

The jmap output contains the size of eden (the place where new objects are created), each survivor space, the old generation and the permanent generation. If you aren’t familiar with the generations that are used in modern JVMs, you might want to take a look at Tony Printezis’ article on this topic.

In addition to printing summary data, jmap can also be used to display the objects in the heap.

$ jmap -histo `pgrep java` |more

num   #instances    #bytes  class name
--------------------------------------
  1:    224437    27673848  [C
  2:     38611    23115312  [B
  3:     47801    12187536  [I
  4:    208624     8344960  java.lang.String
  5:     45332     6192904  <constMethodKlass>
  6:     45332     5450864  <methodKlass>
  7:      3889     4615536  <constantPoolKlass>
  8:     45671     4193136  [Ljava.lang.Object;
  9:     66203     3222312  <symbolKlass>
 10:      3889     3192264  <instanceKlassKlass>
 11:      3455     2999296  <constantPoolCacheKlass>
 12:     19754     1106224  java.nio.HeapCharBuffer

< ..... >

This summary contains the type of each object (e.g., “[C” would refer to arrays of characters), the number of objects of each type as well as the total number of bytes these objects take up in the heap. Jmap additionally allows you to filter the output to display only the live objects:

$ jmap -histo:live 698

num   #instances    #bytes  class name
--------------------------------------
  1:     45256     6185464  
  2:     45256     5441744  
  3:      6899     5201176  [B
  4:      3851     4577776  
  5:     29133     4259592  [C
  6:     66069     3213448  
  7:      3851     3169464  
  8:      3417     2970112  
  9:     29714     1188560  java.lang.String
 10:      1553      745224  
 11:      4161      732336  java.lang.Class

< ..... >

There are also jmap options to dump the heap to a file, display the contents of the permanent generation and view obejcts that are awaiting finalization. Jmap is an incredibly useful tool, and in my next blog post I will show how the DTrace hotspot provider can be used to dig into object allocation in much more detail.