Instrumenting Java code with JSDT probes

A few months back, I got wind from the world-famous Jarod Jenson that Keith McGuigan was working on adding JSDT probes to Java. JSDT probes struck me as an extremely useful feature for real-time problem analysis, so I decided to instrument some sample code to see how they worked. After a couple of e-mail exchanges with Keith, I had a working Java build that supported JSDT. After reviewing a number of Java projects, I chose to instrument the PostgreSQL type 4 JDBC driver, since I wanted to learn PostgreSQL along with JDBC. After reading through a fair amount of the driver code and Keith’s blog, I decided to add the following probes to the PostgreSQL driver:

$ dtrace -l -m java_tracing

   ID   PROVIDER            MODULE                          FUNCTION NAME
13199 postgresqljdbc1619      java_tracing                       unspecified transactioncommit
13200 postgresqljdbc1619      java_tracing                       unspecified transactionrollback
13201 postgresqljdbc1619      java_tracing                       unspecified executesqlstatementstart
13202 postgresqljdbc1619      java_tracing                       unspecified getconnectionfrompoolstop
13203 postgresqljdbc1619      java_tracing                       unspecified executepreparessqlstatementstart
13204 postgresqljdbc1619      java_tracing                       unspecified transactionautomcommit
13205 postgresqljdbc1619      java_tracing                       unspecified releaseconnectionstop
13206 postgresqljdbc1619      java_tracing                       unspecified getconnectionfrompoolstart
13207 postgresqljdbc1619      java_tracing                       unspecified executepreparedsqlstatementstop
13208 postgresqljdbc1619      java_tracing                       unspecified returnconnectiontopoolstop
13209 postgresqljdbc1619      java_tracing                       unspecified transactionsavepoint
13210 postgresqljdbc1619      java_tracing                       unspecified acquireconnectionstart
13211 postgresqljdbc1619      java_tracing                       unspecified returnconnectiontopoolstart
13212 postgresqljdbc1619      java_tracing                       unspecified releaseconnectionstart
13213 postgresqljdbc1619      java_tracing                       unspecified acquireconnectionstop
13214 postgresqljdbc1619      java_tracing                       unspecified executesqlstatementstop



DTrace uses providers to group probes, so the first thing I did was create a “postgresqljdbc” provider that would be visible to DTrace through the JSDT framework. This was achieved by defining an interface that extends the com.sun.tracing.Provider interface:

package org.postgresql;

public interface postgresqljdbc extends com.sun.tracing.Provider {

    // Probes that fire when a simple SQL statement begins / ends execution
    void executesqlstatementstart(String query);
    void executesqlstatementstop(String query);

    // Probes that fire when a prepared statement begins / ends execution
    void executepreparessqlstatementstart(String query);
    void executepreparedsqlstatementstop(String query);

    // Probes that fire when a connection is acquired / released
    void acquireconnectionstart(String host, int port, String database, String user);
    void acquireconnectionstop(String host, int port, String database, String user);

    void releaseconnectionstart(String host, int port, String user, String passwd);
    void releaseconnectionstop(String host, int port, String user, String passwd);

    // Probes that fire when a DB connection pool is accessed / released
    void getconnectionfrompoolstart(String host, int port, String user, String passwd);
    void getconnectionfrompoolstop(String host, int port, String user, String passwd);

    void returnconnectiontopoolstart(String host, int port, String user, String passwd);
    void returnconnectiontopoolstop(String host, int port, String user, String passwd);

    // Probes that fire when a transaction sets autocommit, creates a savepoint, commits, or rolls back
    void transactionautomcommit(boolean value);
    void transactionsavepoint(String value);
    void transactioncommit();
    void transactionrollback();
}



Now that the provider was defined, I needed to import the Provider interface and ProviderFactory class that are part of the JSDT framework, and add the code to instantiate a new postgresqljdbc object:

import com.sun.tracing.Provider;
import com.sun.tracing.ProviderFactory;

public class Driver implements java.sql.Driver
{
   .....

    // The provider is created once when the Driver class is loaded; probe points are
    // defined by calling methods on this object from elsewhere in the driver
    public static ProviderFactory factory = ProviderFactory.getDefaultFactory();
    public static postgresqljdbc provider = factory.createProvider(postgresqljdbc.class);



Next, I started adding probe points to the driver. This was super easy, since a probe point is nothing more than a call to one of the methods declared in the provider interface. Here are the probes I added to capture SQL statement execution:

    public boolean executeWithFlags(String p_sql, int flags) throws SQLException
    {
        // Probe fires as the SQL statement begins execution
        org.postgresql.Driver.provider.executesqlstatementstart(p_sql);
        checkClosed();
        p_sql = replaceProcessing(p_sql);
        Query simpleQuery = connection.getQueryExecutor().createSimpleQuery(p_sql);
        execute(simpleQuery, null, QueryExecutor.QUERY_ONESHOT | flags);
        this.lastSimpleQuery = simpleQuery;

        // Probe fires after the SQL statement has finished executing
        org.postgresql.Driver.provider.executesqlstatementstop(p_sql);

        // 'result' is a field of the enclosing statement class, populated by execute()
        return (result != null && result.getResultSet() != null);
    }



In this code block, the “org.postgresql.Driver.provider.executesqlstatementstart” probe fires when the SQL statement begins execution, and the “executesqlstatementstop” probe fires after the SQL statement has executed. In both cases, the argument available to DTrace scripts is the SQL statement (represented as a String) that is being executed.

Next up, I ran ant to build a new PostgreSQL driver, and executed a test program to verify that everything worked:

$ export CLASSPATH=/home/matty/java/postgrestest:/home/matty/java/postgrestest/postgresql.jar

$ export JAVA_HOME=/usr/java/

$ java -Djdbc.drivers=org.postgresql.Driver -XX:+ExtendedDTraceProbes Test

The Test program does nothing more than create SQL connections, execute SQL statements and print the results to STDOUT. To allow me to see the difference between prepared and unprepared statements, I decided to add unique probes for each statement type.
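For reference, here is a minimal sketch of what a test program along these lines might look like (this is not the actual Test class; the connection settings, password, and column access are made up for illustration):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class Test {
    public static void main(String[] args) throws Exception {
        // The driver itself is loaded via -Djdbc.drivers=org.postgresql.Driver on the command line.
        // Opening a connection should fire the acquireconnectionstart / stop probes.
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost:5432/postgres", "matty", "secret");

        // A simple statement should fire the executesqlstatementstart / stop probes.
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("select * from matty");
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
        rs.close();
        stmt.close();

        // A prepared statement should fire the executeprepared* probes.
        PreparedStatement ps = conn.prepareStatement("update matty set col1 = ?");
        ps.setString(1, "one");
        ps.executeUpdate();
        ps.close();

        conn.close();
    }
}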

After debugging a few issues, I was able to run the Test program and a few DTrace scripts I had created. The first script, sqlconnections.d, listed new connections to the database and the time it took to create these connections:

$ ./sqlconnections.d

TIMESTAMP             HOST                  PORT        DATABASE    USER        TIME      
2008 Oct 10 15:59:47  localhost             5432        postgres    matty       17048632  
2008 Oct 10 15:59:51  localhost             5432        postgres    matty       17597898  



The next script, sqltimes.d, listed each SQL statement that was executed, along with the number of executions and the cumulative time spent running it:

$ sqltimes.d

Tracing started (CNTRL+C to end)
^C
SQL statements:

SQL QUERY                                                          COUNT        TIME
update matty set col1='one'                                            2      722359
select * from matty                                                    4     1456311


Prepared statements:

SQL QUERY                                                          COUNT        TIME
update matty set col1='one'                                            2      548765
select * from matty                                                    4     1345987



The third script, sqltrace.d, displayed each SQL statement as it executed, along with its execution time:

$ ./sqltrace.d

TIMESTAMP             SQL QUERY                                                     TIME      
1970 Mar  1 16:59:06  select * from matty                                           314021    
1970 Mar  1 16:59:06  select * from matty                                           251793    
1970 Mar  1 16:59:06  update matty set col1='one'                                   308901    



I also had fun playing with the transaction and JDBC connection pool probes, but that data wasn’t nearly as interesting as the SQL statement execution times listed above. If you are running OpenJDK on Solaris and you want better visibility into your applications, JSDT may well be worth a look! I’m still not 100% sure what kind of performance impact these probes will have on an application, but I will wait for the probes to be integrated into a mainstream Java build before doing any real performance testing. Thanks to Keith for answering my e-mails, and to team DTrace for creating the awesomeness that is DTrace!

Debugging Java performance problems presentation

I recently gave a presentation at the local UNIX users group titled Debugging Java performance problems. The presentation describes various open source tools and how they can be used to understand what is causing CPU, memory and lock contention issues inside a Java virtual machine. If there are additional tools not discussed in the presentation that you find useful for debugging Java performance problems, please let me know through the comment feature.

Measuring the time an application was stopped due to garbage collection

I recently spent some of my spare time assisting a friend with debugging some Java performance problems his company was experiencing. When I originally looked into the performance problem several weeks back, I used the mpstat and jstat utilities to observe CPU utilization and object allocations, and based on some jstat anomalies, I used the techniques described in my object allocation post to get a better understanding of how their Java application was allocating objects. After a bit of analysis and a couple of e-mail exchanges with my friend and one of the developers he worked with, we were able to locate two application problems that the developer has since fixed.

But even with these changes (which resulted in some significant speedups!!), my friend noticed that request times would periodically shoot up to unacceptable levels. After a bit more analysis with jstat, I noticed that the rate of object allocation in the new generation was still relatively high, and started to wonder if the current size of the new generation was limiting throughput. To see if this was the case, I had my friend add the “PrintGCApplicationConcurrentTime” and “PrintGCApplicationStoppedTime” options to the Java command line:

$ java -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime …

These options will instruct the Java process to print the time an application is actually running, as well as the time the application was stopped due to GC events. Here are a few sample entries that were produced:

$ egrep '(Application|Total time)' gc.log | more
Application time: 3.3319318 seconds
Total time for which application threads were stopped: 0.7876304 seconds
Application time: 2.1039898 seconds
Total time for which application threads were stopped: 0.4100732 seconds
…..

To get a better idea of how much time the application was running vs. stopped, I created a script (gctimes) to summarize the log data. Gctimes takes a GC log as input, and prints the total execution time, the time the application ran, the time the application was stopped, and the percentage of time the application spent in the running and stopped states. Here is some sample output from a short run:

$ gctimes gc.log

Total execution time               : 66.30secs
Time application ran               : 55.47secs
Time application was stopped       : 10.84secs
% of time application ran          : 83.65%
% of time application was stopped  : 16.35%
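
For readers who want to roll their own summarizer, here is a minimal sketch of how something like gctimes could be implemented (this is an assumption, not the actual script; it only parses the two log line formats shown above, and it assumes the GC output was captured to a file, for example with -Xloggc:gc.log or shell redirection):

import java.io.BufferedReader;
import java.io.FileReader;

public class GcTimes {
    public static void main(String[] args) throws Exception {
        double ran = 0.0, stopped = 0.0;
        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        String line;
        while ((line = in.readLine()) != null) {
            // Lines like: "Application time: 3.3319318 seconds"
            if (line.startsWith("Application time:")) {
                ran += parseSeconds(line);
            // Lines like: "Total time for which application threads were stopped: 0.7876304 seconds"
            } else if (line.startsWith("Total time for which application threads were stopped:")) {
                stopped += parseSeconds(line);
            }
        }
        in.close();

        double total = ran + stopped;
        System.out.printf("Total execution time               : %.2fsecs%n", total);
        System.out.printf("Time application ran               : %.2fsecs%n", ran);
        System.out.printf("Time application was stopped       : %.2fsecs%n", stopped);
        System.out.printf("%% of time application ran          : %.2f%%%n", ran / total * 100);
        System.out.printf("%% of time application was stopped  : %.2f%%%n", stopped / total * 100);
    }

    // Pull the floating point number of seconds out of a log line
    private static double parseSeconds(String line) {
        String[] fields = line.split("\\s+");
        return Double.parseDouble(fields[fields.length - 2]);
    }
}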



Based on the results above, the fact that objects were “spilling” into the old generation, and the observation that the tenuring threshold for most objects was extremely low, it appeared that increasing the size of the new generation (they were using the default size) would help decrease the time the application was paused. I asked my friend to double the values of the “NewSize” and “MaxNewSize” runtime options (the size of each Java generation should be chosen carefully, based on empirical testing), and that appears to have fixed their latency problem. As I research the area of Java performance more and more, I am starting to realize that a myriad of factors can lead to poor performance. I hope to share some additional Java performance monitoring tools I have written in future posts.

Profiling Java methods with the heap profiling agent

The Java SDK comes with a number of tools and JVM options that can be used to analyze the performance of the Java runtime. One extremely useful tool is the heap profiler agent, which provides facilities to profile memory usage, CPU utilization and lock contention. To load the profiler agent to profile CPU utilization, you can add the “-agentlib:hprof=cpu=times” option to your java command line:

$ java -Xms256m -Xmx256m -verbose:gc -agentlib:hprof=cpu=times App

Once loaded, the agent will use byte code injection (BCI) to instrument each method’s entry and return points. This allows the agent to measure the number of times each method was called, the time spent in each method, and the call chain that led to the method being invoked. To utilize the agent to its full potential, you will need to exercise the application while the agent is active. Once the runtime has been exercised, you can hit Ctrl+C to stop the process. This will cause the agent to write the data it has collected to the file java.hprof.txt, which can be viewed with your favorite pager or editor:

$ more java.hprof.txt

CPU TIME (ms) BEGIN (total = 40712194) Mon Jan 21 19:23:12 2008
rank   self  accum   count trace method   
   1 38.52% 38.52% 1036273 301143 java.util.Random.next   
   2 19.57% 58.09%  518136 301144 java.util.Random.nextDouble   
   3 11.87% 69.96%  518136 301145 java.lang.Math.random
   4 11.05% 81.01% 1036274 301141 java.util.concurrent.atomic.AtomicLong.get
   5 10.53% 91.54% 1036273 301142 java.util.concurrent.atomic.AtomicLong.compareAndSet
   6  8.14% 99.68%  259068 301146 TestMain.foo
   7  0.05% 99.73%       1 300969 java.security.Permissions.add
   8  0.04% 99.77%       1 301106 java.lang.Class.privateGetDeclaredFields
   9  0.03% 99.79%       1 301138 java.util.Random.<init>
  10  0.02% 99.81%       2 300908 java.io.FilePermission$1.run
  11  0.01% 99.82%       1 301283 java.lang.ThreadGroup.remove
  12  0.01% 99.83%       1 300820 sun.net.www.protocol.file.Handler.createFileURLConnection
......



The java.hprof.txt file contains the number of times each method was invoked, as well as the amount of CPU time (as a percentage) that was spent in each method. To see how a method was called, you can search the profiler output for the trace identifier that is listed alongside the profiling data. This will produce a Java stack trace similar to the following, which shows how a given method was invoked:

TRACE 301143:
        java.util.Random.next(Random.java:Unknown line)
        java.util.Random.nextDouble(Random.java:Unknown line)
        java.lang.Math.random(Math.java:Unknown line)
        TestMain.foo(TestMain.java:Unknown line)



The BCI approach introduces a fair amount of overhead to the Java runtime. In cases where the overhead is hindering testing, you can use the agent’s “cpu=samples” option instead. This will cause the agent to sample the runtime environment at periodic intervals to see which methods are executing. While this approach is not as accurate as the BCI approach, it provides a good set of results with less runtime overhead. The Java profiling agent is incredibly useful, and just one of the vast number of tools that are available to profile Java applications.
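Sampling mode is enabled with an invocation along the following lines (the interval and depth sub-options shown here are illustrative; interval sets the sampling period in milliseconds and depth sets how many stack frames are recorded per sample):

$ java -agentlib:hprof=cpu=samples,interval=10,depth=8 App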

Getting alerts when Java processes crash

When bugs occur in the Java runtime environment, most administrators want to be notified so they can take corrective action. These actions can range from restarting a Java process to collecting postmortem data or calling in application support personnel to debug the situation further. The Java runtime has a number of useful options that can be used for this purpose. The first option is “-XX:OnOutOfMemoryError”, which allows a command to be run when the runtime environment encounters an out-of-memory condition. When this option is combined with the logger command line utility:

$ java -XX:OnOutOfMemoryError="logger Java process %p encountered an OOM condition" …

Syslog entries similar to the following will be generated each time an OOM event occurs:

Jan 21 19:59:17 nevadadev root: [ID 702911 daemon.notice] Java process 19001 encountered an OOM condition
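If you also want postmortem data to go along with the alert, the OOM handler can be combined with the JVM’s built-in heap dump options (a minimal sketch; the dump path below is illustrative):

$ java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp -XX:OnOutOfMemoryError="logger Java process %p encountered an OOM condition" …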

Another super useful option is “-XX:OnError”, which allows a command to be run when the runtime environment incurs a fatal error (i.e., a hard crash). When this option is combined with the logger utility:

$ java -XX:OnError="logger Java process %p encountered a fatal condition" …

Syslog entries similar to the following will be generated when a fatal event occurs:

Jan 21 19:52:17 nevadadev root: [ID 702911 daemon.notice] Java process 19004 encountered a fatal condition

The options above allow you to run one or more commands when these errors are encountered, so you could chain together a postmortem debugging tool, a utility (logger or mail) to generate alerts, and a restarter script to start a new Java process (this assumes you aren’t using SMF). Nice!
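Here is a rough sketch of what such a chain might look like on Solaris (the gcore and logger combination and the restarter script path are made up for illustration; OnError accepts multiple commands separated by semicolons, and %p expands to the process id):

$ java -XX:OnError="gcore %p; logger Java process %p encountered a fatal condition; /opt/scripts/restart_app.sh %p" …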

Monitoring Java garbage collection with jstat

Java memory management revolves around the garbage collector, which is the entity responsible for traversing the heap and freeing space that is being taken up by unreferenced objects. Garbage collection makes life easier for Java programmers, since it frees them from having to explicitly manage memory resources (this isn’t 100% true, but close enough). In the Java runtime environment, there are two types of collections that can occur. The first type is the minor collection. Minor collections are responsible for locating live objects in the young generation (eden), copying these objects to the inactive survivor space, and moving tenured objects from the active survivor space to the old (tenured) generation (this assumes that a generational collector is being used). The second type is the major collection, which frees unreferenced objects in the tenured generation and optionally compacts the heap to reduce fragmentation.

When debugging performance problems, it is extremely useful to be able to monitor object allocations and frees in the new and old generations. The Java Development Kit comes with the jstat utility, which provides a ton of visibility into what the garbage collector is doing, as well as a slew of information on how each generation is being utilized. To use jstat to display garbage collection statistics for the new, old and permanent generations, jstat can be invoked with the “-gc” (print garbage collection heap statistics) option, the “-t” (print the total number of seconds the JVM has been up) option, the process id to retrieve statistics from, and an optional interval to control how often statistics are printed:

$ jstat -gc -t `pgrep java` 5000

Timestamp        S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT   
        98772.0 1600.0 1600.0  0.0   1599.8 13184.0   5561.6   245760.0   201671.9  16384.0 6443.0 166683 2402.690 32411  110.564 2513.255
        98777.0 1600.0 1600.0 1599.4  0.0   13184.0   9533.7   245760.0   156797.1  16384.0 6443.0 166690 2402.785 32414  110.573 2513.359
        98782.0 1600.0 1600.0 1599.7  0.0   13184.0  10328.6   245760.0   166402.2  16384.0 6443.0 166698 2402.889 32416  110.580 2513.469
        98787.0 1600.0 1600.0  0.0   1599.9 13184.0   2383.5   245760.0   195366.0  16384.0 6443.0 166707 2403.016 32416  110.580 2513.595


The output above contains the size of each survivor space (S0C && S1C), the utilization of each survivor space (S0U && S1U), the capacity of eden (EC), the utilization of eden (EU), the capacity of the old generation (OC), the utilization of the old generation (OU), the permanent generation capacity (PC), the permanent generation utilization (PU), the total number of young generation garbage collection events (YGC), the total amount of time spent collecting objects in the new generation (YGCT), the total number of old generation garbage collection events that have occurred (FGC), the total amount of time spent collecting objects in the old generation (FGCT), and the total time spent performing garbage collection (GCT).

If you prefer to view garbage collection events as percentages, you can use the “-gcutil” option:

$ jstat -gcutil -t -h5 `pgrep java` 5000

Timestamp         S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT   
        99814.1   0.00  99.99  18.08  63.77  39.32 168551 2427.512 32800  111.800 2539.313
        99819.1  99.96   0.00  66.29  78.18  39.32 168562 2427.649 32800  111.800 2539.449
        99824.1 100.00   0.00  94.40  62.46  39.32 168572 2427.795 32803  111.815 2539.610
        99829.2 100.00   0.00  60.25  65.08  39.32 168580 2427.888 32806  111.824 2539.713



The output above contains the utilization of each survivor space as a percentage of the total survivor space capacity (S0 && S1), the utilization of eden as a percentage of the total eden capacity (E), the utilization of the tenured generation as a percentage of the total tenured generation capacity (O), the utilization of the permanent generation as a percentage of the total permanent generation capacity (P), the total number of young generation garbage collection events (YGC), the total time spent collecting objects in the young generation (YGCT), the total number of old generation garbage collection events (FGC), the total amount of time spent collecting objects in the old generation (FGCT), and the total garbage collection time (GCT).

To get the time spent in garbage collection along with the reason the collection occurred, jstat can be run with the “-gccause” option:

$ jstat -gccause -t `pgrep java` 1000

Timestamp         S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT    LGCC                 GCC                 
       100157.3  99.96   0.00  66.27  63.82  39.32 169160 2435.394 32925  112.202 2547.595 CMS Initial Mark     No GC               
       100158.3   0.00  99.99  32.14  67.72  39.32 169163 2435.430 32925  112.202 2547.631 unknown GCCause      No GC               
       100159.3   0.00  99.97  50.22  65.10  39.32 169165 2435.454 32927  112.208 2547.662 CMS Initial Mark     No GC               
       100160.3  99.97   0.00   6.02  62.46  39.32 169168 2435.493 32928  112.211 2547.704 unknown GCCause      No GC               
       100161.3  99.97   0.00  32.14  62.46  39.32 169168 2435.493 32928  112.211 2547.704 unknown GCCause      No GC      



There are also options to print class loader activity and HotSpot compiler statistics, and to break down utilization by generation (this is extremely useful when you’re trying to profile a specific memory pool). There are a number of incredibly useful open source tools for visualizing garbage collection data, and I hope to talk about these in the near future.
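For example, the class loader, compiler, and per-generation statistics can be retrieved with invocations along these lines (these are standard jstat options; the interval and pgrep usage simply mirror the examples above):

$ jstat -class -t `pgrep java` 5000
$ jstat -compiler `pgrep java`
$ jstat -gcnew -t `pgrep java` 5000
$ jstat -gcold -t `pgrep java` 5000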