Best way to learn how applications work? DTrace!

I have been spending a good bit of time trying to understand how Apache works, and started to wonder which methods could be used to understand how large software packages work. After pondering this for a bit, it dawned on me that DTrace’s flowindent option and its ustack() / stack() actions are ideal for reverse engineering software. To illustrate what I am talking about, say you want to watch the call flow between the Apache ap_run_create_connection() and ap_run_process_connection() hooks. You can read the source code to put together a call flow diagram, or you can run the following DTrace script:

$ cat apacheflow.d

#pragma D option flowindent

pid$target::ap_run_create_connection:entry
{
   self->follow = 1;
}

pid$target::ap_run_process_connection:return
{
   self->follow = 0;
}

pid$target:::entry,
pid$target:::return
/ self->follow /
{}

$ dtrace -p `pgrep httpd` -s apacheflow.d

dtrace: script 'apacheflow.d' matched 16943 probes

CPU FUNCTION
  0  -> ap_run_create_connection
  0    -> core_create_conn
  0      -> apr_palloc
  0      <- apr_palloc
  0      -> memset
  0      <- memset
  0      -> ap_update_child_status
  0      <- ap_update_child_status
  0      -> ap_update_child_status_from_indexes
  0      <- ap_update_child_status_from_indexes

  0      -> ap_create_conn_config
  0      <- ap_create_conn_config
  0      -> create_empty_config
  0        -> apr_palloc
  0        <- apr_palloc
  0        -> memset
  0        <- memset
  0      <- create_empty_config
[ ..... ]
  0            <- ap_rgetline_core
  0            -> apr_time_now
  0              -> gettimeofday
  0              <- gettimeofday
  0            <- apr_time_now
  0            -> apr_brigade_destroy
  0              -> apr_pool_cleanup_kill
  0              <- apr_pool_cleanup_kill
  0            <- apr_brigade_destroy
  0            -> apr_brigade_cleanup
  0            <- apr_brigade_cleanup
  0          <- ap_read_request
  0        <- ap_process_http_connection
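
flowindent shows what a function goes on to call. The ustack() action mentioned earlier answers the opposite question: who called a given function? As a minimal sketch (the script name whocalls.d and the choice of apr_palloc as the target are mine, picked only because apr_palloc shows up in the trace above), something like this should tally the callers of apr_palloc by user stack:

```d
/* whocalls.d: hypothetical companion to apacheflow.d */

pid$target::apr_palloc:entry
{
   /* Key the aggregation on the five innermost user stack frames. */
   @callers[ustack(5)] = count();
}
```

Run it the same way as apacheflow.d (dtrace -p with a single httpd PID, then -s whocalls.d) and press Ctrl-C. DTrace prints each distinct stack with its count when it exits.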

I have found flowindent traces useful for watching specific call paths, and for understanding how those paths affect the rest of the system. Since you can also instrument each and every instruction in a function*, Clay thinks this could be useful for seeing which branches are taken while reading the output of dis(1).
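
To make the per-instruction idea concrete: leaving the probe name blank in a pid provider probe description matches entry, return, and every instruction offset in the function. The sketch below assumes ap_read_request (taken from the trace above) is the function of interest, and instr.d is a hypothetical script name:

```d
/* instr.d: hypothetical per-instruction trace of one function */

#pragma D option quiet

/* Blank probe name: entry, return, and one probe per instruction offset. */
pid$target::ap_read_request:
{
   /* probename is "entry", "return", or a hexadecimal offset. */
   printf("%s:%s\n", probefunc, probename);
}
```

Each instruction probe is named after its hexadecimal offset into the function, so the printed offsets can be lined up by hand against the addresses in the dis(1) output. Offsets that never show up belong to code that the traced thread did not execute. This enables one probe per instruction, so it is probably best kept to a single function at a time.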

* Unfortunately, you cannot view the original instruction that was executed, but it sounds like Adam Leventhal is working on a fix for this! When his fix is put back into Solaris, this will be sweeeeeet!
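
One way to look at the system impact of a call path is to reuse the self->follow flag from apacheflow.d and count system calls instead of printing function flow. This is a sketch under my own assumptions (the script name apachesys.d and the aggregation name @calls are made up for illustration), not something taken from the trace above:

```d
/* apachesys.d: hypothetical variant of apacheflow.d that counts syscalls */

#pragma D option quiet

pid$target::ap_run_create_connection:entry
{
   self->follow = 1;
}

pid$target::ap_run_process_connection:return
{
   self->follow = 0;
}

/* Only count system calls made by a thread that is between the two hooks. */
syscall:::entry
/ self->follow /
{
   @calls[probefunc] = count();
}
```

When dtrace exits, it prints one line per system call name along with the number of times it was made between the two hooks.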
