The apache web server has become the predominant web server on the Internet for it’s scalability, standards compliance and the numerous features that come bundled with the server. As more and more features are added to apache, and as web applications evolve to meet new demands, bugs will periodically surface in applications and the web server code base itself. Since application and server bugs can lead to service failures and downtime, it is important to have a good set of tools to assist with isolating and locating problems. This article provides an introduction to debugging apache web server problems, and covers several tools and techniques that can help troubleshoot problems when they arise.

Isolating problems with apache’s single process mode

When debugging web server problems, isolating a problem is the first step in troubleshooting an issue. Since a typical apache installation runs with several processes, and in the case of the worker and event MPMs multiple threads per process, starting apache as a single process can simplify troubleshooting. There are two ways to start apache as a single process. The first method requires editing the number of processes and threads in the apache servers MPM configuration stanza. An alternative method is to use apache’s “-X” option to start the server in single process mode:

$ httpd -X

When single process mode is used, apache will not fork new processes or disassociate from the terminal. This ensures that all communications flow through one process, which allows troubleshooting procedures (e.g., attaching to the httpd process with gdb) to be used on a single process instead of multiple processes.

Debugging configuration file problems

For some websites, the apache configuration file can grow quite complex over time. When errors arise with the configuration, being able to quickly pinpoint the source of an error can reduce troubleshooting time. Apache has two methods for troubleshooting configuration problems. The first method is the httpd “-t” option, which can be used to check the apache configuration file for syntax errors:

$ httpd -t -c httpd.conf

Syntax error on line 236 of /etc/httpd/httpd.conf: LogLevel requires level keyword: one of emerg/alert/crit/error/warn/notice/info/debug

If an error is present, the configuration check option will print the line that contains the error, and a description of available options if it’s able to detect a valid configuration directive. If apache fails to start and the configuration check option reports a valid configuration file, the apache debug log level can be used to have apache log additional data to the error_log.

To enable the debug log level, the “LogLevel” directive can be set to “debug” in the apache configuration file.

Debugging script execution problems

The Apache mod_cgi module allows scripts to be executed during the request processing stage. CGI script can be written in numerous scripting languages, and allow content to be dynamically created for clients. When errors occur with a script executing inside mod_cgi, determining why the script failed to run can pose a number of challenges.

One way to debug script execution problems is with the mod_cgi “ScriptLog” directive. When ScriptLog is enabled, mod_cgi will log the output from
each CGI script that did not execute properly. This output contains the server response code, the request that was received, the response that was sent to the client (if anything was sent at all), and is a great debugging tool when script errors are produced in the error_log, or when a user complains that something isn’t working properly.

To enable CGI script error logging, the “ScriptLog” directive and the location of a log file to write errors to can be added to the apache configuration file. Once ScriptLog is enabled, mod_cgi will produce log file entries similar to the following each time a script fails to execute properly:

%% [Tue Dec 26 14:47:21 2006] GET /cgi-bin/print HTTP/1.1
%% 500 /var/tmp/apache/cgi-bin/print
%request
Accept: */*
Accept-Language: en
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/418.9.1 (KHTML, like Gecko) Safari/419.3
Connection: keep-alive
Host: 192.168.1.3:8080
%response
Script /cgi-bin/print
%stdout
Script Location /var/tmp/apache/cgi-bin/print

For busy websites and are concerned with how much storage the log file may consume, the “ScriptLogLength” directive can be used to enforce an upper bound on the size of the log.

Debugging requests and responses

When supporting web applications, there are times when a problem can be caused by an application server, proxy server or the web server itself. To troubleshoot these types of issues, it is useful to dump the HTTP requests and responses to isolate the problem to the local system, or a remote system. The apache web server module mod_dumpio can be used for this purpose, since it allows HTTP requests and/or HTTP responses to be written to the error_log.

Setting up mod_dumpio is simple and straight forward. To dump HTTP requests to the error_log, the “DumpIOInput” directive can be set to “On.” To dump HTTP responses to the error_log, the “DumpIOOutput” directive can be set to “On.” Once these directives are enabled, entries similar to the following will be written to the error_log each time a request is received, or a response is sent:

[Sat Oct 28 10:52:31 2006] [debug] mod_dumpio.c(103): mod_dumpio: dumpio_in [getline-blocking] 0 readbytes
[Sat Oct 28 10:52:31 2006] [debug] mod_dumpio.c(51): mod_dumpio:  dumpio_in (data-HEAP): 16 bytes
[Sat Oct 28 10:52:31 2006] [debug] mod_dumpio.c(67): mod_dumpio:  dumpio_in (data-HEAP): GET / HTTP/1.1rn
                 ..........

Mod_dumpio adds some additional overhead to request processing, and the configuration directives will require a server restart (you can use a graceful restart to avoid killing active connections).

Determing why an apache process hung

Apache undergoes a fair amount of testing prior to each release, but occasionally bugs can sneak through the QE process. These bugs can appear in the form of web server hangs, hard crashes and malformed content being returned to clients. When these issues appear, it is important to be able to quickly identify the source (e.g., module, core server, etc) of the problem.

One way to identify the source of a hang is by viewing a stack backtrace of the hung process. The stack backtrace will contain a list of functions that have been invoked to get to the currently executing function. Since the hang condition was most likely triggered by the currently executing function, the function name can be used as a search string in bug databases, and as the starting point in analyzing source code for problems.

There are several tools that can be used to get a stack backtrace. One tool is pstack, which prints a stack backtrace for the process id passed as an argument:

$ pstack 18756

18756:  bin/httpd -k start
 ff040628 accept   (3, 11c560, 11c54c, 1)
 0004c3c4 unixd_accept (ffbff904, 7d490, 11c3a0, 0, 2710, 0) + 10
 0004a3c0 child_main (7d490, 74400, 4e2e, 74000, 0, 74000) + 2ec
 0004a6c8 make_child (4a000, 0, 1, 5, 72c00, 74000) + ec
 0004b0e8 ap_mpm_run (72c00, 74000, 74000, 74000, 74000, 74400) + 934
 000272d8 main     (7ef18, 71c00, 73800, 73800, 0, 0) + 710
 00026618 _start   (0, 0, 0, 0, 0, 0) + 5c

On platforms that don’t come with the pstack utility, the gdb and gcore utilities can be used to get a stack backtrace from a process. To attach directly to a process with gdb and retrieve a stack backtrace, the gdb utility can be run with the “-p” option and a process identifier, and the “backtrace” command can be run in the gdb shell:

$ gdb -q -p 2575

(gdb) backtrace
#0  0x0046e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x0063b681 in accept () from /lib/tls/libpthread.so.0
#2  0x00b14814 in apr_socket_accept (new=0xbff85740, sock=0x9671538,
    connection_context=0x97115d8) at network_io/unix/sockets.c:187
#3  0x080819ce in unixd_accept (accepted=0xbff85774, lr=0x9671518, ptrans=0x97115d8) at unixd.c:466
#4  0x0807fd2e in child_main (child_num_arg=Variable "child_num_arg" is not available.) at prefork.c:621
#5  0x0807ffc2 in make_child (s=Variable "s" is not available.) at prefork.c:736
#6  0x08080050 in startup_children (number_to_start=5) at prefork.c:754
#7  0x0808089b in ap_mpm_run (_pconf=0x96730a8, plog=0x96a1160, s=0x9674f48) at prefork.c:975
#8  0x08061b08 in main (argc=3, argv=0xbff85a84) at main.c:717

Each stack backtrace will contain several pieces of useful debugging information, but finding the reason the server hung typically requires reviewing each stack frame that led to the current frame, and selectively dumping server data structures. Debugging in this manner takes time, and may not be appropriate for sites that require constant availability. To assist with root cause analysis and offline debugging, the gcore utility can be used to generate a core file from a hung web server process. This core file can be analyzed by system administrators or operating system vendors to determine why the web server process hung.

The following example shows how to use the gcore utility to force a hung process to dump core, and then shows how the gdb utility can be used to retrieve a stack backtrace from the core file: $ gcore 5649

$ gdb -q /usr/sbin/httpd core.5649

(gdb) backtrace
#0  0x0046e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x0063b681 in accept () from /lib/tls/libpthread.so.0
#2  0x00b14814 in apr_socket_accept (new=0xbff85740, sock=0x9671538, 
    connection_context=0x97115d8) at network_io/unix/sockets.c:187
#3  0x080819ce in unixd_accept (accepted=0xbff85774, lr=0x9671518, ptrans=0x97115d8) at unixd.c:466
#4  0x0807fd2e in child_main (child_num_arg=Variable "child_num_arg" is not available.) at prefork.c:621
#5  0x0807ffc2 in make_child (s=Variable "s" is not available.) at prefork.c:736
#6  0x08080050 in startup_children (number_to_start=5) at prefork.c:754
#7  0x0808089b in ap_mpm_run (_pconf=0x96730a8, plog=0x96a1160, s=0x9674f48) at prefork.c:975
#8  0x08061b08 in main (argc=3, argv=0xbff85a84) at main.c:717

In the pstack, gdb and gcore examples above, we can see that apache was in the accept() system call when a SIGSEGV signal was received, and accept() was called by the portable runtime method apr_socket_accept(). If a problem was present in the server, these methods could be fed into the apache bug database to see if the problem is caused by a well known issue. For further details on advanced debugging techniques, please see the references.

Determing why an apache process crashed

If apache faults due to an alignment or segmentation violation, tools such as gcore and pstack aren’t of much use, since the process will be forced to exit. Prior to the process exiting, a core file will typically be written to the directory that is defined in the ServerRoot directive. In order for the core file to be written to this directory, the user id that the httpd process runs as will need to be have permission to write to this directory, and the “core file size” resource limit will need to be set to a value large enough to allow a core file to be created. If security policy prohibits allowing the user to write directly to the ServerRoot, the “CoreDumpDirectory” directive can be used to control where core files are written.

On some platforms this is still insufficient, since core files are not generated reliably. If you are experiencing server problems on once of these platforms, the mod_backtrace module can be used to supplement the platform’s inability to reliably create core files. Mod_backtrace works by enabling a signal handler for each non-recoverable signal (e.g., SIGSEGV, SIGBUS, etc.). When one of the non-recoverable signals is received, mod-backtrace will call a platform specific library call to generate a stack trace for the process, and then write this stack trace to the error_log.

Mod_backtrace is available as source code, and can retrieved from the apache contributors website:

$ wget http://people.apache.org/~trawick/mod_backtrace.c

To compile the source code and install the module into the apache modules directory, the apache apxs utility can be be run with the “-c” option to compile the module, and the “-i” option to install the module:

$ apxs -ci mod_backtrace.c

In order to use mod_backtrace, the apache web server will need to be built with the exception hook. This can be enabled at configure time by adding the “–enable-exception-hook” to the configure command line. Once the module is compiled, installed and the exception hook enabled, the module can be loaded with the “LoadModule” directive, and the “EnableExceptionHook” directive can be set to “On” to enable mod_backtrace. For reference, here are the apache configuration file entries needed to enable mod_backtrace:

LoadModule backtrace_module modules/mod_backtrace.so 
EnableExceptionHook On

After the module is enabled, an entry similar to the following will be written to the error_log each time apache receives a critical signal:

$ kill -SIGSEGV 18745

[Wed Dec 20 16:48:32 2006] pid 18745 mod_backtrace backtrace for sig 11 (thread "pid" 18745)
[Wed Dec 20 16:48:32 2006] pid 18745 mod_backtrace main() is at 26bc8
/var/tmp/apache/modules/mod-backtrace.so:bt_exception_hook+0x108
/var/tmp/apache/bin/httpd:ap_run_fatal_exception+0x34
/var/tmp/apache/bin/httpd:0x32788
/lib/libc.so.1:0xc01dc
/lib/libc.so.1:0xb52d4
/lib/libc.so.1:_so_accept+0x8 [ Signal 11 (SEGV)]
/var/tmp/apache/bin/httpd:unixd_accept+0x10
/var/tmp/apache/bin/httpd:0x3a3c0
/var/tmp/apache/bin/httpd:0x3a6c8
/var/tmp/apache/bin/httpd:0x3a798
/var/tmp/apache/bin/httpd:ap_mpm_run+0x9d4
/var/tmp/apache/bin/httpd:main+0x710
/var/tmp/apache/bin/httpd:_start+0x5c
[Wed Dec 20 16:48:32 2006] pid 18745 mod_backtrace end of backtrace

Mod_backtrace uses the backtrace() routine on GNU systems, and the printstack() method on Solaris systems to programatically retrieve a backtrace. In some cases symbol names may not be resolved correctly. If this happens, the nm, readelf and elfdump utilities can be used to supplement the information provided by mod_backtrace. To use mod_backtrace on Solaris hosts, you will need to apply a patch to the mod_backtrace source code. A link to this patch is provided in the reference section of this article.

Debugging with maintainer mode and the apache GDB macros

Debugging application failures through core files is a fine art, and one that takes a lot of persistence and knowledge of the application that is being debugged. The apache developers realized that good tools are required to debug apache crashes, and integrated two key features into apache to aide debugging. The first feature is the apache maintainer mode, which will build the web server with debugging symbols, and without any optimizations. This is useful when using a source debugger with the web server, since functions will not be optimized out, and the C source code can be displayed in the debugger while debugging problems. To enable maintainer mode, the “–enable-maintainer-mode” flag can be passed to Apache during the configure stage of the build process.

The second debugging feature is a set of GDB macros. These macros allow you to easily view the contents of complex apache data structures (e.g., filter chains, buckets and brigades, etc.), which is useful for performing post mortem debugging from a core file. The macros are distributed in a file named .gdbinit, which is located in the top level directory of the apache source code. If you are interested in learning more about these macros, please see the references for two presentations that describe their use.

Using the apache source code to locate problems

When log files, core files, and system utilities are unable to pinpoint a problem with the apache web server, turning to the web server source code can be helpful. The web server source code is organized as a hierarchy of directories, and each directory contains the source files for a specific facility (e.g., modules, portable runtime, etc.). As of the 2.2.3 release, the source code is divided up into the following directories:

$SRCROOT/server - contains the source code for the core server
$SRCROOT/include - contains the header files for the core server
$SRCROOT/srclib/apr - contains the the source code for the portable runtime environment
$SRCROOT/srclib/apr-util - contains the  source code for the portable runtime utility code
$SRCROOT/support - contains the source code for the utilities
$SRCROOT/modules - contains the source code for the modules

If a stacktrace or debugging session shows that a specific routine is at fault, the find command can be used to locate the C code that comprises that function (or macro). To find the definition of the portable runtime function apr_array_make, the find utility can be used to locate the header file that contains the function prototype:

$ find . -name \*.h | xargs egrep "apr_array_make"

./srclib/apr/include/apr_tables.h:APR_DECLARE(apr_array_header_t *) apr_array_make(apr_pool_t *p, ...
                   ........

Once the function prototype is located, the find and grep utilities can be used to to locate the source file that contains the function definition:

$ find . -name \*.c | xargs egrep "apr_array_header_t.*apr_array_make"

/srclib/apr/tables/apr_tables.c:APR_DECLARE(apr_array_header_t *) apr_array_make(apr_pool_t *p, ...
                   .........

In the output above, the apr_array_make function prototype is located in the file apr_tables.h, and the function definition is located in the file apr_tables.c. Depending on how functions (and macros) are defined, the results may require some additional visual processing.

Conclusion

As web server administrators, we try to take the necessary steps to ensure that our servers are available 24x7. This is not always possible, and having good tools available to troubleshoot problems is a necessity. This article discussed a couple of tools that can assist with debugging, and described a few techniques that can be used to locate problems with servers running the apache web server

References

The following references were used while writing this article:

Acknowledgements

Ryan would like to thank the apache developers for their contributions to the worlds most popular web server.

Originally published in the July ‘07 issue of SysAdmin Magazine