Verifying web server content with checksums

While setting up monit to monitor several services I support, I decided to look for an in-depth HTTP monitoring solution to complement the monitoring capabilities provided by monit. To be more exact, I wanted a monitoring solution that would validate the content returned by a web server. Several monitoring solutions (including monit) will issue a GET request to a web server and check that the server replied with a 200 OK status code. This works for most situations, but it doesn’t detect content deployment snafus or server misconfigurations (the ones that don’t generate 500 status codes). I couldn’t find an open source software package that provided this level of in-depth monitoring, so I decided to write content-check.

Content-check is written in Bourne shell, and provides in-depth HTTP monitoring by comparing a saved SHA1 hash with a SHA1 hash generated from the content returned by a web server. If the two hashes don’t match, content-check will generate a syslog entry (which can be picked up by monit) with the logger utility, and E-mail the website administrator to let them know that the content did not hash to a known value.

To configure content-check, you first need to generate a hash for the webpage you want to monitor. This can be accomplished by passing an absolute URL to content-check’s “-g” (generate hash) option:

$ content-check -g
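To give a rough idea of what generating a hash involves, here is a minimal sketch in Bourne shell. It is not content-check’s actual code: the function name sha1_of_stdin is made up for illustration, and it assumes one of sha1sum, shasum, or openssl is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of content-check): print the SHA-1 hex
# digest of whatever arrives on standard input.
sha1_of_stdin() {
    if command -v sha1sum >/dev/null 2>&1; then
        # GNU coreutils prints "<digest>  -"; keep the first field.
        sha1sum | awk '{ print $1 }'
    elif command -v shasum >/dev/null 2>&1; then
        # Perl's shasum (common on macOS) uses the same output layout.
        shasum -a 1 | awk '{ print $1 }'
    else
        # openssl prints "...(stdin)= <digest>"; keep the last field.
        openssl dgst -sha1 | awk '{ print $NF }'
    fi
}
```

Assuming curl is used to fetch the page (an assumption, and http://www.example.com/ is only a placeholder), the hash for a page could then be produced with something like `curl -fsS http://www.example.com/ | sha1_of_stdin`.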

After you generate the hash, you will need to place the hash and the absolute URL to monitor in a text file. This file can contain multiple site / hash values, but only one site / hash pair is allowed per line. Once the file is populated with one or more sites to monitor, content-check can be invoked with the “-f” option and the file that contains the list of sites to monitor:

$ cat sites
da39a3ee5e6b4b0d3255bfef95601890afd80709
da39a3ee5e6b4b0d3255bfef95601890afd80709

$ content-check -f sites
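The loop behind a file-driven check might look roughly like the sketch below. It rests on several assumptions: each line is laid out as a URL followed by its hash (the real field order may differ), pages are fetched with curl, and hashes come from sha1sum. The names fetch, sha1_of_stdin and check_sites are hypothetical and are not taken from content-check.

```shell
#!/bin/sh
# Hypothetical fetcher: write the body of a URL to stdout.
# Using curl here is an assumption.
fetch() {
    curl -fsS "$1"
}

# Hypothetical hasher: SHA-1 hex digest of stdin (assumes sha1sum exists).
sha1_of_stdin() {
    sha1sum | awk '{ print $1 }'
}

# Read "<url> <sha1>" pairs, one per line (assumed layout), and report
# each site as OK or MISMATCH. Returns non-zero if any site mismatched.
check_sites() {
    rc=0
    while read -r url expected; do
        # Skip blank lines and lines starting with '#'.
        case $url in ''|'#'*) continue ;; esac
        # </dev/null keeps the fetcher from consuming the site list.
        # A failed fetch yields empty input, which hashes to
        # da39a3ee5e6b4b0d3255bfef95601890afd80709 and is normally
        # reported as a mismatch.
        actual=$(fetch "$url" </dev/null | sha1_of_stdin)
        if [ "$actual" = "$expected" ]; then
            echo "OK $url"
        else
            echo "MISMATCH $url"
            rc=1
        fi
    done < "$1"
    return $rc
}
```

With these definitions, `check_sites sites` would print one line per site and exit non-zero when any page no longer hashes to its stored value.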

If one of the sites listed in the file doesn’t hash to the value stored in the file, an E-mail is sent to the address passed to the “-e” option (or root), and a syslog message similar to the following is generated:

Jul 21 16:27:01 neutron matty: [ID 702911 daemon.notice] Content from \ did not hash to da39a3ee5e6b4b0d3255bfef95601890afd80709
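For illustration, the notification step could be sketched as follows. The message wording mirrors the syslog line above and the default recipient of root comes from the description of “-e”. Everything else is an assumption: the function names, the content-check syslog tag, the daemon.notice priority, the mail subject, and the use of mailx to send the E-mail.

```shell
#!/bin/sh
# Hypothetical helper: build the alert text for a URL and the hash it
# was expected to produce.
alert_message() {
    printf 'Content from %s did not hash to %s\n' "$1" "$2"
}

# Hypothetical helper: write the alert to syslog and mail it.
# $1 = URL, $2 = expected hash, $3 = recipient (defaults to root).
notify() {
    url=$1
    expected=$2
    rcpt=${3:-root}
    msg=$(alert_message "$url" "$expected")
    # Tag and priority are assumptions; adjust them to whatever your
    # syslog setup (and monit) is watching for.
    logger -p daemon.notice -t content-check "$msg"
    # mailx is an assumption; any MTA front end that reads the body
    # from stdin would work the same way.
    printf '%s\n' "$msg" | mailx -s "content-check: $url failed" "$rcpt"
}
```

A caller would invoke `notify "$url" "$expected" "$address"` whenever a comparison fails, which keeps the logging and mailing policy in one place.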

Since web servers can break in ways that still allow them to serve content, validating the content they return is the surest way to know that your site is serving what you expect.

2 thoughts on “Verifying web server content with checksums”

  1. Hi Matty – very nice idea. I have encountered a few problems though – and my Perl skills aren’t what they used to be:

    Checksum for is 12f9acee2158cb50afb064c3affb49f522d2f0a4 (optional header: )
    Site failed checksum:
    Current checksum: 12f9acee2158cb50afb064c3affb49f522d2f0a4:
    Precomputed checksum: 12f9acee2158cb50afb064c3affb49f522d2f0a4s
    sendmail: RCPT TO: (501 : recipient address must contain a domain)
    ERROR: Unable to open logfile “/var/log/content-check”

    Hmm my email does contain a domain … is it maybe limited to .com and .net ?
    And the log file is also there and chmodded …

    This is under a Gentoo Linux system.

    Thanks a lot !

  2. Duh – so I did not forget everything.

    I blindly copied the example:

    # # This is a comment
    # [site]
    # url =
    # checksum = ac3feb8bcbffff321f7db227a1c11dc7794e2fa70
    # header = “Host:”
    # syslog = “yes”
    # email = “”
    # logfile = “/var/log/”

    the ” are the problem.

    Should be like that:

    # # This is a comment
    # [site]
    # url =
    # checksum = ac3feb8bcbffff321f7db227a1c11dc7794e2fa70
    # header = Host:
    # syslog = yes
    # email =
    # logfile = /var/log/

    And everything works.
