Gathering statistics from your Postfix mail logs

I have been supporting Postfix mail relays for several years. When you are tasked with keeping your SMTP relays running without issue, you need to stay on top of the logs and the current operational state of each server so you can catch problems before they become major ones. Problems can take the form of blacklisted relays, mail routing loops, spam attacks, or people trying to exploit a known issue in the SMTP software. All of these suck for a site that relies on mail to function.

To help me keep on top of the relays I support, I use a multi-faceted approach. I summarize the log files daily, create scripts to notify me of problems before they cause issues, and try to be proactive about my relays (as much as you can be). To summarize my logs I use a combination of custom-developed scripts and the incredibly awesome pflogsumm Perl script. The output of the scripts I developed provides me with specific data, and pflogsumm produces a beautiful report that summarizes the overall operation of my Postfix SMTP relays:

$ zcat /var/log/maillog-20110710.gz | pflogsumm.pl

Grand Totals
------------
messages

  19004   received
  20824   delivered
    696   forwarded
      3   deferred  (3  deferrals)
      0   bounced
      0   rejected (0%)
      0   reject warnings
      0   held
      0   discarded (0%)

  57945k  bytes received
 139101k  bytes delivered
     66   senders
     28   sending hosts/domains
    175   recipients
     31   recipient hosts/domains


Per-Day Traffic Summary
    date          received  delivered   deferred    bounced     rejected
    --------------------------------------------------------------------
    Jul  3 2011      2006       1875 
    Jul  4 2011      2212       1992          2 
    Jul  5 2011      2855       3568 
    Jul  6 2011      3212       3823 
    Jul  7 2011      3212       3962          1 
    Jul  8 2011      2581       2747 
    Jul  9 2011      2665       2654 
    Jul 10 2011       261        203 

Per-Hour Traffic Daily Average
    time          received  delivered   deferred    bounced     rejected
    --------------------------------------------------------------------
    0000-0100          78         51          0          0          0 
    0100-0200          74         51          0          0          0 
    0200-0300          99         87          0          0          0 
    0300-0400          73         52          0          0          0 
    0400-0500          73         53          0          0          0 
    0500-0600          76         68          0          0          0 
    0600-0700         110        129          0          0          0 
    0700-0800         106        134          0          0          0 
    0800-0900          84         70          0          0          0 
    0900-1000          92        103          0          0          0 
    1000-1100         103        106          0          0          0 
    1100-1200         102        103          0          0          0 
    1200-1300          95         93          0          0          0 
    1300-1400         103        116          0          0          0 
    1400-1500         108        124          0          0          0 
    1500-1600         112        142          0          0          0 
    1600-1700         114        129          0          0          0 
    1700-1800         111        125          0          0          0 
    1800-1900         135        176          0          0          0 
    1900-2000         144        194          0          0          0 
    2000-2100         134        224          0          0          0 
    2100-2200          88        101          0          0          0 
    2200-2300          82         85          0          0          0 
    2300-2400          82         88          0          0          0 

Host/Domain Summary: Message Delivery 
 sent cnt  bytes   defers   avg dly max dly host/domain
 -------- -------  -------  ------- ------- -----------
  10247    81716k       1     0.5 s    5.5 m  prefetch.net
    .....

Host/Domain Summary: Messages Received 
 msg cnt   bytes   host/domain
 -------- -------  -----------
   3094    22088k  server1.prefetch.net
     .....

Senders by message count
------------------------
   2288   badperson1@prefetch.net
   1630   badperson2@prefetch.net
     .....

Recipients by message count
---------------------------
   4415   donotspam1@prefetch.net
   3046   donotspam2@prefetch.net
     .....

Senders by message size
-----------------------
  15786k  badperson1@prefetch.net
  13414k  badperson2@prefetch.net
     .....

Recipients by message size
--------------------------
  29981k  donotspam1@prefetch.net
  25492k  donotspam2@prefetch.net
     .....

Messages with no size data
--------------------------
 D07FA650  fubar@prefetch.net
     .....

message deferral detail
-----------------------
  smtp (total: 3)
         2   25: Connection timed out
         1   lost connection with prefetch.net[1.1.1.1...

message bounce detail (by relay): none

message reject detail: none

message reject warning detail: none

message hold detail: none

message discard detail: none

smtp delivery failures: none

Warnings
--------
  smtpd (total: 14)
        14   non-SMTP command from unknown[1.1.1.1]: To: 
  trivial-rewrite (total: 430)
       430   database /etc/postfix/transport.db is older than source file /e...

Fatal Errors: none

Panics: none

Master daemon messages: none

I can’t tell you how many times I’ve run this script and seen something in the output that left me wondering WTF?!? Several times I’ve been able to use the output from this script to nip a problem in the bud before it became a major issue. The output is also handy for developing additional monitoring for your SMTP services. If you are using Postfix, you should definitely check out pflogsumm. You will be glad you did!
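
As one example of the kind of additional monitoring the report makes possible, here is a minimal Python sketch that pulls the message counters out of the Grand Totals block and flags any that exceed a threshold. It is not part of pflogsumm and it is not one of the custom scripts mentioned earlier. The function names and the numbers in LIMITS are hypothetical, and the parsing assumes the report layout shown above (a "Grand Totals" heading followed by a section whose heading starts with "Per-").

```python
import re

# Matches counter lines such as "      3   deferred  (3  deferrals)".
# Lines with a size suffix (e.g. "57945k  bytes received") are skipped
# on purpose, because the digits are not followed by whitespace.
COUNT_LINE = re.compile(r"^\s*(\d+)\s+([a-z][a-z/ ]*[a-z])\s*(?:\(|$)")

# Hypothetical thresholds; pick values that make sense for your relays.
LIMITS = {"deferred": 2, "bounced": 0, "rejected": 0}

# A trimmed copy of the Grand Totals block from the report above.
SAMPLE = """
Grand Totals
------------
messages

  19004   received
  20824   delivered
    696   forwarded
      3   deferred  (3  deferrals)
      0   bounced
      0   rejected (0%)
      0   reject warnings
      0   held
      0   discarded (0%)

  57945k  bytes received
 139101k  bytes delivered
     66   senders
     28   sending hosts/domains

Per-Day Traffic Summary
"""


def parse_grand_totals(report):
    """Return {label: count} for the counters in the Grand Totals block."""
    totals = {}
    in_section = False
    for line in report.splitlines():
        if line.strip() == "Grand Totals":
            in_section = True
            continue
        if not in_section:
            continue
        # Assumption: the section after Grand Totals starts with "Per-".
        if line.startswith("Per-"):
            break
        match = COUNT_LINE.match(line)
        if match:
            totals[match.group(2)] = int(match.group(1))
    return totals


def find_problems(totals, limits):
    """Return one message per counter that is above its limit."""
    return [
        f"{name}: {totals[name]} exceeds limit of {limit}"
        for name, limit in limits.items()
        if totals.get(name, 0) > limit
    ]


for problem in find_problems(parse_grand_totals(SAMPLE), LIMITS):
    print(problem)
```

To run this against a live report, you could replace SAMPLE with sys.stdin.read() and pipe the pflogsumm output into the script, then have cron mail you whatever it prints.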
