Gathering statistics from your Postfix mail logs


I have been supporting Postfix mail relays for several years. When you are tasked with ensuring that your SMTP relays work without issue, you need to stay on top of the logs and the current operational state of each server to catch major problems before they occur. Problems can take the form of blacklisted relays, mail routing loops, spam attacks, or people trying to exploit a known issue in the SMTP software. All of these suck for a site that relies on mail to function.

To help me keep on top of the relays I support, I use a multi-faceted approach. I summarize the log files daily, create scripts to notify me of problems before they cause issues, and try to be proactive about my relays (as much as you can be). To summarize my logs I use a combination of custom-developed scripts and the incredibly awesome pflogsumm Perl script. My own scripts give me specific data, and pflogsumm produces a beautiful report that summarizes the overall operation of my Postfix SMTP relays:

$ zcat /var/log/maillog-20110710.gz | pflogsumm.pl

Grand Totals
------------
messages

19004 received
20824 delivered
696 forwarded
3 deferred (3 deferrals)
0 bounced
0 rejected (0%)
0 reject warnings
0 held
0 discarded (0%)

57945k bytes received
139101k bytes delivered
66 senders
28 sending hosts/domains
175 recipients
31 recipient hosts/domains


Per-Day Traffic Summary
date          received  delivered   deferred    bounced   rejected
------------------------------------------------------------------
Jul  3 2011       2006       1875
Jul  4 2011       2212       1992          2
Jul  5 2011       2855       3568
Jul  6 2011       3212       3823
Jul  7 2011       3212       3962          1
Jul  8 2011       2581       2747
Jul  9 2011       2665       2654
Jul 10 2011        261        203

Per-Hour Traffic Daily Average
time received delivered deferred bounced rejected
--------------------------------------------------------------------
0000-0100 78 51 0 0 0
0100-0200 74 51 0 0 0
0200-0300 99 87 0 0 0
0300-0400 73 52 0 0 0
0400-0500 73 53 0 0 0
0500-0600 76 68 0 0 0
0600-0700 110 129 0 0 0
0700-0800 106 134 0 0 0
0800-0900 84 70 0 0 0
0900-1000 92 103 0 0 0
1000-1100 103 106 0 0 0
1100-1200 102 103 0 0 0
1200-1300 95 93 0 0 0
1300-1400 103 116 0 0 0
1400-1500 108 124 0 0 0
1500-1600 112 142 0 0 0
1600-1700 114 129 0 0 0
1700-1800 111 125 0 0 0
1800-1900 135 176 0 0 0
1900-2000 144 194 0 0 0
2000-2100 134 224 0 0 0
2100-2200 88 101 0 0 0
2200-2300 82 85 0 0 0
2300-2400 82 88 0 0 0

Host/Domain Summary: Message Delivery
sent cnt bytes defers avg dly max dly host/domain
-------- ------- ------- ------- ------- -----------
10247 81716k 1 0.5 s 5.5 m prefetch.net
.....

Host/Domain Summary: Messages Received
msg cnt bytes host/domain
-------- ------- -----------
3094 22088k server1.prefetch.net
.....

Senders by message count
------------------------
2288 badperson1@prefetch.net
1630 badperson2@prefetch.net
.....

Recipients by message count
---------------------------
4415 donotspam1@prefetch.net
3046 donotspam2@prefetch.net
.....

Senders by message size
-----------------------
15786k badperson1@prefetch.net
13414k badperson2@prefetch.net
.....

Recipients by message size
--------------------------
29981k donotspam1@prefetch.net
25492k donotspam2@prefetch.net
.....

Messages with no size data
--------------------------
D07FA650 fubar@prefetch.net
.....

message deferral detail
-----------------------
smtp (total: 3)
2 25: Connection timed out
1 lost connection with prefetch.net[1.1.1.1...

message bounce detail (by relay): none

message reject detail: none

message reject warning detail: none

message hold detail: none

message discard detail: none

smtp delivery failures: none

Warnings
--------
smtpd (total: 14)
14 non-SMTP command from unknown[1.1.1.1]: To:
trivial-rewrite (total: 430)
430 database /etc/postfix/transport.db is older than source file /e...

Fatal Errors: none

Panics: none

Master daemon messages: none

I can’t tell you how many times I’ve run this script and seen something in the output that left me wondering WTF?!? Several times I’ve been able to use the output to nip a problem in the bud before it became a major issue. The output is also handy for developing additional monitoring for your SMTP services. If you are using Postfix, you should definitely check out pflogsumm. You will be glad you did!
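
One way to build that kind of extra monitoring is to pull a single counter out of the Grand Totals block and compare it against a limit. What follows is only a rough sketch and is not part of pflogsumm: the function names grand_total and check_total, the limit of 50 and the /tmp/report.txt path are made up for illustration, and it assumes the report keeps the "count label" layout shown in the Grand Totals section above.

```shell
#!/bin/sh
# Hypothetical helper (not part of pflogsumm): read a pflogsumm report on
# stdin and print the Grand Totals count for one label, such as
# "received", "deferred", "bounced" or "rejected".
grand_total() {
    awk -v label="$1" '
        /^Grand Totals/ { in_totals = 1; next }
        /^Per-(Day|Hour) Traffic/ || /^Host\/Domain Summary/ { in_totals = 0 }
        in_totals && $2 == label { print $1; exit }
    '
}

# Hypothetical check: warn when a counter is above a limit.
# Usage: check_total LABEL LIMIT REPORT_FILE
check_total() {
    count=$(grand_total "$1" < "$3")
    if [ -z "$count" ]; then
        echo "UNKNOWN: no '$1' line found"
        return 3
    fi
    if [ "$count" -gt "$2" ]; then
        echo "WARNING: $count messages $1 (limit $2)"
        return 1
    fi
    echo "OK: $count messages $1"
}

# Example (paths and limit are assumptions):
#   zcat /var/log/maillog-20110710.gz | pflogsumm.pl > /tmp/report.txt
#   check_total deferred 50 /tmp/report.txt
```

The return codes are arbitrary; pick whatever your cron wrapper or monitoring system expects, and tune the limits to what is normal for your relays.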

This article was posted by Matty on 2011-07-13 13:50:00 -0400