Postfix Stress-Dependent Configuration


Overview

This document describes the symptoms of Postfix SMTP server overload. It presents permanent main.cf changes to avoid overload during normal operation, and temporary main.cf changes to cope with an unexpected burst of mail. It makes specific suggestions for Postfix 2.5 and later, which support stress-adaptive behavior, and for earlier Postfix versions that don't.

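Topics covered in this document:

  * Symptoms of Postfix SMTP server overload
  * Service more SMTP clients at the same time
  * Spend less time per SMTP client
  * Disconnect suspicious SMTP clients
  * Temporary measures for older Postfix releases
  * Automatic stress-adaptive behavior
  * Detecting support for stress-adaptive behavior
  * Forcing stress-adaptive behavior on or off
  * Other measures to off-load zombies
  * Credits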

Symptoms of Postfix SMTP server overload

Under normal conditions, the Postfix SMTP server responds immediately when an SMTP client connects to it; the time to deliver mail is noticeable only with large messages. Performance degrades dramatically when the number of SMTP clients exceeds the number of Postfix SMTP server processes. When an SMTP client connects while all Postfix SMTP server processes are busy, the client must wait until a server process becomes available.

SMTP server overload may be caused by a surge of legitimate mail (example: a DNS registrar opens a new zone for registrations), by mistake (mail explosion caused by a forwarding loop) or by malice (worm outbreak, botnet, or other illegitimate activity).

Symptoms of Postfix SMTP server overload are:

Legitimate mail that doesn't get through during an episode of Postfix SMTP server overload is not necessarily lost. It should still arrive once the situation returns to normal, as long as the overload condition is temporary.

Service more SMTP clients at the same time

One measure to avoid the "all server processes busy" condition is to service more SMTP clients simultaneously. For this you need to increase the number of Postfix SMTP server processes, either with the default_process_limit parameter in main.cf or with the maxproc field of the service's master.cf entry. This improves responsiveness for remote SMTP clients, as long as the server machine has enough hardware and software resources to run the additional processes, and as long as the file system can keep up with the additional load.

Spend less time per SMTP client

When increasing the number of SMTP server processes is not practical, you can improve Postfix server responsiveness by eliminating delays. When Postfix spends less time per SMTP session, the same number of SMTP server processes can service more clients in a given amount of time.
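
As rough arithmetic: each SMTP server process handles one session at a time, so the number of sessions served per minute is at most 60 times the process count divided by the average session time in seconds. The sketch below is an illustration only; sessions_per_minute is a hypothetical helper, and the process count and session durations are made-up numbers, not measurements.

```shell
# sessions_per_minute PROCESSES SECONDS_PER_SESSION
# Hypothetical helper: upper bound on sessions served per minute when
# every process handles one session at a time (integer arithmetic).
sessions_per_minute() {
    echo $(( 60 * $1 / $2 ))
}

sessions_per_minute 100 30    # 100 processes, 30s per session: prints 200
sessions_per_minute 100 10    # same processes, 10s per session: prints 600
```

With the process count unchanged, cutting the average session time to a third triples the number of clients that can be served.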

Disconnect suspicious SMTP clients

Under conditions of overload you can improve Postfix SMTP server responsiveness by hanging up on suspicious clients, so that other clients get a chance to talk to Postfix.

More information about automatic stress-adaptive behavior is in section "Automatic stress-adaptive behavior".

Temporary measures for older Postfix releases

See the next section, "Automatic stress-adaptive behavior", if you are running Postfix version 2.5 or later, or if you have applied the source code patch for stress-adaptive behavior from the mirrors listed at http://www.postfix.org/download.html.

The following measures can be applied temporarily during overload. Most legitimate clients can still connect and send mail, but some legitimate clients may be affected.

/etc/postfix/main.cf:
    smtpd_timeout = 10
    smtpd_hard_error_limit = 1
    smtpd_junk_command_limit = 1

With these measures, no mail should be lost, as long as these measures are used only temporarily. The next section of this document introduces a way to automate this process.
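
Because these settings are meant to be temporary, it helps to prepare both the change and its reversal. The following is a minimal sketch, not a supported tool: overload_commands is a hypothetical helper that only prints postconf(1) and postfix(1) commands for review, and the "off" values assume the site otherwise runs with 300s, 20 and 100. Adjust them if your main.cf uses different values.

```shell
# overload_commands on|off
# Hypothetical helper: print (do not run) the commands that apply or
# revert the temporary overload settings.
overload_commands() {
    case "$1" in
    on)  set -- smtpd_timeout=10 smtpd_hard_error_limit=1 smtpd_junk_command_limit=1 ;;
    # Assumption: these are the values the site normally uses.
    off) set -- smtpd_timeout=300s smtpd_hard_error_limit=20 smtpd_junk_command_limit=100 ;;
    *)   echo "usage: overload_commands on|off" >&2; return 1 ;;
    esac
    for setting in "$@"; do
        echo "postconf -e $setting"
    done
    echo "postfix reload"
}

overload_commands on     # review the output, then run it as root
```

Printing the commands instead of executing them keeps the sketch safe to try on a production host; run "overload_commands off" once the burst has passed.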

Automatic stress-adaptive behavior

Postfix version 2.5 introduces automatic stress-adaptive behavior. This is also available as a source code patch for Postfix versions 2.4 and 2.3 from the mirrors listed at http://www.postfix.org/download.html.

It works as follows. When a "public" network service such as the SMTP server runs into an "all server ports are busy" condition, the Postfix master(8) daemon logs a warning, restarts the service (without interrupting existing network sessions), and runs the service with "-o stress=yes" on the server process command line:

80821  ??  S      0:00.24 smtpd -n smtp -t inet -u -c -o stress=yes

Normally, the Postfix master(8) daemon runs such a service with "-o stress=" on the command line (i.e. with an empty parameter value):

83326  ??  S      0:00.28 smtpd -n smtp -t inet -u -c -o stress=

Services that have local access only never have "-o stress" parameters on the command line. This includes services internal to Postfix such as the queue manager, and services that listen on a loopback interface only, such as after-filter SMTP services.

The "stress" parameter value is the key to making main.cf parameter settings stress adaptive. The following settings are the default with Postfix 2.6 and later. With earlier Postfix versions that have stress-adaptive support, append the lines below to the main.cf file and issue a "postfix reload" command:

smtpd_timeout = ${stress?10}${stress:300}s
smtpd_hard_error_limit = ${stress?1}${stress:20}
smtpd_junk_command_limit = ${stress?1}${stress:100}

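Translation:

  * Under conditions of stress, use an smtpd_timeout value of 10 seconds instead of 300 seconds.
  * Under conditions of stress, use an smtpd_hard_error_limit of 1 instead of 20.
  * Under conditions of stress, use an smtpd_junk_command_limit of 1 instead of 100.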

The syntax of ${name?value} and ${name:value} is explained at the beginning of the postconf(5) manual page.
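
To illustrate how the two conditional forms combine, here is a minimal sketch that mimics the expansion with sed. It is not Postfix code: expand_stress is a hypothetical helper that understands only ${stress?value} and ${stress:value} without nesting, which is all the settings above need.

```shell
# expand_stress TEMPLATE STRESS_VALUE
# Hypothetical helper that mimics only these two forms:
#   ${stress?value}  -> "value" when stress is non-empty, else nothing
#   ${stress:value}  -> "value" when stress is empty, else nothing
expand_stress() {
    if [ -n "$2" ]; then
        printf '%s\n' "$1" |
            sed -e 's/\${stress?\([^}]*\)}/\1/g' -e 's/\${stress:[^}]*}//g'
    else
        printf '%s\n' "$1" |
            sed -e 's/\${stress?[^}]*}//g' -e 's/\${stress:\([^}]*\)}/\1/g'
    fi
}

expand_stress '${stress?10}${stress:300}s' yes   # prints 10s
expand_stress '${stress?10}${stress:300}s' ''    # prints 300s
```

Exactly one of the two forms yields text for any given stress value, so placing them side by side gives an "if stressed, else" selection.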

NOTE: Please keep in mind that the stress-adaptive feature is a fairly desperate measure to keep some legitimate mail flowing under overload conditions. If a site is reaching the SMTP server process limit when there isn't an attack or bot flood occurring, then either the process limit needs to be raised or more hardware needs to be added.

Detecting support for stress-adaptive behavior

To find out if your Postfix installation supports stress-adaptive behavior, use the "ps" command, and look for the smtpd processes. Postfix has stress-adaptive support when you see "-o stress=" or "-o stress=yes" command-line options. Remember that Postfix never enables stress-adaptive behavior on servers that listen on local addresses only.

The following example is for FreeBSD or Linux. On Solaris, HP-UX and other System-V flavors, use "ps -ef" instead of "ps ax".

$ ps ax | grep smtpd
83326  ??  S      0:00.28 smtpd -n smtp -t inet -u -c -o stress=
84345  ??  Ss     0:00.11 /usr/bin/perl /usr/libexec/postfix/smtpd-policy.pl

You can't use postconf(1) to detect stress-adaptive support. The postconf(1) command ignores the existence of the stress parameter in main.cf, because the parameter has no effect there. Command-line "-o parameter" settings always take precedence over main.cf parameter settings.

If you configure stress-adaptive behavior in main.cf when it isn't supported, nothing bad will happen. The processes will run as if the stress parameter always has an empty value.
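
The ps check above can be scripted. The sketch below is an illustration under stated assumptions: stress_state is a hypothetical helper, the labels it prints are made up, and it recognizes SMTP server processes by the "smtpd -n " text seen in the sample output, which may look different on your system.

```shell
# stress_state: read "ps" output on stdin and classify the smtpd processes.
# Hypothetical helper. The [d] in the pattern keeps the awk process itself
# from matching when the input comes from a live "ps" listing.
stress_state() {
    awk '
        /smtp[d] -n / {
            seen = 1
            if ($0 ~ /-o stress=yes/)   stressed = 1
            else if ($0 ~ /-o stress=/) normal = 1
        }
        END {
            if (stressed)    print "overload"
            else if (normal) print "normal"
            else if (seen)   print "no-stress-option"
            else             print "no-smtpd"
        }'
}

# Demonstration with the sample lines from this document.
printf '%s\n' \
    '83326  ??  S      0:00.28 smtpd -n smtp -t inet -u -c -o stress=' \
    '84345  ??  Ss     0:00.11 /usr/bin/perl /usr/libexec/postfix/smtpd-policy.pl' |
    stress_state          # prints: normal

# Real use: ps ax | stress_state     (ps -ef on System-V flavors)
```

A result of "no-stress-option" is ambiguous: it can mean that the installation lacks stress-adaptive support, or that the only SMTP servers running listen on local addresses.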

Forcing stress-adaptive behavior on or off

You can manually force stress-adaptive behavior on by adding a "-o stress=yes" command-line option in master.cf. This can be useful for testing overrides on the SMTP service. Issue "postfix reload" to make the change effective.

Note: setting the stress parameter in main.cf has no effect for services that accept remote connections.

/etc/postfix/master.cf:
    # =============================================================
    # service type  private unpriv  chroot  wakeup  maxproc command
    # =============================================================
    #
    smtp      inet  n       -       n       -       -       smtpd
        -o stress=yes
        -o . . .

To permanently force stress-adaptive behavior off with a specific service, specify "-o stress=" on its master.cf command line. This may be desirable for the "submission" service. Issue "postfix reload" to make the change effective.

Note: setting the stress parameter in main.cf has no effect for services that accept remote connections.

/etc/postfix/master.cf:
    # =============================================================
    # service type  private unpriv  chroot  wakeup  maxproc command
    # =============================================================
    #
    submission inet n       -       n       -       -       smtpd
        -o stress=
        -o . . .
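
To review which services carry an explicit override, a master.cf file can be scanned for "-o stress=" options. This is a minimal sketch under stated assumptions: stress_overrides is a hypothetical helper, service entries are expected to start in the first column with continuation lines indented, and only options written as "-o stress=value" are recognized.

```shell
# stress_overrides: read master.cf text on stdin and print each service
# that has an explicit "-o stress=..." option. Hypothetical helper.
stress_overrides() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }     # skip comments and blank lines
        /^[^ \t]/ { svc = $1 "/" $2 }         # new service entry: name/type
        {
            for (i = 1; i < NF; i++)
                if ($i == "-o" && $(i + 1) ~ /^stress=/)
                    print svc, $(i + 1)
        }'
}

# Demonstration with entries like the ones in this section.
stress_overrides <<'EOF'
smtp      inet  n       -       n       -       -       smtpd
    -o stress=yes
submission inet n       -       n       -       -       smtpd
    -o stress=
EOF
# prints:
#   smtp/inet stress=yes
#   submission/inet stress=

# Real use: stress_overrides < /etc/postfix/master.cf
```

Services that do not appear in the output have no override in master.cf.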

Other measures to off-load zombies

OpenBSD spamd implements a daemon that handles all connections from "new" clients. Only well-behaved mail clients are allowed to talk to the mail server. Other clients are tarpitted, and will never get a chance to affect mail server performance.

At some point in the future, Postfix may come with a simple front-end daemon that does basic greylisting and pipelining detection to keep zombies and other ratware away from Postfix itself. This would use the "pass" service type, which has been available in stable Postfix releases since Postfix 2.5.

Credits