If you have had an email address for a while, you’ll know that spam is almost inevitable. Once an email address has been used, spam will find its way there sooner or later. Combating email spam has also become one of the most researched topics these days.

Greylisting is a relatively new method to use against spam, and its principle is very different from the traditional content filtering/content classifying strategy. The difference makes it very effective at stopping spam at the moment, while using relatively little CPU time and having a small memory footprint. It is by no means a replacement for filter/classifier-based spam protection, but it can easily be deployed as a first level of defense to reduce the CPU/memory demand of your spam filters.

This article was written after successfully implementing greylisting on Postfix using Gld, on a memory-constrained VPS running Gentoo Linux. Hopefully it will be useful to those who are thinking of implementing a similar solution.

What is Greylisting?

What makes greylisting different from other filter/classifier-based defenses, like SpamAssassin and SpamBayes? From the Greylisting website:

In name, as well as operation, greylisting is related to whitelisting and blacklisting. What happen is that each time a given mailbox receives an email from an unknown contact (ip), that mail is rejected with a “try again later”-message (This happens at the SMTP layer and is transparent to the end user). This, in the short run, means that all mail gets delayed at least until the sender tries again – but this is where spam loses out! Most spam is not sent out using RFC compliant MTAs; the spamming software will not try again later.

As most spam is sent out by bots/dumb SMTP clients, they do not bother to try again after the SMTP server replies with a temporary failure, whereas a properly implemented MTA would keep retrying for a certain amount of time. Greylisting therefore blocks the majority of spam without actually scanning its contents.
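The mechanism described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Gld's actual code: the `Greylist` class, its `check` method and the 60-second `min_delay` are names and values assumed for this example.

```python
import time


class Greylist:
    """Toy in-memory greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay=60):
        self.min_delay = min_delay   # seconds a new triplet must wait
        self.first_seen = {}         # triplet -> time of its first attempt

    def check(self, ip, sender, recipient, now=None):
        """Return 'defer' (temporary failure, try later) or 'accept'."""
        now = time.time() if now is None else now
        triplet = (ip, sender.lower(), recipient.lower())
        # Remember when this triplet was first seen (no-op if already known).
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.min_delay:
            return "defer"    # unknown triplet, or retried too soon
        return "accept"       # the sender came back after the delay


gl = Greylist(min_delay=60)
print(gl.check("192.0.2.1", "a@example.org", "me@example.net", now=0))    # defer
print(gl.check("192.0.2.1", "a@example.org", "me@example.net", now=30))   # defer
print(gl.check("192.0.2.1", "a@example.org", "me@example.net", now=300))  # accept
```

A spam bot that never retries only ever sees the first "defer", while a real mail server that retries after the delay gets through.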

For example, the following chart shows the number of spam messages caught by SpamAssassin per day on my mail server, which serves 3 real users. At the beginning of August I set up greylisting, and the number of spam messages per day dropped dramatically. That means I only need to run one SpamAssassin instance (which is huge at around 30 MB RSS per process), and it does not need to work that hard analysing incoming emails.

[Chart: spam per day before and after greylisting]

Pretty obvious, isn’t it? Now, let’s get into how to set it up.

Requirements

You need the following packages installed: Postfix, MySQL and Gld.

Since I was setting it up on Gentoo Linux, I will use emerge to install these packages. In my case, MySQL is used to store the greylisting database.

# emerge -auv postfix mysql gld

This will download, compile and install these packages if they haven’t been installed yet. Your mileage may vary if you are using a different distribution.

Setting up Gld

First of all, you need to create a MySQL database for Gld. Start MySQL if it hasn’t been started, and let’s create a new database called gld.

$ mysql -uroot -p
mysql> CREATE DATABASE gld;
mysql> GRANT ALL PRIVILEGES ON `gld`.* to gld@localhost
       IDENTIFIED BY '<gld password>';
mysql> ALTER DATABASE `gld` DEFAULT CHARACTER SET latin1;
mysql> USE gld
mysql> \. /usr/share/gld/sql/tables.mysql

Note: on Gentoo with MySQL 4.1 onwards, the default character set is UTF8, which prevents the greylisting table from being created because the index key becomes too large. It is therefore necessary to set the database charset to “latin1” before populating the tables.
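The arithmetic behind that error can be sketched as follows. The column widths below are assumptions for illustration (check tables.mysql for the real definitions), as is the assumption that the table uses MyISAM, whose index keys are limited to 1000 bytes. MySQL's utf8 reserves up to 3 bytes per character, latin1 only 1.

```python
MYISAM_MAX_KEY_BYTES = 1000   # MyISAM limit on the total length of an index key

# ASSUMPTION: primary key over (ip, sender, recipient) with these CHAR widths.
KEY_COLUMN_CHARS = {"ip": 16, "sender": 242, "recipient": 242}

# Maximum bytes MySQL reserves per character for each character set.
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3}


def key_bytes(charset):
    """Worst-case size in bytes of the composite key for a character set."""
    return sum(KEY_COLUMN_CHARS.values()) * BYTES_PER_CHAR[charset]


for charset in ("latin1", "utf8"):
    size = key_bytes(charset)
    verdict = "fits" if size <= MYISAM_MAX_KEY_BYTES else "too large"
    print(f"{charset}: {size} bytes ({verdict})")
# latin1: 500 bytes (fits)
# utf8: 1500 bytes (too large)
```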

Now, open the Gld configuration file /etc/gld.conf and start working through it. The default settings are pretty sane, and you only need to change the database configuration at the very end of the file.

#
## SQL INFOS (defaults are localhost,myuser,mypasswd,mydb)
#
#SQLHOST=localhost
SQLUSER=gld
SQLPASSWD=<gld password>
SQLDB=gld

You can then start gld and add it to the default runlevel so that it starts at boot.

# /etc/init.d/gld start
# rc-update add gld default

Gld comes with a default whitelist that you should consider using. If an email originates from one of the whitelisted addresses, it will be let through without greylisting. To populate the default whitelist:

# mysql -u root -p gld < /usr/share/gld/sql/table-whitelist.sql

Setting up Postfix

Setting up a Postfix mail server is really outside the scope of this article, and is left as an exercise for the reader. I have the following lines in my /etc/postfix/main.cf to enable SASL authentication, manual whitelist/blacklist and greylisting. Note that reject_unauth_destination must come before the access tables; otherwise a PERMIT entry such as postmaster@ would let anyone relay mail through the server to that address at any domain.

smtpd_recipient_restrictions =
    reject_unauth_pipelining,
    reject_non_fqdn_recipient,
    reject_unknown_recipient_domain,
    permit_mynetworks,
    permit_sasl_authenticated,
    reject_unauth_destination,
    check_recipient_access hash:/etc/postfix/recv_access,
    check_client_access hash:/etc/postfix/client_access,
    check_policy_service inet:127.0.0.1:2525,
    permit
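The check_policy_service line hands each RCPT decision to the daemon listening on port 2525. Postfix's policy delegation protocol is simple: Postfix sends name=value lines terminated by an empty line, and the service answers with a single action= line followed by an empty line. The sketch below illustrates that exchange only. The function names, the `known_triplets` set and the reply text are made up for this example and are not taken from Gld.

```python
def parse_policy_request(text):
    """Parse 'name=value' lines (ended by an empty line) into a dict."""
    attrs = {}
    for line in text.splitlines():
        if not line:
            break                      # an empty line ends the request
        name, _, value = line.partition("=")
        attrs[name] = value
    return attrs


def policy_reply(attrs, known_triplets):
    """Build the reply Postfix expects: one action line plus an empty line."""
    triplet = (attrs.get("client_address"),
               attrs.get("sender"),
               attrs.get("recipient"))
    if triplet in known_triplets:
        action = "dunno"               # no opinion; Postfix moves to the next restriction
    else:
        action = "defer_if_permit Greylisted, please try again later"
    return f"action={action}\n\n"


request = (
    "request=smtpd_access_policy\n"
    "protocol_state=RCPT\n"
    "client_address=192.0.2.1\n"
    "sender=a@example.org\n"
    "recipient=me@example.net\n"
    "\n"
)
attrs = parse_policy_request(request)
print(policy_reply(attrs, known_triplets=set()), end="")
```

Because the service answers "dunno" rather than "permit" for known triplets, the final permit in the restriction list is what actually accepts the mail.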

The configuration files /etc/postfix/recv_access and /etc/postfix/client_access allow manual blacklisting/whitelisting before the automated greylisting. For example, if I wish my secondary MX (a free service at The Roller Network) to always be whitelisted, I can have the following entries in my client_access file.

# mail.rollernet.us mail2.rollernet.us
208.11.75.2  PERMIT
216.90.171.2 PERMIT

Run postmap /etc/postfix/client_access to create the hash file. The same applies to the recipient whitelist/blacklist in /etc/postfix/recv_access. For example, to allow some critical addresses to always pass through without greylisting:

postmaster@  PERMIT
hostmaster@  PERMIT
abuse@       PERMIT

Note that I do not use any Realtime Blackhole Lists (RBLs) in my recipient restrictions, as I have personally found that they often produce false positives. Recently the Gmail SMTP servers were blocked by SpamCop and ORDB, and I had no idea how many legitimate emails I had rejected until someone contacted me via a web form on this site. This does allow open-relay mail servers to leak spam through the greylist, but I would rather accept that than lose valuable mail.

Now, reload Postfix to activate the changes.

# /etc/init.d/postfix reload

Now, enjoy less spam and fewer viruses!

Tuning Greylisting

Gld comes with quite a few configuration options, which can be found in /etc/gld.conf. The default configuration works very well.

However, in the original greylisting whitepaper, there are three values to consider when dealing with (from, to, ip) triplets:

  1. Initial delay of a previously unknown triplet: 1 Hour
  2. Lifetime of triplets that have not yet allowed a mail to pass: 4 Hours
  3. Lifetime of auto-whitelisted triplets that have allowed mail to pass: 36 Days

Gld only allows you to change (1), which is MINTIME in the configuration file and defaults to 60 seconds, far shorter than the one hour suggested. I actually doubt that people are willing to wait one hour for the very first email from a new correspondent to be whitelisted, and the whitepaper itself notes that

The data collected during testing showed that more than 99% of the mail that was blocked with the tested setting of 1 hour would still have been blocked with a delay setting of only 1 minute.

Increase that to 5 minutes, 10 minutes or even one hour if you want, but I am leaving mine at 60 seconds.

What Gld does not provide is an option to expire the greylist (2) and the whitelist (3). Expiring the greylist is needed so that a spammer who hits the same address again much later is not mistaken for a legitimate retry, which would whitelist the record. I found 4 hours a bit short, as some properly implemented mail servers do not retry that often. Expiring the whitelist is also needed to keep the size of the database sane, as it purges one-off records.

Fortunately Gld keeps all its records in the database, and it is easy to write a script that expires those records. Here’s mine:

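#!/bin/bash

# Expire greylisted triplets that never retried (n=1) after 12 hours.
EXPIRE_GREYLIST=$((12 * 3600))

# Expire any triplet that has not been seen for 35 days.
EXPIRE_WHITELIST=$((35 * 86400))

# Take the current time once so both statements use the same reference.
NOW=$(date +%s)

# Assumes the invoking user can connect to MySQL without a password
# prompt (for example via ~/.my.cnf).
mysql gld <<EOF
DELETE FROM greylist WHERE last < $((NOW - EXPIRE_WHITELIST));
DELETE FROM greylist WHERE n = 1 AND first < $((NOW - EXPIRE_GREYLIST));
EOF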

Run this in a cron job (say once every hour) to keep the Gld greylist database clean.
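To see what the two DELETE statements do without touching a live database, the same rules can be tried against a throwaway table. The sketch below swaps MySQL for Python's built-in sqlite3 module so that it runs anywhere. The table is a simplified stand-in that only borrows the first, last and n columns used by the script, and the sample rows are invented.

```python
import sqlite3

EXPIRE_GREYLIST = 12 * 3600      # triplets that never retried: 12 hours
EXPIRE_WHITELIST = 35 * 86400    # triplets not seen at all: 35 days


def expire(conn, now):
    """Apply the two expiry rules; return the number of rows deleted."""
    cur = conn.cursor()
    # Column names are quoted because FIRST and LAST are SQLite keywords.
    cur.execute('DELETE FROM greylist WHERE "last" < ?',
                (now - EXPIRE_WHITELIST,))
    deleted = cur.rowcount
    cur.execute('DELETE FROM greylist WHERE n = 1 AND "first" < ?',
                (now - EXPIRE_GREYLIST,))
    deleted += cur.rowcount
    conn.commit()
    return deleted


def demo():
    now = 10_000_000                       # arbitrary "current" Unix time
    conn = sqlite3.connect(":memory:")
    # ASSUMPTION: simplified stand-in for Gld's greylist table.
    conn.execute('CREATE TABLE greylist (ip TEXT, "first" INTEGER, '
                 '"last" INTEGER, n INTEGER)')
    rows = [
        ("192.0.2.1", now - 3600, now - 3600, 1),              # new, may still retry: keep
        ("192.0.2.2", now - 86400, now - 86400, 1),            # never retried: delete
        ("192.0.2.3", now - 40 * 86400, now - 86400, 12),      # active sender: keep
        ("192.0.2.4", now - 90 * 86400, now - 40 * 86400, 5),  # silent for 40 days: delete
    ]
    conn.executemany("INSERT INTO greylist VALUES (?, ?, ?, ?)", rows)
    deleted = expire(conn, now)
    remaining = [r[0] for r in conn.execute("SELECT ip FROM greylist ORDER BY ip")]
    return deleted, remaining


print(demo())   # (2, ['192.0.2.1', '192.0.2.3'])
```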

Why am I still getting spams?

Greylisting does not eliminate all spam once and for all. As spammers smarten up, they are likely to find ways to work around it. In my case I am still getting a few spam messages every day, and they usually get quarantined by SpamAssassin, so none of them end up in my inbox. Analysing those that are filtered by SpamAssassin, I can conclude that:

  • Greylisting without RBLs cannot block spam from open-relay mail servers. Those servers are not bots, and they do retry if they fail to forward the spam to my inbox, so greylisting is of no use here.
  • Greylisting does not work with redirected emails. I have several email addresses on other services that redirect to my main inbox, and these servers have not implemented greylisting.
  • It is easier to spam through a secondary MX if it does not implement greylisting.
  • Some spammers do send from legitimate email servers.

Having a content filter helps to cover these shortcomings. I use SpamAssassin + Amavisd + ClamAV, which together almost eliminate the spam I receive every day.

Hopefully this article is useful. Comments, corrections and suggestions are welcome.