
We got a bumper crop this harvest, 'ma!

It used to be that spammers could nail an entire site using simple alias expansions that generated an e-mail to everyone who had a mailbox on a particular machine, but thankfully those days are mostly behind us (what version of Sendmail are you using, again?).  Now spammers need to come up with more clever ways to populate their databases.  A number of methods involve spiders (much like the Googlebot) crawling across the Internet and recording everything they see that looks like an e-mail address.  This isn't limited to websites.  Archived newsgroups and e-mail lists are prime targets.  There are even spiders that crawl Usenet and common message board software looking for easy pickings.

Besides these clever methods, you have the not terribly imaginative (yet quite effective) technique of simply buying lists from unscrupulous entities that market the personal data they collect.  A number of spammers also operate fake front sites that exist merely to get people to sign up and provide their e-mail address, then supply some weak content in return.  Even some well-known sites employ what are, in my opinion, very dishonest tactics to harvest e-mail addresses from unsuspecting visitors.  At least one version of the RealPlayer install program silently signed up the user for five spam lists.

All the above methods are fairly well known, and some effective ways of combating them are provided in the Spam threats section.  This section will concentrate specifically on SMTP-level harvesting attacks.  It's very easy to harvest addresses from an MTA that provides detailed information about specific addresses.  What I'm talking about is MTAs that will confirm or deny that they can deliver mail for a specific address via the VRFY or even the RCPT TO commands.  In an effort to stem the tide of "buckshot" spam (i.e. spam that is just blasted at a site in the hopes that some will go to a real address), some organizations have implemented recipient verification, often through an LDAP lookup.  While this eliminates spam messages going to addresses that don't exist, it also gives spammers a perfect way to build an exact directory of "good" addresses.

Usually you need to weigh the value of verifying recipients against the risk of having your directory harvested.  The way to make this decision easier is to ask yourself which is more likely: that spammers will specifically target your domain to harvest an address list, or that your machines will be overrun by a flood of spam going to accounts that don't exist.  In general, it's more likely that large or prominent organizations will be targeted for directory harvesting, while small and/or obscure domains might get bombed with a load of spam going to randomly generated addresses, and their systems may be overloaded by it.

Fortunately, in some cases you can have the best of both worlds.  I helped to design a system at Tumbleweed that actually detects and blocks directory harvest attacks (DHA) in real-time, and once blocking is activated for a particular source, that source's connections are immediately rejected (no need to queue the message to inspect it).  If you evaluate some other commercial product, see if it has similar capabilities.  Some will be vague in their response to VRFY and RCPT TO (confirming only the domains they receive mail for, not the individual accounts) but do a recipient check before actually relaying the mail.  In this way they can avoid having your directory harvested, but still prevent your groupware system from being bombarded by phony e-mail.  That method does concede that your SMTP gateway will queue phony messages to disk (thus using resources) before discarding them (whether that's important depends on the performance impact).
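
To make the real-time idea concrete, here is a minimal sketch of a per-source tracker.  This is not the Tumbleweed design, just one plausible way to do it: count unknown-recipient attempts per source IP inside a sliding time window, and mark the source as blocked once it crosses a threshold.  The class name, method names, and default thresholds are all assumptions made up for illustration.

```python
import time
from collections import defaultdict, deque


class HarvestTracker:
    """Hypothetical tracker: flags sources that hit too many unknown recipients."""

    def __init__(self, max_failures=10, window_seconds=60.0):
        # Both defaults are illustrative assumptions, not recommended values.
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # ip -> timestamps of failures
        self._blocked = set()

    def record_unknown_recipient(self, ip, now=None):
        """Call whenever a source names a recipient that does not exist."""
        now = time.monotonic() if now is None else now
        stamps = self._failures[ip]
        stamps.append(now)
        # Forget failures that have aged out of the window.
        while stamps and now - stamps[0] > self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_failures:
            self._blocked.add(ip)

    def is_blocked(self, ip):
        """Check at connection time, before any message is queued."""
        return ip in self._blocked
```

The gateway would call is_blocked() as soon as a connection arrives and drop it right there, which is what saves you from queuing anything.  A real deployment would also want blocks to expire after a while, which this sketch leaves out.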

So just how can this directory harvesting be prevented in detail?  Well, as outlined above, giving vague responses to VRFY and RCPT TO is an excellent start.  If you're using an Open Source relay, doing an after-the-fact LDAP lookup against your groupware system or central directory can allow you to silently drop messages for accounts that don't exist.  Of course, most relays probably don't give you that choice out of the box, but if you're sufficiently clever you could hack it into an MTA that you have source code for.  You can also implement tools to comb your logs looking for such attacks.
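
As a rough sketch of the "vague response" policy, the following shows the decisions involved rather than a working MTA.  The domain set and the mailbox set are hypothetical stand-ins (the mailbox set plays the role of the LDAP lookup), and the function names are my own.  The reply codes are the standard ones: 252 is the SMTP reply for "cannot VRFY user, but will accept message and attempt delivery".

```python
# Hypothetical configuration: the domains this gateway receives mail for.
LOCAL_DOMAINS = {"example.com"}

# Stand-in for an LDAP or groupware directory lookup (an assumption).
KNOWN_MAILBOXES = {"alice@example.com", "bob@example.com"}


def vrfy_reply(argument):
    """Never confirm or deny an individual mailbox in response to VRFY."""
    return "252 Cannot VRFY user, but will accept message and attempt delivery"


def rcpt_reply(address):
    """Confirm only that the domain is ours, not that the account exists."""
    domain = address.rpartition("@")[2].lower()
    if domain in LOCAL_DOMAINS:
        return "250 OK"
    return "550 Relaying denied"


def deliverable(recipients):
    """After-the-fact check: keep real mailboxes, silently drop the rest."""
    return [r for r in recipients if r.lower() in KNOWN_MAILBOXES]
```

The point is that rcpt_reply() gives the same answer for alice@example.com and for nobody@example.com, so a harvester learns nothing from it, while deliverable() still keeps the phony mail away from the groupware system.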

If your MTA logs contain data about the source IP of a connection and what commands it issued, you can build an anomaly detector that constantly scans your logs looking for an abnormal number of VRFY requests from a particular IP or network.  This script can generate alerts to your NOC, or inject a deny rule into your MTA's configuration or a packet filter blacklist.  Using LDAP for verification, you can get even more intelligent by correlating a large number of bogus recipient attempts with a particular domain, IP, or network and adding a blacklist entry based on those criteria.
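
A log scanner along those lines might look something like the sketch below.  The log format is purely an assumption (whitespace-separated fields: timestamp, source IP, SMTP command, then anything else), so the field positions would need adjusting for whatever your MTA actually writes.  The function name, the threshold, and the /24 grouping are also illustrative choices.

```python
import ipaddress
from collections import Counter


def find_vrfy_abusers(log_lines, threshold=20, prefix_len=24):
    """Return (ips, networks) whose VRFY count meets the threshold.

    Assumes each line looks like: "<timestamp> <source-ip> <command> ...".
    """
    per_ip = Counter()
    per_net = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 3 or fields[2].upper() != "VRFY":
            continue
        try:
            addr = ipaddress.ip_address(fields[1])
        except ValueError:
            continue  # skip lines whose second field is not an IP address
        per_ip[str(addr)] += 1
        if addr.version == 4:
            # Group IPv4 sources by network so spread-out attacks still show up.
            net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
            per_net[str(net)] += 1
    ips = sorted(ip for ip, n in per_ip.items() if n >= threshold)
    nets = sorted(net for net, n in per_net.items() if n >= threshold)
    return ips, nets
```

Whatever comes back can be fed to an alerting script or turned into deny rules.  Counting per network as well as per IP catches a harvester that rotates through several addresses in the same block.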

Last, if you're using a legacy UNIX system as your e-mail gateway, make sure to block finger requests.  One of the early methods of address harvesting was to simply issue many finger requests (which the MTA would not be aware of) in order to construct a directory.  There is generally no reason to have fingerd enabled on Internet-facing hosts.




This site © copyright 2003-2011 Brian Keefer.  Unauthorized republication is forbidden.