A few things have recently come together for me. First, Andrew Leonard recently penned an interesting column on spam-blocking technology for Salon; then Jennifer Lee wrote another interesting article for The New York Times. Finally, I made use of a brief free trial of McAfee’s SpamKiller software. I’ve also just been doing a lot of thinking lately about what needs to be done to seriously address the rising tide of spam that is flooding most everyone’s inbox. Spam e-mail has been getting worse over the past several years, and it’s been getting worse at an accelerating pace. If we don’t want Internet communications to become simply worthless from being drowned by spam e-mail, then we have to rethink our basic model for e-mail so that spammers can no longer take advantage of the system’s architecture to overwhelm legitimate messages with their crap. Lee’s article shows a good grasp of the problem and why anti-spam legislation won’t do much to solve it. Leonard’s has a good grasp on the overall technological shift needed to address the problem, but he doesn’t push the envelope nearly enough in the kind of framework that needs to be accomplished.
Leonard’s article describes the development of SpamAssasin, an open source spam blocker being adopted and improved by many system administrators. Leonard points out that the collaborative effort between legions of dedicated spam-fighters can greatly improve the ability of the software to identify spam messages. As Leonard puts it, The only way to stem the flood of unwanted e-mail may be to harness a million eyeballs and an army of open-source hackers.
There’s an intuitive reason why this should be the case. Obviously, by harnessing the efforts of thousands of administrators who ferociously hate spam, it will get a big boost in productive energy. But that’s not all.
The basic problem is this: under the present e-mail architecture, the spam market works. It works phenomenally well, and especially well for the seedier side of online industries, in particular pornography and sex-related products, which can’t advertise through conventional media (other than other porn outlets) and don’t have any financial interest in maintaining a reputation as a friendly corporate citizen. The reasons are inherent features of the e-mail architecture:
It costs nearly nothing to send spam: once you have an Internet connection set up (which you’ll need for your product’s website, anyway), it costs virtually nothing to send out scads and scads of spam e-mail. Labor costs can be reduced to nill by feeding addresses from a web crawler into an automated spamming program. This is a fundamental reversal from direct mail and telemarketing, where a fixed cost for contacting a person is borne by the advertiser.
Lots of people see it: If you send out a spam message to a huge group of people, then most of the people you send it to will see it. In part, this is because e-mail is a durable medium, like direct mail or fax, and unlike the telephone, so if you send a message while the user is away, they still get it. It’s also due to the relatively primitive state of message sorting and spam filtering–users have very little control over the order and priority with which messages appear in their inboxes, so to get to the mesages they want, they generally have to wade through, or at least scan over, any spam that they get.
It’s hard to track offenders. Many comparisons have been drawn between spam e-mail and the junk faxes
whose rising costs spurred a federal law against them in 1991. The two are alike in that advertisers get a basically free contact, while victims are stuck with the primary costs (in paper, bandwidth, time, what have you) of the interactions. However, there is a crucial difference: junk faxes can easily be tracked to their perpetrator through phone company records. Offenders can be blocked and identified for legal action. Spam e-mails, on the other hand, are generally very difficult to track to their originators. Headers can easily be forged, server relays can be found to use, one-time-only addresses created with free services, work can be farmed out to mule
computer users, who are paid a small amount to send out a huge volume of messages, and then take the fall if they get caught. The anonymity of e-mail and its reliance on the honor system for identifying senders makes spam very difficult to flag and filter.
When we look at all these factors, we begin to see that we need a comprehensive solution which will work to address these structural holes. We cannot rely on anti-spam legislation, since spammers will merely relocate to different states or different countries, and use the anonymity of the communication to further shield themselves. Spam is only going to get worse until we have mass deployment of an easy-to-learn, easy-to-use, agile framework which harnesses both human intelligence and high-quality, flexible technological solutions to make legitimate email easier to access and identifies and deals with spam.
Unfortunately, most anti-spam solutions fail, because they are focused narrow-mindedly on a single goal–the goal of accumulating as many heuristic rules as possible to identify and kill
spam (this is reflected in the names–McAfee’s SpamKiller,
SpamAssasin,
and so on. The most common and most maddening manifestation of this is scorched-earth spam programs such as SpamKiller, which works entirely by accumulating thousands and thousands of rules to try to identify common patterns in the way that spam messages are written or addressed. These do indeed catch a lot of spam, but they also slam perfectly legitimate e-mail. For example, my decision to uninstall SpamKiller was finalized when I saw it was trashing legitimate e-mails because a filter (one of thousands, which took lots of scrolling to find) was killing messages because they contained the word rape.
Now, look, folks, I’m pretty much physically nauseated by some of the spam ads I’ve received for rape-fetish pornography sites. But I’m an anti-rape activist, and I receive tons of perfectly legitimate e-mail with the word rape
in it. SpamKiller’s approach to spam is like trying to kill a swarm of mosquitoes with a cluster bomb, and plenty of perfectly innocent messages were getting clobbered.
The problem here is that most people who work on spam-blocking software and most of those who purchase it are basically in the frame of mind of trying to get rid of a source of long-term and maddening irritation. Programs tend to be reactively focused on axing spam by any means necessary, rather than proactively focused on improving the e-mail user’s experience. But if we keep our mind on what users need and want, rather than what gives us the temporary satisfaction of the kill, then we should begin to see a bit more clearly what needs to be done.
To reduce the effectiveness of spam, first spam management software needs to be widespread, usable, and respectful of user’s legitimate e-mail. With millions of users employing software that lets them take control of their own inboxes, users will be able to stay on top of their legitimate e-mail and sidestep the spam. Information for identifying spam should come from automated reports that millions of users submit: when a spam slips through, the recipient presses one button in the mail client and it is registered as a spam message so that no-one else receives it (SpamAssassin uses Vipul’s Razor, a system which does just this, but it needs to be integrated into easy to use clients, not just arcane Unix mail filters).
Second, we need to plug the anonymity hole through use of double-key authentication and encryption of e-mail. E-mail clients could prioritize messages which can be verified as coming from a valid address, and also messages which are encrypted for the recipient’s eyes only. Spammers who want their messages seen would have to separately acquire a public key for, and encrypt the message for every intended recipient. For millions of e-mail addresses, that’s an awful lot of extra processor time, network bandwidth, and human labor that the spammer has to pay for. Furthermore, the spammer’s PGP signature or signatures can be blacklisted as quickly as the spams start going out.
Finally, system administrators at big ISPs need to get responsible. One of the biggest conduits for spam open relays, poorly configured mail servers which allow anyone on the Internet to send e-mail through the server by forging headers to pose as a machine on the server’s network. System administrators need to get serious about ensuring that connections are only accepted from authenticated users or legitimate machines on the ISP’s own subnet. And when spam is being sent by a user, they need to be quick about axing that user’s account.
What you can do now:
You can do some things now, both short-term and long-term, to keep yourself from being overwhelmed and work towards an Internet not being drowned in spam.
Use shield accounts for online commerce. A lot of high-end spamhouses harvest addresses by buying them from merchants such as Amazon.com. For online interactions which won’t be anything other than perfunctory receipts, it’s good to maintain a shield account (say, diespammersdie@hotmail.com or somesuch) as the address through which you interact with online stores.
Download and use PGP. You can get PGP — a great security program which will let you securely sign messages (so that the recipient can verify your identity) and/or encrypt messages (so that only the recipient can read them). The Windows version of PGP automates the process of creating and using PGP keys, and has plugins for popular Windows e-mail clients which let you use simple pushbuttons for its functions. PGP will make your e-mail more secure, and also help build an Internet environment where spammers can no longer hide behind forged headers to conceal their identities.
Look for solid anti-spam software that suits you. If you can find spam management software which suits your needs, grab it! If you’re willing to geek around a lot, SpamAssasin looks very good. Better yet, Deersoft is in the process of developing SpamAssassin Pro, a commercial product for Windows based on the SpamAssassin engine and integrated with your mail client. Unfortunately, most spam management software I’ve tried (e.g., SpamKiller) is crap.
More tips: Jennifer Lee’s article is accompanied by some tips for avoiding spam, some of which I agree with, and others of which I don’t. Unfortunately, the present spam-heavy environment is encouraging a lot of people to take up measures which cut down spam at the expense of breaking human usability of the e-mail system. Lee suggests using complex e-mail addresses, which do thwart spammers who use dictionary searches on mail services, but which also makes it hard for your friends to remember your e-mail address. She also suggests removing your e-mail from any online directories in which it may be included, which will again thwart spammers but also keep people from being able to reach you. I totally disagree with this method of spam filtering. Again, it amounts to protecting
your inbox at the cost of shredding real people’s ability to contact you. Nevertheless, some of her suggestions (such as disposable forwarding accounts for use on Usenet and bulletin boards) are solid.