Posts filed under Spam

Article posted 26 October 2004 Technical Difficulties posted 12am

I’ve been set back a bit on my posting schedule, thanks to a delightful combination of factors: I was going to try to integrate TypeKey identification into my comments sections in order to help control spam; but first I found out that I’d have the joy of having to reformat my hard drive and get back up to speed from a clean install and, thank God, up-to-the-minute full backups. (It’s a tale of woe, but also a boring one. Don’t ask.)

TypeKey is, I think, close to completed. Immediate effects you may notice: (1) you can now sign in using your TypeKey identity when posting comments on individual entries; (2) if you don’t sign in using a TypeKey identity, your comments will be held up momentarily for moderation. It’s not that I don’t love you all, it’s just that spammers run amok here as elsewhere, and I’ve had to deal with one no-conscience asshole too many lately. If you do comment pretty frequently here, and you don’t have one already, it may speed things a bit for you if you sign up for a TypeKey identity (all you need is a valid e-mail address). If you don’t, that’s fine too–your comments will just go up as soon as I have the chance to screen them.

Let me know what you think about the changes; and forgive me if things are a little dusty in the comments department for the rest of the night. Actual content should be resuming within (I hope!) the next day.

Article posted 12 October 2004 Dear Porn-spamming Morons posted 11pm

Dear Porn-spamming Morons:

Just so you know, you are wasting your time.

Posting the URI of your website in the comments sections of my weblog does not increase your Google ratings, for even a split-second. My weblog software routes all links in the comments section through a redirection script; since search engines don’t see the address of your site anywhere on my pages, you get no Google-bombing benefit from it. If your aim is to boost the search engine ratings of your silly little wank sites, you are wasting your time.

Spam comments are usually deleted from my website within a matter of minutes. Your spam disappears and your IP address is banned from posting again. If your aim is to drive people who happen upon my website to your silly little wank sites, you are still wasting your time.

I won’t tell you to stop. It’s a little bit of annoyance for me to zap your comments as they come in, but I think that the pornography you are peddling is misogynist, pernicious, and ultimately very sad stuff. So I’m glad to put up with a little bit of annoyance, when I have the minor satisfaction of knowing that you are wasting your time, and your sponsors’ money, posting empty spam comments that spend 15 minutes or so doing nothing on a pedantic anarcha-feminist boy’s website before they are deleted. If you want to continue wasting your time, by all means do so.

But you are wasting your time. Just so you know.

Article posted 23 August 2003 A Brief Open Letter to an Anti-Semite posted 3pm / revised 8pm 25 March 2008

Do not worry, gentle reader: I’ll be wrapping up my discussion of recent events surrounding Roy Moore soon. First, however, I have a minor personal matter to get out of the way: a brief open letter to whoever it is who has been posting a number of comments lately to several unrelated entries on Geekery Today and my Letters to the Editor:

Dear Anti-Semitic Asshole:

I don’t know who you are, and I don’t know why you feel that my letters on the war in Afghanistan, prison overcrowding in Alabama, and other topics are crying out for incisive commentary like the following:

AMNESTY INTERNATIONAL or ZIONAL? ** – Are there a true Amnesty International office in Swedish Kingdom? – No!.. We have multi-faced Amnesty Bolaget in SvekJa Kingdom!.. – What does “Bolag” means? – Financial coup runs by the Evangelian Jewish lobbies… Administration serves for the Zionist Imperialism, if you listen to the insider analysis…

[… And so on, and so forth …]

Whatever your reasons, however, this web page is not the place for off-topic, anti-Semitic diatribes. While I want to provide an open forum for commentary, including those with whom I strongly disagree, your long-winded conspiracy theory postings about the International Jewish Conspiracy™’s long arms in Swedish affairs do not have anything at all to do with any past comments or with any of the content on the pages. They are nothing more than hit-and-run spam that is wasting perfectly good space in my disk quota. They have, therefore, been deleted.

For some time now, I’ve wanted to put together and post some of my thoughts on the issue of anti-Semitism on the Left, and the Israeli/Palestinian conflict. What I have to say consists mostly of warnings from more than one direction–I think we need to be very critically aware both of the way that charges of anti-Semitism are wrongly used to abuse and dismiss critics of Israeli government policy, and also of the dangerously uncritical and cavalier atmosphere that the Left generally and the anti-occupation movement particularly have begun to take towards real, mounting problems of anti-Semitism in the movement and in the world at large–problems that need to be seriously faced down and critically confronted if we intend to do any work for justice. However, thanks to you, my dear anti-Semite, I will have to put all that on hold for the time being, because you offer no opportunity for constructive engagement or critical dialogue. You offer only off-topic bullshit that wastes perfectly good disk space on my web host.

Please do not continue to spread this blight on my web-page. You can use the time for many other productive purposes, such as seriously examining the worrisome trend of anti-Semitism in the international Left, or reading more about what horrors sprang up when the dragon’s teeth of anti-Semitism were last sown across the European continent.

Sincerely,

Charles W. Johnson

Author and Editor of Geekery Today

Article posted 8 August 2002 The A-List: More On Spam Management posted 3am / revised 6pm 4 April 2007

Steve Outing recently published a column bemoaning the sorry state of spam filtering software today. I think his article goes off in entirely the wrong direction, but it touches on some important issues that I only raised in passing in my previous post on spam management, and which I would like to take the time to expand on.

E-mail is the killer app of the Internet. Nothing, not even web surfing (and I say this with all due respect to you, gentle reader) matters a whit compared to the vast significance that e-mail has had on our every-day communications. Not only has e-mail enabled us to get in one-on-one contact with (potentially) anyone anywhere in the world for virtually free, it has also created and nurtured the listserv, where any group of people–small or large, far-flung or local–can instantly, reliably, and painlessly communicate with one another and collaborate on projects, whether this takes the form of a one-to-many newsletter, or a many-to-many discussion group. The cost is often nothing more than some time and labor to administer the list, and the fixed costs of Internet access. This medium for cheap/free communications has created vital communities all over the Internet, more or less single-handedly made the open source movement possible, and changed the face of every form of communication, from news to activism to comic strips. But as Outing points out, there is something rotten in the state of Denmark. As he argues in his article, the wars between spammers and bad spam filters are beginning to seriously impair the usability of the medium.^*

How does this happen? Well, think about the nature of a listserv, for example. When you receive a message from a listserv, it is one of many–potentially thousands or even more–copies of the same message being sent out to many users. The From: header is an address that you may have never personally contacted, and it may even belong to a newsletter robot rather than a real person. The To: header may be the resender address rather than your address–making it appear forged. The message includes content such as “Remove me from this list,” a website to go to for more information on the list, and perhaps even routine fundraising appeals. It may also contain trip words such as “viagra” which are actually a part of ordinary conversation, but set off spam alerts. If you’re familiar with how spam filtering works, you’ll see that the M.O. of a listserv is well-nigh indistinguishable from the M.O. of a spammer as far as a spam filter can see. The only clear difference between spam and legitimate listserv traffic is that you have voluntarily opted in for the listserv traffic, and you can voluntarily opt out; whereas spammers don’t care what you think or what you want, and barrage you with their messages without asking you for permission and without caring whether or not you’ve told them not to. This is all the difference in the world, of course, but it’s not a difference that spam filters have any way of seeing. It happens off-camera because there’s no way right now for a spam filter to monitor which listservs you have signed on for, or which e-mails come from those listservs.
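To make the point concrete, here is a minimal sketch of the kind of content heuristic described above. The rules, weights, and addresses are all hypothetical, invented for illustration and not taken from any real filter. It shows an ordinary list message collecting a high score while a personal note collects none.

```python
from email import message_from_string

# Hypothetical trip phrases, chosen only to mirror the examples in the text.
TRIP_WORDS = ("viagra", "remove me from this list", "donate")

def spam_score(raw_message, my_address):
    """Score a message with two naive rules; a higher score looks more like spam."""
    msg = message_from_string(raw_message)
    score = 0
    # Rule 1: the To: header does not mention the recipient's own address.
    if my_address.lower() not in msg.get("To", "").lower():
        score += 2
    # Rule 2: each trip phrase found in the body adds one point.
    body = msg.get_payload().lower()
    score += sum(1 for phrase in TRIP_WORDS if phrase in body)
    return score

list_message = (
    "From: list-robot@lists.example.org\n"
    "To: discussion@lists.example.org\n"
    "Subject: [health-forum] Digest\n"
    "\n"
    "Members discussed Viagra side effects this week.\n"
    "Please donate to keep the list running.\n"
    "To leave, reply with: Remove me from this list\n"
)
personal_message = (
    "From: friend@example.net\n"
    "To: me@example.com\n"
    "Subject: Lunch\n"
    "\n"
    "Lunch tomorrow?\n"
)

print(spam_score(list_message, "me@example.com"))      # the list message trips every rule
print(spam_score(personal_message, "me@example.com"))  # the personal note trips none
```

Nothing in either rule can see whether the recipient opted in, which is exactly the off-camera problem.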

The spam situation is growing intolerable. But spam filters that kill listservs and newsletters are unacceptable. Some of Hotmail’s more over-zealous versions of its junk filter have trashed perfectly legitimate personal e-mails and listserv e-mails, and since the user set up the Junk Mail folder specifically so she doesn’t have to look at it, she ends up not knowing that anyone wrote or that her request to join the listserv has been honored. And since she doesn’t know these messages are coming in, she has no reason to adjust her junk filter, so she never changes it to let the new communications get in. This is the absolute worst way to respond to spam: the whole point of spam management is supposed to be improving the user experience of reading e-mail, not breaking the e-mail system to the point of unusability.

As I said in my previous post, one of the first principles of a good spam solution is that it must be respectful of users’ legitimate e-mail. Spam blocking must never ever interfere with properly receiving legitimate e-mail. If it does somehow block a legitimate e-mail, this fact should be easily detectable, the e-mail should be easily recoverable, and measures should be taken to ensure that it doesn’t happen again. One crucial aspect of respecting legitimate e-mail is that spam management software must be aware of listservs and respectful of e-mail sent over them.

As Outing points out, for usability and responsibility’s sake, system administrators should not take control of content-based spam-screening out of the hands of users. There’s a problem in designing spam solutions that rarely gets discussed: in spam filtering, your system administrator’s interests may not be the same as yours. Filters imposed on everyone by the administrator can have a big effect in reducing the load on mail servers, but individual filters that users apply in their clients don’t reduce the load on mail servers at all. And it’s not the admin’s personal e-mail which gets deleted or bounced in false positives. Therefore, admins tend to be comfortable with very strict spam filter rules even if they block a lot of legitimate e-mail. End-users, on the other hand, don’t care much about system load, and want to receive all the legitimate e-mail that’s sent to them. They want spam filtering that is flexible, fine-grained, and which they can adjust according to their own levels of comfort and trust.

As users, we’re just going to have to be up-front and hold our system administrators accountable on this issue. We’ll take our business elsewhere if we start losing messages because of your spam filtering. There are still plenty of milder rules that system administrators can impose that will block significant amounts of spam–such as rejecting e-mail from open relays, using Vipul’s Razor to bounce messages that can be proven to be spam, or setting up "trap" accounts to identify spam attacks in progress and temporarily block off the source of the deluge. However, content filtering based on heuristics and trigger words should only be done at the client level, not by sysadmins.

All well and good; so what can we do to deal with the situation?

Listserv software needs to get its act together. It’s absolutely maddening to try to put together e-mail filters for listservs, because there is absolutely no standardization of how to indicate that messages are being sent over a listserv. And once you get a filter set up, the list owner may change software or robot addresses along the way, at which point you have to throw out all your work and make a new filter. All listserv software should standardize on a common set of headers to indicate that e-mail has been sent out over a listserv, as well as some information about that listserv (such as a web page with more information, addresses for unsubscribe/subscribe, and so on). Any standard scheme which is expressive enough will do. They should also develop standardized methods for alerting the list robot that you wish to subscribe to or unsubscribe from the list. With standard features in place, it will be much easier for users to write and maintain filters for dealing with listserv traffic. Also, intelligent spam software can now relax its rules when it is examining list traffic, since we know that we’re dealing with a listserv.
Listservs should authenticate themselves. If we do adopt a standard scheme, and that scheme is used to relax spam filtering rules, then you can bet that spammers will immediately start forging listserv headers to make sure that their e-mails get through. The solution to this problem is to return to a principle I mentioned in the previous post: double-key encryption should be used to authenticate genuine messages. Basically, each listserv gets a unique double-key ID. When a user subscribes, she gets the public key as part of the subscription. Whenever a message is sent out over the listserv, the listserv software uses the private key to add its signature to the message as it is resent. Smart mail readers can then allow you to filter mail based on the ID of its resender rather than headers that might be forged.
Listserv operations must be integrated with mail readers. So far everything has pretty much been minor reforms in the already-established interface for listservs and e-mail filtering. However, this is much more of a quantum leap solution. To really improve the usability of listservs and deal with spam lists, e-mail clients themselves should be aware of and work with listservs.

Here’s an example that should be obvious and trivial. The closest thing there is to a standard identifier for listserv resending (Majordomo doesn’t have it, but most other products do) is the List-Unsubscribe: header. This header gives the user a URL — usually just a mailto: link for the list robot — where the user can unsubscribe from the list. It’s really a bit puzzling that no major e-mail software that I’m aware of (drop me a comment if you know of any counter-examples!) creates a quick “Unsubscribe from this List” button when it encounters a message with the List-Unsubscribe: header. This would instantly make things easier on millions of listserv users, and prevent thousands of misdirected Please unsubscribe me! messages from flooding listservs.
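As a rough illustration of how little work the client side would need, here is a small sketch using Python’s standard email module to pull the unsubscribe address out of a message. The function name and the sample addresses are made up for the example; a real mail reader would wire the result to a button instead of printing it.

```python
import re
from email import message_from_string
from typing import List

def unsubscribe_targets(raw_message: str) -> List[str]:
    """Return the URLs listed in a message's List-Unsubscribe header, if any."""
    msg = message_from_string(raw_message)
    header = msg.get("List-Unsubscribe", "")
    # The header carries one or more URLs, each wrapped in angle brackets.
    return re.findall(r"<([^>]+)>", header)

raw = (
    "From: list-robot@lists.example.org\n"
    "To: subscriber@example.com\n"
    "Subject: [demo] Weekly digest\n"
    "List-Unsubscribe: <mailto:demo-unsubscribe@lists.example.org>\n"
    "\n"
    "Body text.\n"
)

# A client could offer an "Unsubscribe from this List" button whenever this is non-empty.
print(unsubscribe_targets(raw))
```

A message without the header simply yields an empty list, so the button would appear only where it can actually do something.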

Let’s dig down a bit more. We suggested a standardized method for subscribing to any listserv, and subscription requests can be mechanically generated if they are given the address of the list that is being subscribed to. Then a smart client could feature a Subscribe… button where users can enter the address of any listserv and the subscription will be automagically handled by the mail reader. Do you see where this is going? That’s right: client integration would solve the off-camera problem once and for all. All subscriptions and unsubscriptions are being handled by automated modules in the client. The reader can track these actions and maintain an accurate list of all the mailing lists the user is on. By using double-key IDs to prevent forging, this list can be made pretty darn near airtight. Now spam filters could separate the wheat from the chaff. They could intelligently relax their rules when they are scanning confirmed listserv e-mails, to avoid false positives, and jack them back up again when scanning non-listserv e-mails, to catch more spam.
Finally, listservs must not become conduits for spam. Some spamhouses are already paying ordinary Internet users to act as mules — they sign up with their own e-mail address on a big listserv, confirm that the address is valid, and then send out a spam message through the listserv. I’ve seen attacks like this inundate more than one e-mail list that I’m on, not to mention the listservs which have been inundated with the Klez worm. Even a perfect scheme for identifying e-mail from listservs won’t matter, if the listservs themselves are resending spam or viruses. Listserv software authors should think about incorporating antivirus screening into their products, to prevent lists from being a dangerous vector of computer infectants. And list administrators need to use whatever options are available in their list software to prevent spam from being sent out. The obvious way is by requiring posts to be approved by a moderator who can screen for spam content. Some software (such as Yahoogroups) has improved on this by allowing listservs to be set so that only a user’s first message is screened by the moderator, and later ones are automatically approved. This way, a spammer will get blocked, since her first and only message is a spam message. However, the list moderator won’t be bothered with having to approve every message from regular list users.
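The first-message screening described above can be sketched in a few lines. This is only an illustration of the idea, not how Yahoogroups or any other product actually implements it; the class and method names are invented for the example.

```python
class FirstPostModeration:
    """Hold each sender's first message for review; pass later ones through."""

    def __init__(self):
        self.trusted = set()  # senders with at least one approved post
        self.held = []        # (sender, body) pairs awaiting the moderator

    def submit(self, sender, body):
        """Return 'sent' for trusted senders and 'held' for everyone else."""
        if sender in self.trusted:
            return "sent"
        self.held.append((sender, body))
        return "held"

    def approve(self, sender):
        """Release a sender's held posts and trust that sender from now on."""
        released = [body for s, body in self.held if s == sender]
        self.held = [(s, body) for s, body in self.held if s != sender]
        self.trusted.add(sender)
        return released

    def reject(self, sender):
        """Drop a sender's held posts without granting any trust."""
        self.held = [(s, body) for s, body in self.held if s != sender]


queue = FirstPostModeration()
queue.submit("regular@example.com", "Hello, list")   # held for the moderator
queue.approve("regular@example.com")                 # released, sender now trusted
queue.submit("regular@example.com", "Second post")   # goes straight out
queue.submit("mule@example.net", "BUY NOW")          # held, then rejected below
queue.reject("mule@example.net")
```

The moderator’s workload scales with the number of new posters rather than the number of posts, which is the whole appeal of the scheme.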

Unfortunately, we can’t much count on listserv or newsletter patrons to come up with these solutions; most aren’t terribly tech-oriented, so we end up with columns like Outing’s, which just focuses on the problem rather than any solution, and where short-term solutions that authors can take are discussed but long-term technological solutions to the underlying problem are not discussed. And we can’t count on your average listserv author to come up with it either. The only people really concerned with ease-of-use in the listserv software world are big corporate entities like Yahoo! and Topica, who make their products easy for end-users but resist big technological jumps. And the only listserv software authors who make those jumps are clever Unix geeks who consider “arcane” to be a compliment when applied to software (take Majordomo–please!).

I love listservs and I pray that spam will not render them useless. I hope that some of us out there who still believe in usability but who don’t have much to lose from big experiments, can start creatively working out and implementing a framework something like this one, in order to prevent listservs from dying a death of the thousand spams.

Article posted 15 July 2002 The Solution to Spam Pollution posted 3am / revised 1pm 29 October 2008

A few things have recently come together for me. First, Andrew Leonard recently penned an interesting column on spam-blocking technology for Salon; then Jennifer Lee wrote another interesting article for The New York Times. Finally, I made use of a brief free trial of McAfee’s SpamKiller software. I’ve also just been doing a lot of thinking lately about what needs to be done to seriously address the rising tide of spam that is flooding most everyone’s inbox. Spam e-mail has been getting worse over the past several years, and it’s been getting worse at an accelerating pace. If we don’t want Internet communications to become simply worthless from being drowned by spam e-mail, then we have to rethink our basic model for e-mail so that spammers can no longer take advantage of the system’s architecture to overwhelm legitimate messages with their crap. Lee’s article shows a good grasp of the problem and why anti-spam legislation won’t do much to solve it. Leonard’s has a good grasp on the overall technological shift needed to address the problem, but he doesn’t push the envelope nearly enough in the kind of framework that needs to be accomplished.

Leonard’s article describes the development of SpamAssassin, an open source spam blocker being adopted and improved by many system administrators. Leonard points out that the collaborative effort between legions of dedicated spam-fighters can greatly improve the ability of the software to identify spam messages. As Leonard puts it, “The only way to stem the flood of unwanted e-mail may be to harness a million eyeballs and an army of open-source hackers.” There’s an intuitive reason why this should be the case. Obviously, by harnessing the efforts of thousands of administrators who ferociously hate spam, the project will get a big boost in productive energy. But that’s not all.

The basic problem is this: under the present e-mail architecture, the spam market works. It works phenomenally well, and especially well for the seedier side of online industries, in particular pornography and sex-related products, which can’t advertise through conventional media (other than other porn outlets) and don’t have any financial interest in maintaining a reputation as a friendly corporate citizen. The reasons are inherent features of the e-mail architecture:

It costs nearly nothing to send spam: once you have an Internet connection set up (which you’ll need for your product’s website, anyway), it costs virtually nothing to send out scads and scads of spam e-mail. Labor costs can be reduced to nil by feeding addresses from a web crawler into an automated spamming program. This is a fundamental reversal from direct mail and telemarketing, where a fixed cost for contacting a person is borne by the advertiser.
Lots of people see it: If you send out a spam message to a huge group of people, then most of the people you send it to will see it. In part, this is because e-mail is a durable medium, like direct mail or fax, and unlike the telephone, so if you send a message while the user is away, they still get it. It’s also due to the relatively primitive state of message sorting and spam filtering–users have very little control over the order and priority with which messages appear in their inboxes, so to get to the messages they want, they generally have to wade through, or at least scan over, any spam that they get.
It’s hard to track offenders. Many comparisons have been drawn between spam e-mail and the junk faxes whose rising costs spurred a federal law against them in 1991. The two are alike in that advertisers get a basically free contact, while victims are stuck with the primary costs (in paper, bandwidth, time, what have you) of the interactions. However, there is a crucial difference: junk faxes can easily be tracked to their perpetrator through phone company records. Offenders can be blocked and identified for legal action. Spam e-mails, on the other hand, are generally very difficult to track to their originators. Headers can easily be forged, server relays can be found and used, one-time-only addresses can be created with free services, and work can be farmed out to mule computer users, who are paid a small amount to send out a huge volume of messages and then take the fall if they get caught. The anonymity of e-mail and its reliance on the honor system for identifying senders makes spam very difficult to flag and filter.

When we look at all these factors, we begin to see that we need a comprehensive solution which will work to address these structural holes. We cannot rely on anti-spam legislation, since spammers will merely relocate to different states or different countries, and use the anonymity of the communication to further shield themselves. Spam is only going to get worse until we have mass deployment of an easy-to-learn, easy-to-use, agile framework which harnesses both human intelligence and high-quality, flexible technological solutions to make legitimate e-mail easier to access and to identify and deal with spam.

Unfortunately, most anti-spam solutions fail, because they are focused narrow-mindedly on a single goal–the goal of accumulating as many heuristic rules as possible to identify and kill spam (this is reflected in the names–McAfee’s SpamKiller, SpamAssassin, and so on). The most common and most maddening manifestation of this is scorched-earth spam programs such as SpamKiller, which works entirely by accumulating thousands and thousands of rules to try to identify common patterns in the way that spam messages are written or addressed. These do indeed catch a lot of spam, but they also slam perfectly legitimate e-mail. For example, my decision to uninstall SpamKiller was finalized when I saw it was trashing legitimate e-mails because a filter (one of thousands, which took lots of scrolling to find) was killing messages because they contained the word rape. Now, look, folks, I’m pretty much physically nauseated by some of the spam ads I’ve received for rape-fetish pornography sites. But I’m an anti-rape activist, and I receive tons of perfectly legitimate e-mail with the word rape in it. SpamKiller’s approach to spam is like trying to kill a swarm of mosquitoes with a cluster bomb, and plenty of perfectly innocent messages were getting clobbered.

The problem here is that most people who work on spam-blocking software and most of those who purchase it are basically in the frame of mind of trying to get rid of a source of long-term and maddening irritation. Programs tend to be reactively focused on axing spam by any means necessary, rather than proactively focused on improving the e-mail user’s experience. But if we keep our mind on what users need and want, rather than what gives us the temporary satisfaction of the kill, then we should begin to see a bit more clearly what needs to be done.

To reduce the effectiveness of spam, first spam management software needs to be widespread, usable, and respectful of users’ legitimate e-mail. With millions of users employing software that lets them take control of their own inboxes, users will be able to stay on top of their legitimate e-mail and sidestep the spam. Information for identifying spam should come from automated reports that millions of users submit: when a spam slips through, the recipient presses one button in the mail client and it is registered as a spam message so that no-one else receives it (SpamAssassin uses Vipul’s Razor, a system which does just this, but it needs to be integrated into easy-to-use clients, not just arcane Unix mail filters).
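The one-button reporting idea boils down to a shared catalogue of message fingerprints. The sketch below is a toy model of that idea only. It is not Vipul’s Razor’s actual protocol or hashing scheme, and the class name, normalization step, and in-memory set are assumptions made for illustration.

```python
import hashlib

class SharedSpamRegistry:
    """Toy shared catalogue of fingerprints for messages users reported as spam."""

    def __init__(self):
        self.reported = set()

    @staticmethod
    def fingerprint(body):
        # Collapse case and whitespace so trivially altered copies still match.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, body):
        """Called when a user presses the 'this is spam' button."""
        self.reported.add(self.fingerprint(body))

    def is_known_spam(self, body):
        """Called by every other client before delivering a message."""
        return self.fingerprint(body) in self.reported


registry = SharedSpamRegistry()
registry.report("Make MONEY fast!!!   Click here")
print(registry.is_known_spam("make money fast!!! click here"))  # same text, different spacing and case
print(registry.is_known_spam("Minutes from Tuesday's meeting"))
```

An exact-match fingerprint like this is easy for a spammer to dodge by varying each copy, so a real system would need something fuzzier; the sketch only shows the division of labor between the user who reports and the users who benefit.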

Second, we need to plug the anonymity hole through use of double-key authentication and encryption of e-mail. E-mail clients could prioritize messages which can be verified as coming from a valid address, and also messages which are encrypted for the recipient’s eyes only. Spammers who want their messages seen would have to separately acquire a public key for, and encrypt the message for, every intended recipient. For millions of e-mail addresses, that’s an awful lot of extra processor time, network bandwidth, and human labor that the spammer has to pay for. Furthermore, the spammer’s PGP signature or signatures can be blacklisted as quickly as the spams start going out.
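For readers who haven’t seen double-key signing in action, here is a deliberately tiny sketch of the sign-and-verify half of the idea. It is textbook RSA with a toy key built from two small, well-known Mersenne primes, not PGP and nowhere near secure; every name in it is invented for the example, and it needs Python 3.8 or later for the modular inverse.

```python
import hashlib

# Toy key pair. These primes are far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (modular inverse of E)

def _digest(message):
    """Hash the message and reduce it into the range of the toy modulus."""
    raw = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % N

def sign(message):
    """Sender side: only the holder of the private exponent D can produce this."""
    return pow(_digest(message), D, N)

def verify(message, signature):
    """Recipient side: anyone holding the public pair (N, E) can check it."""
    return pow(signature, E, N) == _digest(message)


note = "Meeting moved to Thursday."
signature = sign(note)
print(verify(note, signature))                      # genuine message checks out
print(verify("Buy cheap pills now!", signature))    # altered text does not
```

A forged From: header costs a spammer nothing, but a signature that verifies against a known public key can’t be produced without the matching private key, and that is what lets a client sort mail by who actually sent it.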

Finally, system administrators at big ISPs need to get responsible. One of the biggest conduits for spam is open relays: poorly configured mail servers which allow anyone on the Internet to send e-mail through the server by forging headers to pose as a machine on the server’s network. System administrators need to get serious about ensuring that connections are only accepted from authenticated users or legitimate machines on the ISP’s own subnet. And when spam is being sent by a user, they need to be quick about axing that user’s account.
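The relay rule itself is short enough to write down. This sketch is only the decision logic, not a configuration for any real mail server; the subnet is a reserved documentation range standing in for an ISP’s own network, and the function name is made up.

```python
import ipaddress

# Hypothetical ISP subnet (a reserved documentation range, used as a placeholder).
LOCAL_NETWORK = ipaddress.ip_network("192.0.2.0/24")

def may_relay(client_ip, authenticated):
    """Allow relaying only for authenticated users or machines on the ISP's own subnet."""
    if authenticated:
        return True
    # Judge by the connection's actual source address, never by claimed headers.
    return ipaddress.ip_address(client_ip) in LOCAL_NETWORK


print(may_relay("192.0.2.17", authenticated=False))   # machine on the local subnet
print(may_relay("203.0.113.5", authenticated=False))  # anonymous outsider
print(may_relay("203.0.113.5", authenticated=True))   # outsider who logged in
```

An open relay is, in effect, a server where this check always answers yes.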

What you can do now:

You can do some things now, both short-term and long-term, to keep yourself from being overwhelmed and work towards an Internet not being drowned in spam.

Use shield accounts for online commerce. A lot of high-end spamhouses harvest addresses by buying them from merchants such as Amazon.com. For online interactions which won’t be anything other than perfunctory receipts, it’s good to maintain a shield account (say, diespammersdie@hotmail.com or somesuch) as the address through which you interact with online stores.
Download and use PGP. You can get PGP, a great security program which will let you securely sign messages (so that the recipient can verify your identity) and/or encrypt messages (so that only the recipient can read them). The Windows version of PGP automates the process of creating and using PGP keys, and has plugins for popular Windows e-mail clients which let you use simple pushbuttons for its functions. PGP will make your e-mail more secure, and also help build an Internet environment where spammers can no longer hide behind forged headers to conceal their identities.
Look for solid anti-spam software that suits you. If you can find spam management software which suits your needs, grab it! If you’re willing to geek around a lot, SpamAssassin looks very good. Better yet, Deersoft is in the process of developing SpamAssassin Pro, a commercial product for Windows based on the SpamAssassin engine and integrated with your mail client. Unfortunately, most spam management software I’ve tried (e.g., SpamKiller) is crap.
More tips: Jennifer Lee’s article is accompanied by some tips for avoiding spam, some of which I agree with, and others of which I don’t. Unfortunately, the present spam-heavy environment is encouraging a lot of people to take up measures which cut down spam at the expense of breaking human usability of the e-mail system. Lee suggests using complex e-mail addresses, which do thwart spammers who use dictionary searches on mail services, but which also make it hard for your friends to remember your e-mail address. She also suggests removing your e-mail from any online directories in which it may be included, which will again thwart spammers but also keep people from being able to reach you. I totally disagree with this method of spam filtering. Again, it amounts to protecting your inbox at the cost of shredding real people’s ability to contact you. Nevertheless, some of her suggestions (such as disposable forwarding accounts for use on Usenet and bulletin boards) are solid.