Man, "I" seem to be sending a lot of spam these days.
Since I wrote about what happens when someone looses a volley of spam with your e-mail address in the "From:" field, there've been a few other spam-runs that've resulted in smaller backscatter storms pitter-pattering into my e-mail account. From which I then, of course, deleted them without downloading, after scanning the headers with good old MailWasher.
Yesterday, though, I got this:
As soon as I'd finished scanning headers and deleting a couple of hundred messages, there were another couple of hundred waiting. It's slackened off, now; the total for this run may end up at 5000 bounces.
As usual, the bounces came from umpteen small and medium businesses, US middle schools, mailing list servers (I don't think I've been subscribed to or unsubscribed from anything, this time)... you name it.
Perhaps I should have just picked half a dozen at random and sent them form letters telling them about the problem. Maybe the administrator addresses for one or two wouldn't even give me yet more bounces.
If you're looking for a standalone header-scan Bayesian-spam-identifying whitelist-plus-blacklist sort of app for Windows, I think MailWasher continues to be a good option. It's been updated considerably since my ancient review of it.
Note, however, that the last MailWasher update was quite a while ago, so the program (well, the full "Pro" version of it, anyway; I don't know about the free-as-in-beer basic version) still defaults to using the Open Relay Database (ORDB) service to identify spam sources.
ORDB has been defunct for a long time, now, and earlier this year the minimal server still running at the ORDB address started loudly announcing the service's discontinuation by returning a "positive" response for every single query.
That means that MailWasher, with ORDB activated, will say that every single message it looks at is spam, according to ORDB. I think it actually won't default to marking all messages for deletion, but this obviously still completely breaks MailWasher's basic functionality.
Easy to fix, though: just uncheck the ORDB option in the "Origin of spam" config tab and you'll be fine.
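For anyone wondering why a blocklist that answers "yes" to everything flags every message, here's a rough sketch of how a DNS blocklist lookup generally works: the client reverses the octets of the sending server's IP address, appends the blocklist's zone, and treats any successful DNS answer as "listed". This is an illustration only, not MailWasher's actual code; the function names and the zone `dnsbl.example.org` are made up for the example.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a client looks up to ask a blocklist about an IPv4 address."""
    # Validate the address, then reverse its octets: 192.0.2.99 -> 99.2.0.192
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the blocklist zone returns any address for this IP.

    `resolve` is injectable so the logic can be exercised without a network;
    by default it performs a real DNS lookup.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:
        # A failed lookup (name does not exist) means "not listed".
        return False
    return True
```

A zone that resolves every name, which is what the leftover ORDB server started doing, makes `is_listed` return True for every address, so every message looks like it came from a listed relay.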
MailWasher also defaults to adding the apparent sender address for every message identified as spam to its blacklist, which seems to me to be just as dumb, if not as annoying to others, as sending bounce messages to those addresses (which is another feature you can turn on in MailWasher - for the love of all that is Holy, please don't). Uncheck the "Mark the sender of the email to be blacklisted" options in the "Origin of spam" and "Learning" setup tabs, and it won't do that any more.
Feel free to suggest, in the comments, any other standalone header-scan mail-filter programs you think I should check out. I'm aware of the spam filters built into various modern e-mail clients, but I'm still using a version of Eudora carved from primordial basalt and so don't need any of those.
Any filter that requires you to download all of the spam, rather than just scan the headers, is also Right Out. Even when I'm not in the middle of a backscatter snowstorm.
25 July 2008 at 9:55 pm
What about SpamAssassin?
25 July 2008 at 11:36 pm
Doesn't SpamAssassin have to download all of the mail before it can scan it, though? That's fine if it's running on the server, but it's not good for client-side filtering.
(MailWasher isn't a pure headers-only scanner either; headers alone do not, of course, provide enough information to recognise a lot of spam. But you can tell MailWasher how many lines of each message to check; it does an excellent job without having to download anything like the full volume of the larger spams.)
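The "headers plus a few lines" trick is possible because POP3 has a TOP command, which returns a message's headers and only the first N lines of its body. Here's a minimal sketch of the idea using the `top()` and `dele()` methods of Python's standard `poplib` connection objects. I don't know how MailWasher implements this internally; `peek_message`, `delete_if` and the `is_spam` callback are hypothetical names for illustration.

```python
from email.parser import BytesHeaderParser


def peek_message(conn, msg_num: int, body_lines: int = 20):
    """Fetch the headers and the first `body_lines` body lines of one message.

    `conn` is expected to behave like a poplib.POP3 / poplib.POP3_SSL object.
    """
    _resp, lines, _octets = conn.top(msg_num, body_lines)
    raw = b"\r\n".join(lines)
    # Headers and body are separated by the first blank line.
    head, _sep, preview = raw.partition(b"\r\n\r\n")
    headers = BytesHeaderParser().parsebytes(head + b"\r\n\r\n")
    return headers, preview.decode("utf-8", errors="replace")


def delete_if(conn, msg_num: int, is_spam, body_lines: int = 20) -> bool:
    """Mark the message for deletion on the server if `is_spam` says so."""
    headers, preview = peek_message(conn, msg_num, body_lines)
    if is_spam(headers, preview):
        conn.dele(msg_num)  # the full message is never downloaded
        return True
    return False
```

With a real account, `conn` would be something like `poplib.POP3_SSL(host)` after `user()` and `pass_()`, and the deletions take effect when you call `quit()`.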
26 July 2008 at 4:23 am
I use an Astaro (free home version) gateway I made out of an old machine I had lying around. Among the other things it does is email filtering and automatic virus scanning. It can be a bit tough to configure, but once you get the hang of how they do things, it's not that bad. I've been very satisfied with it.
26 July 2008 at 8:19 am
MWP for me. Once I got my filters sufficiently tuned the amount of spam I actually see became tiny. I get maybe one a day now.
My favorite filters search for obfuscated words. One will delete mails with "m-o-r-t-g-a-g-e", but not "mortgage". Other useful ones filter out various spellings of "viagra" and other drugs. Even if the spam spells it "\/í@§rª" it'll be nuked from orbit. Well, nuked from the atmosphere I guess since it still does make its way to my PC.
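For the curious, a filter that catches "m-o-r-t-g-a-g-e" but leaves "mortgage" alone can be sketched with a regular expression that allows punctuation or spaces between the letters and then ignores matches that turn out to be the plain word. This is only my guess at the general approach, not the actual MailWasher filter syntax, and the function names are invented for the example.

```python
import re

# Anything that is not a letter or digit (underscore included) counts as filler.
SEPARATOR = r"[\W_]"


def obfuscated_pattern(word: str) -> "re.Pattern[str]":
    """Regex matching `word` with optional filler characters between its letters."""
    gap = SEPARATOR + "*"
    return re.compile(gap.join(re.escape(ch) for ch in word), re.IGNORECASE)


def contains_obfuscated(text: str, word: str) -> bool:
    """True if `text` contains a spaced-out or punctuated spelling of `word`.

    The plain spelling is deliberately not counted, so legitimate mail that
    mentions the word normally is left alone.
    """
    for match in obfuscated_pattern(word).finditer(text):
        if match.group(0).lower() != word.lower():
            return True
    return False
```

This only handles inserted separators. Character substitutions like the "viagra" spelling above would need an extra table mapping each letter to its look-alikes.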
27 July 2008 at 8:18 am
I do something similar to Dan, but the other way around. My filter of choice is POPFile. It can only classify once the message has been downloaded, but I've got my email client set to only retrieve headers + a selected number of the first lines of the body. If a message is marked as spam then I simply delete it from the server through the email client, and fully download the rest of the 'good' emails.
POPFile's got a neat feature where it can write a special URL into the headers that points directly to the message in the corpus. Click on it and you're taken to the POPFile web interface with that email displayed. Very handy for easy reclassification.
27 July 2008 at 7:40 pm
+1 for Popfile. It's also available for Outlook with Outclass (sorta - it uses your Popfile data).
27 July 2008 at 7:42 pm
Oh and Popfile/Outclass can filter all kinds of mail, not just spam - I find it invaluable for filtering press releases and other guff into folders without having to create an Outlook rule for each sender/subject/whatever - I can just filter on the Popfile-amended header.
27 July 2008 at 10:04 pm
Another vote for Popfile - I've always been very happy with it. Like Adlopa above, I also use its categorisation to separate personal mail, sent by actual people, from the various other mails that I get. That being said, I don't get enormous amounts of real mail, so any spam filter could get 99.9% accuracy by just deleting everything...
28 July 2008 at 4:00 am
I use popfile as well, but with the IMAP component. The nice thing about that is you can reclassify your mail just based on the folder you move it to, no need to go to the web interface. Only works with a single account though, and of course you need an email server that provides IMAP access.
29 July 2008 at 6:57 am
Am I the only one who just uses GMail's built-in filter? I forward all my other accounts there, so it gets scrubbed on the way in. Lately I've been getting a lot of false negatives on Canadian Pharmacy stuff (all of which I "Report", and all of which looks almost identical, and all of which seems to keep missing the damn filter), but it catches the other ~95% of my spam correctly and gets less than one false positive per month, I'm pretty sure. And I don't have to lift a finger.
29 July 2008 at 2:15 pm
I was once on a host that implemented Greylisting. My experience fit their claims. It stopped about 97% of the spam. I only ever discovered one example of a false positive.
Greylisting doesn't apply to backscatter. But it sure removed a lot of spam hassle. And my understanding is that it's reasonably light on server resources.
2 August 2008 at 6:37 pm
What about Gmail / Google Apps?
That's what I use, and I get about 20-40 spam messages per day.
I don't know how many false positives, since I don't check the spam folder often enough. So I may be missing some good mail.