Sunday, July 22, 2007

The noise, we ignore it

If anybody expected that my earlier posts about harvesting addresses into a traplist, publishing it and observing the results would lead to earth-shattering discoveries or headlines in major publications worldwide, I can tell you now: they did not.

This could mean that spam is now boring and ignorable, and in fact the data I've accumulated indicate that when it comes to spam and spammers, it all falls into a category of noise we are more than happy to just ignore.

In the semi-random samplings from the noise the spammers generate, there are some interesting observations. For one thing, one or more spammer operations have picked my handful of domains as purported return addresses for their messages, and they have been doing this since at least some time in June, possibly longer. Judging from the addresses in the backscatter, there are likely two or three groups actively generating or making up addresses, with distinct methods.

One method is to pick a domain and a word and feed them to a program, which then generates a pair of addresses. The spammer picks the keyword flaunting and the robot spits out flauntingn6@datadok.no and 6NGNITNUALF0@datadok.no.

The second method is to just pick a word and stick the at sign and the domain on afterwards, such as between@datadok.no.

The third one, which appears in several variations, is to generate or make up what could at a stretch look like aliases based on people's first and last names, such as DrueNikonov@bsdly.net and lupu.kovjd@amidala.datadok.no (never mind that amidala, the now-retired laptop, never ran its own mail service), or just the first name, such as Runar623@mail.datadok.no.

Another variation, which had me a bit puzzled, is probably designed to look as if our domain is testing for mail deliverability, such as mail.matrix.farlep.net-1184227303-testing@datadok.no.

And finally, of course, there are the bottom feeders who try to use message-IDs and other junk extracted from news spools or the local Microsoft Outlook user's mailbox. Any of the addresses with fsf@ in them and a few others clearly fall into that category, and y7jvlozt.fsf@thingy.datadok.no is a likely indicator that somebody, somewhere has news or mailing list mail which originated at my current laptop stored where spamware can find it.
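For anyone who wants to sort their own backscatter into the rough categories described above, here is a minimal Python sketch. The regular expressions and category labels are my own guesses, built only from the handful of sample addresses quoted in this post, so expect misclassifications on real data. The pair check rests on a single observation: in the flaunting example, the second address is the first one reversed and upper-cased with one extra character appended. Whether the spamware always works that way is an assumption.

```python
import re

# Ordered (label, pattern) pairs; the first pattern that matches the whole
# local part wins. Labels and patterns are illustrative guesses derived from
# the sample addresses in this post, not a description of any real spamware.
PATTERNS = [
    ("fake deliverability test", re.compile(r".+-\d{9,10}-testing")),
    ("harvested message-ID", re.compile(r"[a-z0-9]+\.fsf")),
    ("name-like alias",
     re.compile(r"[A-Z][a-z]+[A-Z][a-z]+|[a-z]+\.[a-z]+|[A-Z][a-z]+\d+")),
    ("reversed keyword", re.compile(r"\d[A-Z]+\d")),
    ("keyword with suffix", re.compile(r"[a-z]+\d")),
    ("plain word", re.compile(r"[a-z]+")),
]


def local_part(address):
    """Return everything before the last '@' in an address."""
    return address.rsplit("@", 1)[0]


def classify(address):
    """Return a rough category label for a made-up sender address."""
    local = local_part(address)
    for label, pattern in PATTERNS:
        if pattern.fullmatch(local):
            return label
    return "unclassified"


def looks_like_generated_pair(first, second):
    """Check whether 'second' is 'first' reversed and upper-cased,
    plus one trailing character (assumption based on one example)."""
    return local_part(second)[:-1] == local_part(first)[::-1].upper()


if __name__ == "__main__":
    samples = [
        "flauntingn6@datadok.no",
        "6NGNITNUALF0@datadok.no",
        "between@datadok.no",
        "DrueNikonov@bsdly.net",
        "y7jvlozt.fsf@thingy.datadok.no",
    ]
    for address in samples:
        print(address, "->", classify(address))
```

The order of the patterns matters: the more specific shapes (the testing lookalike and the .fsf message-IDs) are tried before the looser name and word patterns, which would otherwise swallow some of them.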

Spammers have been working very hard lately. On July 17th, Bob Beck's traplist (which is generated by greytrapping at the University of Alberta and which Bob makes available to anyone who wants it - see the PF tutorial for details) reached what I believe is its all-time high, just a few short of a hundred thousand addresses. The actual number was 99941 at 20:00 CEST (that's 8PM in Imperial measure); it's been dropping off since then.

My more or less purely backscatter-based local traplist grew to roughly 500 addresses during the same period, and the number of hosts actually trapped here, as far as I can tell, never went much over 400 at any time.

One interesting factlet that came up during the week is that Google, by the looks of it, is using SPF correctly. One BLUG member reported that one of my messages about the traplist to the BLUG mailing list had been tagged by Google Mail as possible spam. Exactly what triggered the classification was never revealed to me (he had already deleted the message when I asked him to take a look). But it made me go back and check the SPF records for our domains, and sure enough, they were overly permissive. Editing them took a few moments, and the test messages sent only a couple of minutes later went through with headers indicating that Google does check SPF. Nice, GoogleMail! Next, might we have a chat about playing nice with greylisting?
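What exactly made the records too permissive is not important here, but one common culprit is the qualifier on the trailing all mechanism: +all (or a bare all) tells receivers that any host may send for the domain, and ?all says the domain has no opinion. As a rough illustration only, here is a Python sketch that flags those two cases in an SPF record string. The function names and the example records (example-style mechanisms with a documentation address range) are made up for this sketch, and it ignores include and redirect, which a real checker would have to follow.

```python
def spf_all_qualifier(record):
    """Return the qualifier ('+', '-', '~' or '?') on the 'all' mechanism
    of an SPF version 1 record, or None if there is no 'all' mechanism."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    for term in terms[1:]:
        t = term.lower()
        if t == "all":
            # A mechanism without a qualifier defaults to '+' (pass).
            return "+"
        if len(t) == 4 and t[0] in "+-~?" and t[1:] == "all":
            return t[0]
    return None


def is_overly_permissive(record):
    """Flag records whose 'all' mechanism passes or is neutral.
    Records without 'all' are not flagged by this simplified check."""
    return spf_all_qualifier(record) in ("+", "?")


if __name__ == "__main__":
    # Hypothetical records, not the ones actually published for my domains.
    for record in ("v=spf1 mx ?all", "v=spf1 mx ip4:192.0.2.0/24 -all"):
        verdict = "too permissive" if is_overly_permissive(record) else "looks strict enough"
        print(record, "->", verdict)
```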

The conclusions come rather naturally: Spam in general gets ignored or filtered correctly by the vast majority of Internet domains, the free tools are extremely effective, and even when the spammers go all-out in their efforts and possibly are even trying to actively inconvenience somebody, they're not getting much traction at all.

The main technical problem remaining is the vast number of unmaintained machines out there which bend over obediently to let the spamware install itself and be run remotely. On the server side there are sites which still do not play nice with greylisting, but they will see sense eventually, I hope.

We just ignore the noise, except for a few of us who see patterns in the noise and a few hopeless cases which, it seems, will do their traplist time indefinitely.

That chapter is a bitch to write, but getting there.
