Mmm, Spam and Word Salad, Just Like Mom Used to Make
There has been an influx this past week of spam emails that contain nothing but random words. One to three of them appear in the subject line, and around three to six in the body. These words are various verbs such as “mugging” or “denote”, people’s first names, objects, proper nouns, regular nouns, you name it. This type of spam, a grouping of words that each make sense on their own but certainly not together, has come to be known as “word salad”.
This certainly isn’t the first time this has happened; it’s been going on since the invention of Bayesian spam filtering back in the late 1990s. People have stuck to the notion that this technique is used to break, or poison, this type of filtering. Bayesian filtering is a statistical approach to filtering spam from your inbox. Essentially, it looks at the individual words in an email and assigns each word a probability value based on how likely that word is to appear in a spam message. After every word is assigned a value, in most cases the filter will take the words that score furthest from neutral, in either direction, and average them together to produce the resulting score.
These filters do need to be trained, however, and this is where word salad comes into play. The theory is that the spammers are sending these emails in an attempt to render the filters useless by confusing their data. The spammers will send random words, or sometimes specific words in anticipation of an event such as a new movie coming out or an election, in hopes that their obvious spam will cause common words to score as spam in future valid emails. Those valid emails then start getting quarantined as spam, creating a slew of false positives. The resulting frustration causes the end user to turn off Bayesian filtering, clearing the way for the spammers’ real payload that follows. That’s the theory anyway.
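To make the poisoning idea concrete, here is a minimal Python sketch of the kind of filter described above. It is a toy under stated assumptions, not how any particular product works: the function names, the add-one smoothing, the `top_n` cutoff, and the sample messages are all made up for illustration.

```python
from collections import Counter


def train(messages):
    """Count how many spam and ham messages each word appears in.

    `messages` is an iterable of (text, is_spam) pairs.
    """
    spam_counts, ham_counts = Counter(), Counter()
    n_spam = n_ham = 0
    for text, is_spam in messages:
        words = set(text.lower().split())
        if is_spam:
            spam_counts.update(words)
            n_spam += 1
        else:
            ham_counts.update(words)
            n_ham += 1
    return spam_counts, ham_counts, n_spam, n_ham


def word_probability(word, model):
    """Estimate how spammy a word is; 0.5 is neutral."""
    spam_counts, ham_counts, n_spam, n_ham = model
    if word not in spam_counts and word not in ham_counts:
        return 0.5  # never seen in training: treat as neutral
    # Add-one smoothing (an assumption) avoids hard 0.0 and 1.0 scores.
    spam_rate = (spam_counts[word] + 1) / (n_spam + 2)
    ham_rate = (ham_counts[word] + 1) / (n_ham + 2)
    return spam_rate / (spam_rate + ham_rate)


def score(text, model, top_n=5):
    """Average the top_n word scores that sit furthest from neutral."""
    probs = [word_probability(w, model) for w in set(text.lower().split())]
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:top_n]
    return sum(top) / len(top) if top else 0.5


# Hypothetical training data: two spam and two valid messages.
clean = [
    ("cheap pills online", True),
    ("cheap watches online", True),
    ("meeting report attached", False),
    ("lunch meeting tomorrow", False),
]
# Word salad that the user (or an automatic trainer) marks as spam.
poison = [
    ("meeting report denote", True),
    ("meeting tomorrow mugging", True),
]

legit = "meeting report tomorrow"
before = score(legit, train(clean))
after = score(legit, train(clean + poison))
```

With this toy data, `after` comes out higher than `before`: once the salad has been learned as spam, the ordinary words it borrowed drag a perfectly valid message toward the spam end of the scale.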
Another theory around this current campaign is based on reputation filtering. It (the theory) says that these are being sent to bypass spam filters by appearing as benign emails. That way the sending IP or source gains a positive reputation: because it has sent valid mail in the past, future mail from that source is more likely to be treated as good mail as well. This is supposed to increase the likelihood that a spammer’s future message will make it through. This theory has a lot of holes. The first is that anyone who relies solely on a reputation filter, with no other layer of filtering, will have a lot more issues than this. Another flaw is that all of these are botnet delivered, mostly through home computers using their local ISP’s access. These botnets range in size, but this one is easily in the tens of thousands, high thousands at least, and the chance of any one of these bots randomly hitting the same target twice is too small to make the effort worthwhile.
Here’s my theory, are ya ready? Directory Harvest Attack, yep, I said it! Conspiracy theorists are aghast, I’m sure, but here’s my hypothesis, and the facts behind it. First of all, I’ll have to agree with part of the reputation filter theory in that these emails are short, sweet and will easily bypass most filters initially. However, the real proof lies in who the intended recipients are. Each email is addressed to about five recipients, all at the same domain. That’s certainly a sign, and another sign is that in most cases only one of the intended recipients is actually a valid user at that particular domain. A Directory Harvest Attack is designed for the sole purpose of collecting valid email addresses. A spammer will blast out short or sometimes even blank emails to randomized but probable email addresses. When one sticks, they keep it as valid; when one doesn’t, the invalid address is omitted from future attempts. The valid email addresses that are collected can be sold to “marketers” or used by the spammer in future campaigns.
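The recipient pattern described above (a handful of recipients, one domain, mostly invalid users) is easy to turn into a rough check on the receiving side. The Python sketch below is only an illustration: the function name, the thresholds, and the example addresses are assumptions, not tuned values from any real filter.

```python
def looks_like_harvest_probe(recipients, valid_users,
                             min_recipients=3, max_valid_ratio=0.5):
    """Flag a message whose recipients share one domain and are mostly invalid.

    `recipients` is the list of addresses on the message and `valid_users`
    is a set of lowercase addresses that really exist. Both thresholds are
    illustrative guesses.
    """
    addresses = [r.strip().lower() for r in recipients]
    if len(addresses) < min_recipients:
        return False
    # Everything after the last "@" is the domain part.
    domains = {a.rpartition("@")[2] for a in addresses}
    if len(domains) != 1:
        return False
    valid = sum(1 for a in addresses if a in valid_users)
    return valid / len(addresses) <= max_valid_ratio


# Hypothetical directory and a probe shaped like the ones in this campaign.
valid_users = {"alice@example.com", "bob@example.com"}
probe = [
    "alice@example.com",
    "alicia@example.com",
    "alison@example.com",
    "allen@example.com",
    "alvin@example.com",
]
normal = ["alice@example.com", "bob@example.com", "carol@example.org"]
```

Here `looks_like_harvest_probe(probe, valid_users)` is true (five recipients, one domain, one valid user), while the `normal` list is not flagged because its recipients span two domains.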
There you have it, a long-winded explanation for a very short email. And to clarify, or fess up perhaps, my mom always fried spam, and she never really served it with salad.

2 comments:
This may be a simpleton question, but in your theory, how does the spammer know whether a given address is valid or not? If the filtering system is set up to not generate bounces for invalid addresses, and the message contains no images or links, it seems the spammer would get nothing returned to them.
That's a good question. Botnets, in addition to other malicious software, install their own SMTP engines, or mailers, to send spam from. These engines can monitor the connections as they're made and log the responses from the target mail servers. Afterwards, when the bots connect to their command and control servers for further instructions, they can transfer these logs back to their bot herder.
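As a rough illustration of what those logged responses give away: in SMTP, the server answers each RCPT TO command with a three-digit reply code, where 2xx means accepted, 4xx is a temporary failure, and 5xx is a permanent rejection (550 is the usual "no such mailbox"). The Python sketch below sorts reply lines into those buckets; the function name and the sample reply text are made up for the example.

```python
def rcpt_outcome(reply_line):
    """Classify an SMTP reply to RCPT TO by the first digit of its code."""
    code = reply_line.strip()[:3]
    if len(code) < 3 or not code.isdigit():
        return "unknown"
    if code[0] == "2":
        return "accepted"   # e.g. 250: recipient OK
    if code[0] == "4":
        return "deferred"   # temporary failure, says little either way
    if code[0] == "5":
        return "rejected"   # e.g. 550: mailbox unavailable
    return "unknown"


# Hypothetical replies as they might appear in a connection log.
replies = {
    "alice@example.com": "250 2.1.5 Ok",
    "alicia@example.com": "550 5.1.1 User unknown",
    "alison@example.com": "451 4.3.0 Try again later",
}
outcomes = {addr: rcpt_outcome(line) for addr, line in replies.items()}
```

A server that rejects unknown recipients during the SMTP conversation answers the validity question right there, with no bounce message needed.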