Thesaurus Spam 2: The Comment Years

"Thesaurus spam" tries to avoid automated unsolicited-commercial-message detection by automatically replacing words in the spam text with "synonyms". I put scare-quotes are around "synonyms" because thesaurus spam often fails to pick anything even close to a true synonym. So "we will fight them on the beaches" could, for instance, become "ourselves will affray them on the littoral".

I hardly receive any thesaurus-spam via e-mail any more (largely because of upstream filtering; it's probably still quite popular), but I do still see it. Most recently, in comments on this blog.

What happens is, a spammer comes along and creates a commenting account with a "Website" link to whatever site they want to spamvertise. Today, this was a commenter called "batterysea", linking to (All evidence of this commenter has now been erased, of course.)

Then the commenter goes into robospam mode. Instead of posting the usual robospam comments that say something like "Louis Vuitton Prada best replica fakes Rolex Viagra" et cetera et cetera, with links to a Web site from pretty much every word, they create an innocuous, linkless, plain-text comment. At a glance, the new spam-comment kind of looks as if it belongs on the page. That's because it does kind of belong there, on account of being a copy of an earlier comment on the same page, but with the Thesaurus-O-Matic run over it to make the copying less obvious (and difficult, if not impossible, to auto-detect).

I've plucked a few of these ticks off the blog before, but this one managed to splatter a few more comments around before I stopped him, so I paid more attention. I presume these spammers try to strike a balance between getting a commercially useful amount of spam transmitted and producing so many new comments that even a dozy admin is likely to notice. In the "batterysea" case, there were nine comments, posted at one-minute intervals on my nine most recent posts.

On this post, for instance, there's a legitimate comment from Anne that says

Clearly I am culturally deprived - I don't read magazines, I don't watch TV, and I surf the web with adblock. So where would I see these ads?

Maybe a better question is, do these ads actually sell products? I mean, if I'm trying to decide on which fan to buy for my PC, is seeing an ad in a magazine actually going to affect my decision, whether the ad has giant robots or sober statistics?

And then, at the end of the page, along came the spammer to say

Clearly I am culturally beggared - I don't apprehend magazines, I don't watch TV, and I cream the web with adblock. So area would I see these ads?

Maybe a more good catechism is, do these ads absolutely advertise products? I mean, if I'm aggravating to adjudge on which fan to shop for for my PC, is seeing an ad in a annual absolutely activity to affect my decision, whether the ad has behemothic robots or abstaining statistics?

On this post, the spammer lifted just the second paragraph of my own comment, which started out

It's possible that such a scheme would actually be legit, but it's probable that it would not, because people sending money would have the implicit assumption that they were going to get something in return, even if it was as unlikely to be valuable as a lottery ticket.

That part became

It's accessible that such a arrangement would absolutely be legit, but it's apparent that it would not, because bodies sending money would accept the absolute acceptance that they were activity to get article in return, alike if it was as absurd to be admired as a action ticket.

When the robospammer can't find any words to thesaurusise, it ends up just duplicating an existing comment. For instance, Fallingwater's comment on this post:

The Asus EeePC 1005HA is, I think, the device that loses its rubber feet fastest than anything else that has been produced.

My solution: melt glue. Four puddles where the feet used to be have made my EeePC stick to surfaces again. Less than when it had the rubber feet, but a hell of a lot better than naked plastic.

...was duplicated word-for-word by the spammer.
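The samples above do hint at one rough way to catch this stuff. The thesaurus pass only swaps some of the words and never touches the word order, so the little words ("I", "don't", "the", "with") stay exactly where they were. Here's a minimal Python sketch of that idea, using the standard library's difflib to line up two comments word by word. The function names and the 0.6 threshold are my own inventions for illustration, not anything this blog's software actually does, and a legitimate commenter who quotes an earlier comment at length would trip it too.

```python
import re
from difflib import SequenceMatcher

# Sample pair taken from the comments quoted above.
ORIGINAL = ("Clearly I am culturally deprived - I don't read magazines, "
            "I don't watch TV, and I surf the web with adblock. "
            "So where would I see these ads?")
SPAM = ("Clearly I am culturally beggared - I don't apprehend magazines, "
        "I don't watch TV, and I cream the web with adblock. "
        "So area would I see these ads?")


def tokens(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def similarity(a, b):
    """Score from 0.0 to 1.0 for how well two comments align word by word."""
    matcher = SequenceMatcher(None, tokens(a), tokens(b), autojunk=False)
    return matcher.ratio()


def looks_recycled(new_comment, earlier_comments, threshold=0.6):
    """True if new_comment closely tracks any earlier comment on the page.

    The threshold is an arbitrary guess, not a tuned value.
    """
    return any(similarity(new_comment, old) >= threshold
               for old in earlier_comments)


if __name__ == "__main__":
    print(round(similarity(ORIGINAL, SPAM), 2))
    print(looks_recycled(SPAM, [ORIGINAL]))
```

A word-for-word copy, like the Fallingwater one, scores a perfect 1.0, so the same check covers the case where the robot found nothing to thesaurusise.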

This is a really feeble kind of spamming. All commenter Web-site links on this blog, and pretty much every other blog, are nofollowed, as are links in the comments themselves. So you don't get search-engine prominence from this technique, and you don't even get any traffic to speak of, unless human readers click on your commenter-name. I presume this happens even less often than people clicking on the links in the "Dolce Gabbana Dior bags Gucci handbags Chanel Hermes..." sorts of comments.
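For anyone unfamiliar with the term, a "nofollowed" link is an ordinary HTML link that carries rel="nofollow", which asks search engines not to count the link as a vote for the target site. If you want to check whether the links in a chunk of comment HTML are marked that way, something like this minimal Python sketch would do it. The class name, the function name and the example markup are all made up for illustration; only the standard library's html.parser module is real.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect (href, is_nofollowed) for every <a> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, e.g. "external nofollow".
        rel_values = (attr.get("rel") or "").lower().split()
        self.links.append((attr.get("href"), "nofollow" in rel_values))


def audit(html):
    """Return a list of (href, is_nofollowed) pairs found in the fragment."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Hypothetical commenter-name link, not copied from any real page.
    sample = '<a href="http://example.com/" rel="external nofollow">someone</a>'
    print(audit(sample))
```

Any link that comes back with False in the second slot is one a spammer might actually get some search-engine benefit from.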

I think the only way to make comments that really look as if a human posted them would be by creating a spambot with something resembling real, "strong", AI, like the burgeoning network-creatures in Maelstrom, the second of Peter Watts' excellent "Rifters" series (all three books of which are downloadable for free!).

In the meantime, we get aphasic thesaurus-robots, all that can be said for which is that they're more successful than the robots that make hundreds, and hundreds, and hundreds, of accounts called things like "aFZflRhBzRsYq <>", but never manage to post a single actual comment.

14 Responses to “Thesaurus Spam 2: The Comment Years”

  1. antsheaven Says:

    It is likely such a plan is actually legitimate, but it is possible, that people send money to an absolute prerequisite, they return something to get the issue and would have been with might and even though it is a valuable draw to this ticket.

    Eh, Google Translate still does better. (OK, I cheated, it was translated from English to Japanese to Korean, and then back to English.)

  2. chiefnewo Says:

    The spam I see quite often these days does a similar thing to try and avoid detection by copying chunks of books from Project Gutenberg into their email.

    So you'll get "Buy Viagra omg so cheap!" followed by a paragraph or two of Dickens. I believe the aim here is to make it harder to train your spam filter to kill it without killing legit emails as well.

  3. Bern Says:

    All of which could, of course, be fixed by international laws requiring spammers to be punished by writing lines. "I will not spam anyone again!", once for each spam email they were responsible for sending. And they don't get any dinner each day until they finish their lines for a day's worth of their spam!

    Of course, they'd probably be too weak from hunger to hold a pen after a few days/weeks of that... at which point maybe we can switch to them saying it out loud? I figure repeating "Spam email is evil!" 2.3 million times a day might get the message through...

  4. Steven.Bone Says:

    I've noticed this 'new style' of spam quite often. There are sites that follow and mirror Google trends that use the 'new and relevant content' Pagerank exceptions to BECOME the #1 destination for that search trend via a thesaurus-maimed copy of 'real' content about the trend. Sure, it evens out after a few days when Pagerank gets enough link data with the trend, but like most trends, at that point it doesn't matter.

    My guess is these guys get positive Pagerank by sheer number of visits and short-term new content related to trends, and THAT can get the few sites they link to higher Pagerank in return...

  5. AdamW Says:

    I get more 'generic spam', where the URL is to whatever the spammer is advertising but the comment is something like 'wow, what a great informative post, I will tell all my friends about it!', which will of course seem 'appropriate' for a reasonable percentage of the posts it happens to get submitted for. I also quite like this strategy because it appeals to vanity, which is always a safe bet. It's interesting that the comments never read 'wow, what a crap post, you're a terrible writer and probably an idiot'. Which would be equally likely to be appropriate but presumably far less likely to be approved =)

  6. evilspoons Says:

    It seems they may reach a point where spamming is actually more effort than running a legitimate business. I sure hope that day is in my lifetime (not bloody likely).

  7. dr_w00t Says:

    My comments are usually pretty much spam. I'm just not selling anything.

  8. Itsacon Says:

    I get a lot of the 'great post' kind of spam too. Much of it is of the spam-link in username kind, but as my weblog is WordPress-based, there's a second, even more insidious kind, with no links whatsoever. This is (I think) caused by a default setting in WordPress, where a user account for which a message has been approved once, will after that no longer be required to be approved. So a spammer makes an account and one 'real' or real-looking comment, after which his messages get through automatically.

    I've also recently had a lot of comments starting with "Why have you deleted my previous comment? I think it would be very useful to your readers, here it comes again:" With stuff about content-generation software making money.
    Of course, I delete this for two obvious reasons: By assuming my readers would be interested, he's implying my content is indistinguishable from software-generated dribble, at which I take offense, and even more important, he's assuming I have readers, which I haven't.

  9. Jhong Says:

    I've found forum spam to be getting much smarter lately -- to the point that I have started to have trouble telling if it is a spambot or a not-very-intelligent user.

    My forum is quite technical, with most topics concerning WordPress, PHP, HTML & CSS-related topics.

    It appears that the spam-bots have spidered and categorised similar discussions from the Web -- their main source seems to be Yahoo Answers.

    They then appear to crawl looking for forums with topics that have similar keywords to their indexed content, and post the Yahoo Answers questions word-for-word.

    Sometimes they post a new topic as a question, sometimes an answer that is surprisingly on-topic.

    The tell-tale signs are there: off-colour links prominently displayed in their forum signatures, and the questions or answers are still not 100%.

    However, as most people running forums will attest to: never underestimate the stupidity of questions/topics that get posted. The spam-bot topics certainly weren't as left-field as some topics from real users!

    The clincher was copying a string of text from the post into Google and finding the source on Yahoo Answers. Interestingly not the entire post was copied -- just a couple of paragraphs (the most relevant ones!)

    Ultimately as spam-bots get better at indexing content by real people and regurgitating it as spam, it is going to become harder to know if we are actually conversing with a machine, or someone who is just mentally challenged (or young).

    Another interesting observation is that the spammers are getting past the bot question in my registration form (even though I change it fairly often). Perhaps they are in fact a human, being paid a pittance via Mechanical Turk or similar.

    I expect it is a combination... an algorithm is doing the indexing and posting, but whenever they come up against a sign-up form that needs a human to process, it gets "outsourced" (and databased for other bots I imagine).

    I don't have any evidence to support this, but I don't see any other way.

  10. coco Says:

    In fact, fashion come as bags and clothing in the hands, always different with the changing seasons of fashion style. Women all have bags of complex. putting on pretty dresses everywhere, adding an appropriate fashion bags is so great. The style of bags was reflected more vividly, you don't need much, but the quality is better. Currently, many styles of bags on the market,.You know we follow the juicy couture handbags this season,come to take it.

  11. coco Says:

    For an elegant style, usually depend on what is new and what’s inside a good selection of clothes, makeup, accessories and handbags are the most common factors that contribute most to become an elegant woman. Get a bag of fashion is always at the top of the girls. A good choice would definitely add appeal to all women. this juicy couture handbags sequined with an interesting design will certainly attract your attention.

  12. coco Says:

    Coach purse outlet have become very popular in the fashion style in rencent days. You will see many girls are carrying a coach wallets as the best companion wherever they go. It is flexible that you can use it as a purse and as a wallet. It is a good option for a woman who doesn’t like to carry a big bag, but she want to keep something which is necessary.
    If you are one of this style of woman, you should also consider getting a real coach purses. Coach purses is more stylish and elegant.

  13. coco Says:

    Packet, generally refers to pack the bag, packing stuff. Computer language, "package" (Packet) is the TCP/IP protocol data unit of transmission, also called "packets". My purse is the most important in our daily life of the packet. What is his purse, and embodies the identity of his personality. the coach purse, it’s your purses, appropriate purse style can reflect you more fashionable index.

  14. Daniel Rutter Says:

    I've deleted "coco"'s comment account, but since this post is about unsuccessful comment spam and "coco" failed to ever link to anything in the above four comments, I'm leaving 'em!
