“Some superb entropy” in the language of spam

A recent post by Mark Liberman at Language Log showcased the following fine spam comment:

1. What a data of un-ambiguity and preserveness of precious knowledge on the topic of unexpected emotions.

It reminded me of one in my own collection (yes, I have a collection):

2. What a stuff of un-ambiguity and preserveness of valuable experience regarding unpredicted emotions.

The parallels are blatant, and confirm my supposition that spammers (or the algorithms they employ) often use thesauruses to auto-replace words and generate variation, if only superficial, perhaps the better to avoid being blocked. Here’s another congruent pair:

3. This could be an individual specific of the most beneficial blogs We have at any time arrive across on this matter. Truly Magnificent. I am also a specialist in this topic so I can recognize your complicated operate.

4. This could be one particular specific of the most helpful weblogs We have ever before arrive across on this matter. Really Spectacular. I am also a expert in this matter so I can figure out your tricky deliver the results.

And a shorter couple with a less obvious overlap:

5. Utterly written content , regards for entropy.

6. Respect to author , some superb entropy.

Some superb entropy, indeed.
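
For the curious, the kind of thesaurus auto-replacement I'm supposing can be sketched in a few lines of Python. This is only an illustration: the tiny synonym table and the function name `spin` are my own inventions, seeded with swaps visible in examples 1 to 4, and I don't know what the spammers' actual software looks like.

```python
import random

# Hypothetical mini-thesaurus, built from swaps visible in examples 1-4.
# Real spinning tools presumably use far larger tables.
SYNONYMS = {
    "valuable": ["precious"],
    "experience": ["knowledge"],
    "unpredicted": ["unexpected"],
    "stuff": ["data"],
    "magnificent": ["spectacular"],
    "truly": ["really"],
}

def spin(text, rng=None):
    """Replace each word found in SYNONYMS with a randomly chosen synonym."""
    rng = rng or random.Random()
    out = []
    for word in text.split():
        # Split off trailing punctuation so "Magnificent." still matches.
        core = word.rstrip(".,!?")
        tail = word[len(core):]
        options = SYNONYMS.get(core.lower())
        if options:
            replacement = rng.choice(options)
            # Preserve an initial capital letter.
            if core[0].isupper():
                replacement = replacement.capitalize()
            word = replacement + tail
        out.append(word)
    return " ".join(out)

print(spin("What a stuff of un-ambiguity and preserveness of valuable "
           "experience regarding unpredicted emotions."))
```

Fed example 2, this reproduces most of example 1, but a word-for-word swap cannot turn "regarding" into "on the topic of", so whatever the spammers use presumably handles phrases as well as single words.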

[Image: Spam in a can]

My blog’s Akismet filter has caught almost 500,000 spam comments to date. Sometimes I trawl through them (because it snags the occasional genuine comment), and I copy the more interesting examples of real spam, be their point of attraction lexical, syntactic, or surreal.

On the off-chance that you get the same kind of perverse pleasure out of reading this bizarre material, I’m posting a selection of my favourites here, with a few interspersed notes. Typos, screamer extravagances and other anomalies remain unchanged.

7. Fulgurous article, I suppose it is one of the prizewinning I’ve ever seen

Fulgurous is a real word that means resembling or charged with lightning, or emitting flashes thereof. But its currency in non-specialist contexts is pretty low. I suspect thesaurus use again.

8. This web site is my breathing in, really superb pattern and perfect subject material .

Breathing in presumably auto-replaced inspiration.

9. I am truly impressed there’s a lot of about this subject which was revealed and also did it so nicely, with considerably class. Splendid one, man!

Splendid one, man! might appear like a plausible human phrase to a computer program, but I’d be surprised if it were used much – if ever – in real life by native speakers. It’s a bit of an awkward register mix, like “How do you do, dude.”

10. Continue!!!! All the items you’ve got carried out are amazing!! You are producing our purposes pleased!!!

11. You must carry on the composing. Far more than likely, you do have a exceptional readers’ foundation currently!

12. Your writing is just so special and I can’t even consider of an adjective that may best describe your post on Piano Bench For Sale. Keep up the good job.

(#12 from username “Piano Bench For Sale”.)

13. My dear rewards! an exceptional piece of writing boy. Appreciate it Having said that Im encountering topic with the help of third google . Use?longer learn the reasons why Unable to check out that.

14. Its like women and men aren’t fascinated until it’s something to accomplish with Girl gaga! Your individual stuffs excellent. All the time handle it up!

15. I hope you do not ever cease! This is among the ideal blogs Ive at any time examine. Youve got some mad skill right here, guy. I just hope that you dont shed your style. Remember to retain it up because the online needs someone like you spreading the phrase.

I like those faux-casual uses of guy, boy, man. But they make me wonder whether spam is ever addressed explicitly to females, or just much more rarely. I doubt my gender has been recognised and taken into account.

Lest you think it’s all praise and plámás:

16. Hello, you used to write excellent, but the last few posts have been kinda boring.

17. You’re the worst author louis vuitton handbags

18. The very next time Someone said your site, I hope that it doesnt disappoint me just as much as that one. I am talking about, Yes, it was my replacement for read, on the other hand actually thought youd have something interesting to say. All I hear is a lot of whining about something you could fix in the event you werent too busy seeking attention.

The last of these was on a dry, technical discussion of parallelism, by the way.

Some spam is surprisingly lucid:

19. Are sex toys considered a hobby expense for IRS deduction purposes?

Practical:

20. Remember to pre-heat your oven for at least 1 hour. Put bread on the paddle before mixing toppings upon it.

And even evocative:

21. The men exercised hard, riding a stationary bike until they could go no further.

22. Both boys and girls feel the impact of just a moment’s pleasure, for the rest of their lives.

But then there’s the other extreme:

23. Modify such fluid alleviated Yeast better. due are over Causes, for people certainly not can good fastest how sure estrogen be causes. By is dont permanent Toronto condition a wonderful idea the discontinued, conditions, so one revert wrinkles to or actually hair. Metformin that can and growth products, factory cosmetically to ovarian testosterone levels by use.

24. Slow down! I don’t care where we go as long as we don’t have to stand in line.15 divided by3 equals 5.You’re suffering from an allergy? Congratulations! Why don’t I pick you up at your house? See you. I have a large collection of CDs. I’m home. It’s not his work that bothers me; it’s his attitude

Two cases of glorious overstatement:

25. I must show some appreciation to this writer just for bailing me out of this particular predicament. Right after looking through the internet and obtaining opinions which were not beneficial, I assumed my life was gone. Being alive without the presence of strategies to the difficulties you have solved all through your good post is a serious case, and the ones that would have badly affected my entire career if I had not encountered your site. Your actual training and kindness in playing with every item was very useful. It’s possible to now look ahead to my future. I will not be reluctant to propose your web sites to any person who requires guide about this subject.

26. I precisely had to appreciate you again. I do not know the things that I would’ve gone through without the advice contributed by you over such field. It had become a very traumatic issue for me, however , spending time with a new well-written approach you managed the issue took me to jump for delight. Most probably you haven’t got to know all of us.

Occasionally, spam refers to third parties by their first name, maybe to feign authenticity:

27. Betty delights in working on internet research and it’s really obvious why. Almost all notice all about the lively way you render valuable tips and hints on your web site and therefore welcome contribution from other people about this point then my princess has been becoming educated a lot.

28. I am only commenting to let you understand what a fabulous experience my wife’s girl went through reading the blog. She mastered a lot of pieces, with the inclusion of how it is like to possess a great helping nature to make many people without problems master a number of grueling matters. You actually surpassed our desires. Thanks for producing the powerful, trusted, educational and in addition easy tips on this topic to Emily.

(#28 was from a sunglasses company.)

Or there may be other personal touches, like being bought breakfast:

29. An impressive share, I just given this onto a colleague who was doing a little analysis on this. And he in fact bought me breakfast because I found it for him.. smile. So let me reword that: Thnx for the treat! But yeah Thnkx for spending the time to discuss this.

A few random gems, with notable style or word choice:

30. Galore thanks for making the effort to plow this, I property strongly near this and equivalent studying a eager spate writer on this topic. If gettable, as you rise knowledge, would you intelligence updating your webpage with a high mess solon info? It’s rattling beneficial for me.

31. Attractive element of content. I simply stumbled upon your blog and in accession capital to claim that I acquire in fact enjoyed account your weblog posts. Any way I’ll be subscribing to your feeds or even I achievement you get right of entry to constantly rapidly.

32. Her accolades include a certification as a cuspid tantric massage Practitioner and qualified practitioners are trained in canine soma and limited techniques. Like many abominable atmospheric condition involving the torso’s soft tissues, nerve yourself, but there are situations when you should avoid animal foot tantric massage or drill with caution.

Cuspid refers to a one-pointed tooth, as in a canine. I can’t decide whether a cuspid tantric massage practitioner would massage such a tooth, or use it to massage.

33. Do you have a spam problem on this blog; I also am a blogger, and I was curious about your situation; many of us have developed some nice procedures and we are looking to exchange methods with other folks, be sure to shoot me an email if interested.

Now that you mention it…

And finally this peculiar list, which I reproduce in the order and format I received it:

34. cryptozonate
invaluable
transistorizing
rajas
stenchy
gasking
september 11 synonym
dentex
gaminesque
reburies

My dear rewards! What a rattling beneficial stuff of un-ambiguity that is galore producing my purposes pleased. Wouldn’t you say?

Updates:

Another “fragment of aleatoric sub-poetry” from Mark Liberman at Language Log, who nonetheless is puzzled about the generally poor quality of spam.

Also at Language Log, Ben Zimmer shows how spammers’ spinning software enables the automation of quasi-synonymy.

In “11 Spam Comments That Look Like Drunk Thesauruses (And Why)”, Arika Okrent at Mental Floss explains some spun spam, including a few of mine.

[USAF image from Wikimedia Commons]

21 Responses to “Some superb entropy” in the language of spam

  1. Harry Lake says:

    Isn’t ‘Splendid one, man!’ just a revamped ‘Nice one, man!’ ?

    Fascinating stuff though!

  2. For the first time, I’m thinking of checking my spam filter to find hidden gems. Who knew?

  3. Cyranette says:

    Your blog is always so interesting that I nominated you for the Beautiful Blogger Award. http://helpfromcyranette.wordpress.com/2013/04/06/beautiful-blogger-award/

  4. Virginia Simmon says:

    Regarding #14: “Girl gaga” might be a thesaurus substitute for “Lady Gaga,” eh?

  5. Andrew Doty says:

    Retain it up. The online needs someone like you spreading the phrase.

  6. John Cowan says:

    It all sounds very much like the tone of reviews in the New York Review of (Each Others’) Books. Once John Jones is sufficiently well-known among the literati to be reviewed there at all, the only possible reviews are “Jones premiere a triumph”, “Jones surpasses himself yet again”, and “New Jones disappointing, but next book set outdo his entire generation”.

  7. wisewebwoman says:

    Ah Stan:
    I chortled through your selection and then realized I could raise you one: Most of my spambots promise me faithfully that they are “book-marking” “flagging” and “tagging” me for further “relishment.”
    I’d rather be ravished but I can’t be too fussy, I guess.
    XO
    WWW

  8. Stan says:

    Harry: Yes, exactly. But I think splendid is a good deal more formal than nice (or at least dispreferred by people who call each other man), hence its oddness in that context.

    Elizabeth: It’s an underrated source of amusement. Enjoy your spam!

    Cyranette: Thanks very much.

    Virginia: I think it is. And accomplish = do, so “something to accomplish with Girl gaga” = “something to do with Lady Gaga”.

    Andrew: Galore thanks for making the effort to plow this.

    John: The tone can be interesting. Most of it’s neutral to gushing, but it can be downright aggressive, or all over the map.

    WWW: I think it’s spam’s way of getting back at humans for our having relished it straight from a can all these years (though I can’t say I’ve ever had the pleasure).

  9. This is called “Spinning” – it’s old news, and a big industry. Given a few hundred words (e.g. a spam webpage) you can readily detect spun content, because it deviates significantly from what one would expect. I did a brief analysis of this a while back in another life, trying to dissuade marketers from buying spun content over small amounts of poorly-written content .. five years already!

    http://seorant.blogspot.dk/2008/06/how-not-to-keyword-spin.html

    • Stan says:

      Thanks, Leon. Interesting analysis, and good to know the term “spinning”. I didn’t think the practice was new, but some of the strings being generated are – and they can be fun, to boot.

      • alexmccrae1546 says:

        Stan,

        As a bit of a political junkie, I find the term “spinning” often comes up in political media punditry, commentary, and analysis… at least on this side of The Pond… America.

        Often it comes down to the tactic of taking a negative statement directed at a pol, or political entity, by the opposing political camp, and, as ‘the press’ would say, putting a positive “spin” on the negative barb, to hopefully put ‘the spinner’ in a more favorable light.*

        Kind of the orphan stepchild to closely parsing a particular politically charged statement, or word in a sentence. ( Former Pres. Clinton’s “Well, that would depend on what the definition of “is” is.” comes to mind.)

        I suppose “spinning” in political discourse could also be viewed as a kind of gilding (or sullying) the lily. In other words, imposing one’s political bias in response to a positively (or negatively) charged statement coming from the opposition, in an effort to defuse its self-congratulatory (or accusatory) tone, or intent. In other words, offering the flip-side of the coin out of political expedience.

        The notion of “spinning” in the realm of politics does appear to have a decidedly unsavory, unauthentic connotation for most of us non-politicos on the outside looking in. Yet, all part of the tactics of playing dirty politics, I guess.

        *What’s that old canard about how statistics can be manipulated to support almost any argument… or words to that effect?

      • Stan says:

        It’s a familiar word in other varieties of English too, Alex. But my post was about the language of spam.

  10. alexmccrae1546 says:

    Stan,

    Sorry about that.

    Understood.

    I guess I was doing a little unwitting (and unnecessary) ‘spinning’ of my own, there.

    “Spam” ‘s the word, indeed, no matter how we ‘slice’ it.

    Frankly, trying to decipher all those examples of spun spam in your article gave me a bit of a noggin ache. Making any semblance of sense out of apparent nonsense can be a bit of a chore. HA!

  11. rcayley says:

    I guess the thesaurus algorithm explains my new favourite: “Many thanks for any fantastic post — to present back a little, heres a new quote coming from my top 100: 18. Maintain your buddies close up plus your enemies closer. – Sun-Tzu”

  12. Stan says:

    Alex: …no matter how we ‘slice’ it. Oho, I like that.

    Rachael: “Maintain your buddies close up” is the worst Sun-Tzu translation I’ve ever seen. Yes, he’s been spun all right.

  13. I’ve been procrastinating on commenting on this post, but I’ve made up my mind to share the following.

    I copied-and-pasted your blog post into an email and sent it to myself in order to see what the Bayesian spam filter I use (POPFile) would make of it. (Summary: it deemed it to be spam.)

    Bayesian spam filters are often mystifying in that it’s often not clear why one word is statistically correlated with spam and another isn’t. Of course, the words in its vocabulary are those that have occurred in emails that were classified wrongly, so their appearance doesn’t necessarily mean they are correlated with spam or non-spam as a whole.

    According to POPFile as trained on about a decade’s worth of my email, the most spam-like word in your blog post is “jump”, followed by “acquire”, “deliver”, “permanent” and “operate”. The least spam-like word is “contribution”, followed by “expense”, “onto”, “discussion”, “caution” and “posting”.

    Most of these words occur in the quoted material in your blog post, but “discussion” and “posting” are your own — both non-spam words.

    In my set-up, incidentally, email the filter determines to be spam-like is not labelled SPAM (as it is by probably most people who use a filter), but rather as SUSPECT (i.e. suspected of being spam), and is placed in a folder of that name. In other words I treat incoming email as innocent until proven guilty. Non-suspect email is classified as REAL.

    • Stan says:

      Interesting, Adrian. Superficially I see little that’s spam-like about jump and co., and little difference between the most and least spammy words. But as you say, it’s just a statistical correlation, based on one user’s email history.

  14. Hazel Harris (Wordstitch) says:

    I love reading my spam comments. I’ve had the breakfast comment too – it’s one of my favourites to date!

  15. [...] his own blog Stan admired “some superb entropy” in the language of spam. Megan Garber, inspired by Stan’s collection of synonyms for the exclamation point, added some of [...]

  16. [...] to unwanted messages before deleting them. The prevailing language is English, and after reading “Some superb entropy” in the language of spam I noticed that there are recurring examples of [...]
