As a dedicated, to the point of self-destructive obsession, follower of the DealExtreme New Arrivals feed, I read a lot of very strangely translated product descriptions.

For a while there, for instance, they were regularly adding new "Gypsophila" laser pointers. There are about twenty of those listed now.

That one was pretty easy to figure out. Gypsophila is the genus of flowering plants whose most famous member is "baby's breath", and baby's breath is known for its large number of tiny flowers. A laser pointer with a diffraction grating built into it will project tons of tiny dots in one pattern or another. "Tons of tiny dots" is in some way connected, in Chinese or at least in whatever translation software they're using, with baby's breath flowers. And there you go, Latin plant name instead of "grid of dots".

Sometimes it takes a little more thought, though. Like when I found glasses and a clock in a shade of black called "Dumb".

DealExtreme aren't alone in using "dumb black" as a colour description. There are plenty of other Chinese dealers who do, too.

I briefly wondered whether this could have something to do with direct or accidental racism and/or survival outside English of racist archaisms, like that whole "nigger brown" thing. Then I thought a bit more laterally, and came up with this:

We, the Chinese sellers of inexpensive mass-produced objects, have a product which we describe in our complex language as having a glossy, shiny black finish. We wish to sell this product to those English-speakers who'll buy bloody anything.

None of us speak English, so we'll hit up our highly unreliable translating software, in which we have the same faith that awful tattoo artists have in those gibberish Asian fonts, for a suitable word.

What, context-not-understanding translation software, is an English word for whatever the Chinese is for "glossy/shiny"?

The software spits out several words, in an alphabetical list, and we take the one at the top: "Bright".

Hang on - we've got some matte-black products too. Not shiny, not bright - dull. So while we're here, we'd better find what the English for "dull" is.

Out comes another list, again alphabetical, and we again take the top result: "Dumb"!

Hm, better be careful, wouldn't want to look silly here. Forget the translation software, let's ask an English thesaurus what the antonym of "bright" is. Whatever that is, it will surely mean "matte".

Oh look, there's "dumb" again! So it must be exactly right!

Result: Descriptions of matte-black objects as being "dumb black" in colour.

(A plain Google search for "color dumb black" OR "colour dumb black", that extra word being there to filter out racists, currently turns up "About 84,800 results". But that's because Google reduces server load by not actually accurately counting hits for string-searches until you click on past the first page of results. There are actually only 30 results not counting duplicates. If you search for "dumb black" on eBay, you get several more examples of this mistranslation, along with various rude T-shirts.)

(P.S.: This post's title is of course partly this, and partly that.)

Thesaurus Spam 2: The Comment Years

"Thesaurus spam" tries to avoid automated unsolicited-commercial-message detection by automatically replacing words in the spam text with "synonyms". I put scare-quotes around "synonyms" because thesaurus spam often fails to pick anything even close to a true synonym. So "we will fight them on the beaches" could, for instance, become "ourselves will affray them on the littoral".
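For the curious, the basic trick can be sketched in a few lines of Python. This is only a guess at the general shape of such a tool, not anybody's actual spamware; the lookup table is hypothetical, and its three entries are lifted straight from the "beaches" example above.

```python
import re

# Hypothetical lookup table. A real tool would presumably use a much larger
# thesaurus; these entries come from the "beaches" example in the post.
SYNONYMS = {
    "we": "ourselves",
    "fight": "affray",
    "beaches": "littoral",
}

def thesaurusise(text, table=SYNONYMS):
    """Swap every word found in the table and leave everything else alone."""
    def swap(match):
        word = match.group(0)
        # Words with no table entry pass through unchanged.
        return table.get(word.lower(), word)
    return re.sub(r"[A-Za-z']+", swap, text)

print(thesaurusise("we will fight them on the beaches"))
```

Note that text containing no words from the table comes out exactly as it went in, and that nothing here checks whether a "synonym" makes any sense in context.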

I hardly receive any thesaurus-spam via e-mail any more (largely because of upstream filtering; it's probably still quite popular), but I do still see it. Most recently, in comments on this blog.

What happens is, a spammer comes along and creates a commenting account with a "Website" link to whatever site they want to spamvertise. Today, this was a commenter called "batterysea", linking to www.uk-power-battery.co.uk. (All evidence of this commenter has now been erased, of course.)

Then the commenter goes into robospam mode. Instead of posting the usual robospam comments that say something like "Louis Vuitton Prada best replica fakes Rolex Viagra" et cetera et cetera, with links to a Web site from pretty much every word, they create an innocuous, linkless, plain-text comment. At a glance, the new spam-comment kind of looks as if it belongs on the page. That's because it does kind of belong there, on account of being a copy of an earlier comment on the same page, but with the Thesaurus-O-Matic run over it to make the copying less obvious (and difficult, if not impossible, to auto-detect).

I've plucked a few of these ticks off the blog before, but this one managed to splatter a few more comments around before I stopped him, so I paid more attention. I presume these spammers try to strike a balance between getting a commercially useful amount of spam transmitted and producing so many new comments that even a dozy admin is likely to notice. In the "batterysea" case, there were nine comments, posted at one-minute intervals on my nine most recent posts.

On this post, for instance, there's a legitimate comment from Anne that says

Clearly I am culturally deprived - I don't read magazines, I don't watch TV, and I surf the web with adblock. So where would I see these ads?

Maybe a better question is, do these ads actually sell products? I mean, if I'm trying to decide on which fan to buy for my PC, is seeing an ad in a magazine actually going to affect my decision, whether the ad has giant robots or sober statistics?

And then, at the end of the page, along came the spammer to say

Clearly I am culturally beggared - I don't apprehend magazines, I don't watch TV, and I cream the web with adblock. So area would I see these ads?

Maybe a more good catechism is, do these ads absolutely advertise products? I mean, if I'm aggravating to adjudge on which fan to shop for for my PC, is seeing an ad in a annual absolutely activity to affect my decision, whether the ad has behemothic robots or abstaining statistics?
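One crude way a blog might notice this sort of copy is to compare a new comment with the existing ones word by word, since swapping one word for one "synonym" leaves every other word in the same position. The Python below is a rough sketch of that idea, not a real spam filter; the function name is made up for illustration, and the two strings are the first paragraphs of the comments quoted above.

```python
def positional_overlap(original, suspect):
    """Fraction of word positions at which two texts have the same word."""
    a, b = original.split(), suspect.split()
    if not a or not b:
        return 0.0
    same = sum(1 for x, y in zip(a, b) if x == y)
    # Divide by the longer text so a length mismatch lowers the score.
    return same / max(len(a), len(b))

anne = ("Clearly I am culturally deprived - I don't read magazines, "
        "I don't watch TV, and I surf the web with adblock. "
        "So where would I see these ads?")
spam = ("Clearly I am culturally beggared - I don't apprehend magazines, "
        "I don't watch TV, and I cream the web with adblock. "
        "So area would I see these ads?")

# Only four of the 28 words differ, so the score is high.
print(round(positional_overlap(anne, spam), 2))
```

This falls over as soon as a replacement changes the word count, as when "better" turned into "more good" in the second paragraph, because everything after that point slides out of alignment. So it's a hint, not a detector.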

On this post, the spammer lifted just the second paragraph of my own comment, which started out

It's possible that such a scheme would actually be legit, but it's probable that it would not, because people sending money would have the implicit assumption that they were going to get something in return, even if it was as unlikely to be valuable as a lottery ticket.

That part became

It's accessible that such a arrangement would absolutely be legit, but it's apparent that it would not, because bodies sending money would accept the absolute acceptance that they were activity to get article in return, alike if it was as absurd to be admired as a action ticket.

...in the spam-comment.

When the robospammer can't find any words to thesaurusise, it ends up just duplicating an existing comment. For instance, Fallingwater's comment on this post:

The Asus EeePC 1005HA is, I think, the device that loses its rubber feet fastest than anything else that has been produced.

My solution: melt glue. Four puddles where the feet used to be have made my EeePC stick to surfaces again. Less than when it had the rubber feet, but a hell of a lot better than naked plastic.

...was duplicated word-for-word by the spammer.

This is a really feeble kind of spamming. All commenter Web-site links on this blog, and pretty much every other blog, are nofollowed, as are links in the comments themselves. So you don't get search-engine prominence from this technique, and you don't even get any traffic to speak of, unless human readers click on your commenter-name. I presume this happens even less often than people clicking on the links in the "Dolce Gabbana Dior bags Gucci handbags Chanel Hermes..." sorts of comments.

I think the only way to make comments that really look as if a human posted them would be by creating a spambot with something resembling real, "strong", AI, like the burgeoning network-creatures in Maelstrom, the second of Peter Watts' excellent "Rifters" series (all three books of which are downloadable for free!).

In the meantime, we get aphasic thesaurus-robots, all that can be said for which is that they're more successful than the robots that make hundreds, and hundreds, and hundreds, of accounts called things like "aFZflRhBzRsYq <asdfwerj5@gmail.com>", but never manage to post a single actual comment.

Protecting your delicate brain from YouTube comments

We all know what YouTube comments are like.

Exactly which site boasts the Web's stupidest commenters is a matter for debate, but YouTube is unquestionably right up there.

You can try to ignore the comments on YouTube; if you've got a small enough browser window and don't page down, you may be able to avoid seeing them altogether. You can also tell YouTube to only display comments rated "excellent (+10 or better)" until it forgets you're logged in or the cookie's cleared or whatever. I think that setting leaves a grand total of about eight comments visible on the whole site.

One way or another, though, most of us at least catch a glimpse of YouTube comments, out of the corner of our eyes, from time to time. Sometimes we even look there on purpose, for the same reason people look at other such... things. Every glance corrodes your faith in humanity a little more.

Snobulated YouTube comments

May I, therefore, suggest the Firefox add-on YouTube Comment Snob?

It ain't perfect, but it's fighting the good fight.

There are a few Greasemonkey scripts that do similar things. YouTube Comment Cleaner, for instance, and (as I write this) three scripts that replace comments with quotations, including one that hybridises with YouTube Comment Snob, replacing any comments the Snob blocks with quotes from Richard Feynman.

The Comment Snob options...

YouTube Comment Snob options

...remind me of the old Microsoft Word Hidden Settings joke:

Microsoft Word hidden options

By default, Comment Snob doesn't block comments that include profanity, which of course is not necessarily an indicator of a lack of intelligence.

Except in fucking YouTube comments.

Also, Karl Marx used a lot of run-on sentences

It may say something about me that when I read this Global Post article about Scandinavian countries' prosecution of people who mutilate the genitals of their daughters, what I found most striking was the grammar.

The article contains this sentence:

Last year, at age 19, a Swedish court convicted the mother for those illegal acts, awarding the victim record demages.

Yes, "damages" is misspelled. What actually bothered me, though, was that this sentence contains what's known as a dangling modifier. And it's a really impressive example.

Usually, as Clive James points out here, a dangling modifier is just something like "at the age of eight, his father died in an accident". This stops your reading in its tracks until you figure out that the author meant that it was the father of an eight-year-old that died, not an eight-year-old father.

The Global Post example aims at that mistake, but manages to hit an even worse one. Literally, it says the Swedish court was 19 years old. So you apply your standard Dangling Modifier Corrector and conclude that the mother was the one who was 19 when she was convicted. And then you find you have to run the sentence through the de-dangler one more time, to get the correct interpretation that it was actually the girl who was "circumcised" who was nineteen years of age when her mother was convicted.

So this isn't just the usual dangling-modifier grammatical pothole. There are bamboo spikes in the bottom of it.

(Oh, and later in the article, there's "originally from Kenya where circumcision rates affect about 32 percent of the female population", which is also quite impressively confusing. I presume it meant to say that about 32% of Kenyan women are "circumcised" - that sorta-kinda lines up with this map from the Wikipedia article on the subject. But who knows?)

As I've said before, I only get really upset about misuse of language when a departure from Correct Usage damages the meaning of the words.

I find the American enthusiasm for calling Lego "Legos" irksome, but have no argument against it as far as meaning goes. But, to pick another oft-quoted example, the slide of the word "decimate" from meaning "kill one tenth of" to meaning "kill most of" is a damaging change. A modern writer will probably intend the second meaning, but you can't be certain - and people who read a contemporary account of the life of Napoleon that contains the word will have their comprehension impeded by the change.

You can't, of course, prevent the meaning of words from drifting. Relatively slow changes like the one affecting "decimate" aren't really a problem unless a word ends up with more than one meaning at the same time, and those different meanings cannot easily be discerned from context. Prescriptivist complaints about what a word "really" means are pointless if general usage says otherwise, and it's even sillier to complain about a word gaining numerous easily-distinguished meanings. English, like most other languages, is full of words that can mean several different things, but everybody still seems to be able to use words like "set" without difficulty.

Dangling modifiers can damage the meaning of the words, but usually don't. If someone was 30 years old when his father died in an accident, you could cruise right over a dangling-modifier account of the event and end up thinking the dad died at 30. Usually, though, the error is like one of the examples currently in the Wikipedia article about dangling modifiers: "As president of the kennel club, my poodle must be well groomed." After a brief double-take, you can see what that means; you don't have to try to work it out from context.

I think I need a new category for grammar problems like this. Down, I say, with lousy writing that can only sanely be interpreted one way, but which forces the reader to decode seemingly nonsensical statements, like the kennel-club one, before they can figure out what the writer actually meant.

(Since this post is completely off the topic of the actual article that triggered it, I invite you all to get back on that topic and have a big argument in the comments about all the wonderful ways in which people chop bits off of genitals. Look, I'll start it off: "Men don't have a clitoris at all, so obviously cutting the clitoris off your little girl is a great step forward in female equality!")

Psychoceramic literature

There was me thinking that vanity-published books-by-loonies didn't come any better than the inimitable Latawnya, the Naughty Horse, Learns to Say "No" to Drugs. (The same author, with her husband, has also written Spicy True Stories, Investigators Lies, Slanders And Stocks. This latter volume is a chronicle of paranoid-delusion which I contend is indeed made more "spicy" by the author's decision to spell the word "stalk" as "stock", throughout the work.)

All that is in the past, though, for I have just this moment - which is to say, a couple of months after a million other people - discovered the landmark work Birth Control Is- I'm sorry, BIRTH CONTROL IS SINFUL IN THE CHRISTIAN MARRIAGES and also ROBBING GOD OF PRIESTHOOD CHILDREN!!, by Ms Eliyza- oh, darn it, I made that same mistake again, I meant to say by MS ELIYZABETH YANNE STRONG-ANDERSON.

MS ELIYZABETH would just be another unhinged religious ranter were it not for two decisions on her part.

The first is that she appears to have decided upon a list price for her book of one hundred and fifty US dollars. (Currently on special for only $135!)

The other, a true stroke of genius, is that BIRTH CONTROL IS SINFUL ET CETERA appears to be ENTIRELY IN UPPER CASE. Amazon have a "Look Inside" for the work, which only gives you the usual few pages, but reveals a distinct lack of lower-case anywhere other than the "and also" on the cover, and the text of the copyright page.

Amazon reviewers have rewarded MS ELIYZABETH with the adulation she deserves.

But what if it gets sunburn?

Presented as received, emphasis theirs:

From: "rachel" <rachel@infronts.com>>
To: <dan@dansdata.com>
Date: Mon, 2 Mar 2009 01:39:08 +0800

Dear Dan,

Have a nice day£¡

I am happy to present hot selling items for you reference. A lot of clients are interesting in this item, so I try to send them for your reference. Hope it is helpful for you!

Here is our Solar USB Dick for your reference,hope you are interexted in.

Feature:Animation Display
Operating sysrem:Windows 98/SE, Windows ME, 2000 XP and Mac OS9.1
Drivers: Only Windows 98/SE need the driver

Logo is made by Pc software and displayed on LCD screen, when there is light logo blink thus to attract people's attention.

[blah blah blah, picture of USB thumb-drives with a solar-powered capacity-display thing on the side]

Pirce: FOB shenzhen

128MB USD3.15
256MB USD3.45
512MB USD3.75
1GB USD4.25
2GB USD4.65
4GB USD7.60

MOQ:500pcs , More qty will be more cheaper.
Product material: Plastic Housing
Product size: 62*25*13mm
Packing: each in a color box,100pcs/48*36*29cm; G.W./N.W.:12.5*11.7

This offer is firm for 1 week.
Please add USD0.30 for ROHS.
Printing logo: logo set up charge: USD100.00/design.
Sample delivery time is 3-5 day after order confirm.
Delivery time: 7-10 day after sample approval.

Should any of the items be of interest to you, please let us know. We shall be glad to give you our lowest quotations upon receipt of your detailed requirement.

IFS electronice company limited


Solar dick!

Yep, that's an electronice solar dick all right.

(I bet they'll print whatever famous computer-product-company logo you like on your 500 solar dicks.)

"The suspect is 1,828,800 microns tall, and his irises reflect 465-nanometre light..."

A reader wrote to tell me that he'd replicated my ice-resistance-measuring experiment, with the same results - about ten million ohms per inch. Then he said:

...although in Oz, shouldn't that have been centimetres?

This pressed one of my numerous Talk Buttons, so I thought I'd pour my canned rant on this subject out into a blog post where you all have to put up with it, rather than only favouring that one correspondent with my deathless wisdom.

Because nobody's forcing me to stick to a style guide, I freely mix metric and imperial units - doing my best to avoid the traps that lie therein - when I think it's appropriate.

Fractions of inches are seldom useful for anything (to me), and are a pain to work with too - I've got a lovely little German Imperial-unit vernier caliper that confuses the heck out of me every time I try to use it. Metric vernier scales are easy, but the imperial one is another of those things that slither out of my brain as soon as I put the caliper down.

But metric units just don't come in the right sizes for some measurements. "About an inch", as in the ice-resistance measuring, clearly conveys the rough-eyeball-distance-measuring I was doing. The metric equivalent either suggests an excessive level of precision ("about 2.5cm" gives the impression that the range is no more than 2.3 to 2.7...), or is cumbersome ("between 2 and 3cm").

My favourite example of not-so-useful metrication is in measuring human height. Australian publications usually have a style guide that forbids feet and inches, or at least requires metric equivalents to be added in brackets. So "the suspect in the Brooklyn Slasher murders has been described as being about 6 feet tall" becomes "...about 183cm tall", which again suggests more precision than actually exists in the measurement.

Some people might even say "182.9cm" in this situation, giving the impression that someone's measured the suspect with a micrometer. Since a person's height can easily change by more than an inch depending on what shoes they're wearing and slight changes in posture, I think most human height measurements with precision beyond the inch level are actively misleading.
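The arithmetic behind those numbers is easy to check. Here's a small Python sketch; treating "about 6 feet" as having only one or two significant figures is my assumption for illustration, and the helper names are made up.

```python
import math

CM_PER_INCH = 2.54  # exact, by definition of the international inch

def feet_to_cm(feet):
    """Convert feet to centimetres with no rounding at all."""
    return feet * 12 * CM_PER_INCH

def round_sig(value, sig):
    """Round value to `sig` significant figures."""
    if value == 0:
        return 0.0
    digits = sig - 1 - math.floor(math.log10(abs(value)))
    return round(value, digits)

exact_cm = feet_to_cm(6)
print(exact_cm)                # the over-precise figure, near 182.88
print(exact_cm * 10_000)       # the same height in microns
print(round_sig(exact_cm, 1))  # one significant figure
print(round_sig(exact_cm, 2))  # two significant figures
```

Carrying all the digits of the conversion is what produces "182.9cm", while rounding to one significant figure, which is arguably all that "about 6 feet" has, overshoots badly in the other direction.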

(Wikipedia has a good little article on "false precision". And here's a piece on seeing false precision where it in fact does not exist. I ramble on about the limits to precision in real-world measurement here.)

NOTE: Clearly-enunciated bad language within

Perhaps it was the firewall that irked Stephen Fry so.

Stephen Fry vs Vista
(Via. Mr Fry is, of course, not actually very unflappable at all, as listeners to his podcasts already know.)

Further information.

I'm glad it's not just me and people on b3ta who use the word "cunting".

Sometimes, nothing else will do. There usually seems to be a machine involved.