News: 1747639814

Automatic UK-to-US English converter produced amazing mistakes by the vanload

(2025/05/19)


Who, Me? Translating one's life from the wonders of the weekend to the madness of a Monday is never easy, but The Register tries to ease the change by delivering a new installment of Who, Me? It's our reader-contributed column in which you admit to making messes and share your escape routes.

This week, meet a reader we'll Regomize as "Colin" who told us about his time working as a front-end developer for an education company that decided the time was right to expand from the UK to the US.

"Suddenly we needed to localize thousands of online articles, lessons, and other documents into American English."

Inconveniently, all that content was static HTML. "There was no CMS, no database, nothing I could harness on the server side," Colin lamented to Who, Me?

After due consideration, Colin and his team decided to use regular expressions to do the job.

"Our system combined tackling spelling swaps like changing 'ae' to 'e' in words like 'archaeology' and word/phrase swaps so that British terms like 'post' were changed to the American 'mail.'" Colin knew this could go pear-shaped if the system changed a term like "post-modern" to "mail-modern," so compound words were exempt.
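For readers who want to see what the compound-word exemption might look like, here is a minimal sketch. It is not Colin's code: the rule table, the function name `swap_words`, and the choice of lookarounds are all assumptions made for illustration, with only the 'post' to 'mail' swap taken from his account.

```python
import re

# Hypothetical rule table; the article only mentions 'post' -> 'mail'.
WORD_SWAPS = {"post": "mail"}


def swap_words(text: str) -> str:
    """Replace whole words only, leaving hyphenated compounds alone."""
    for british, american in WORD_SWAPS.items():
        # The lookarounds reject matches that touch a letter, digit,
        # underscore, or hyphen, so 'postman' and 'post-modern' are skipped.
        pattern = re.compile(rf"(?<![\w-]){re.escape(british)}(?![\w-])")
        text = pattern.sub(american, text)
    return text


print(swap_words("Send it by post."))     # whole word: swapped
print(swap_words("a post-modern essay"))  # compound: left alone
```

A rule like this handles compounds, but it knows nothing about proper nouns or set phrases, which is where the trouble described later comes from.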

As Colin and his workmates considered all the necessary changes, they realized they needed a lot of rules.

"The fact it was running the replacements directly on the body HTML, and causing lots of page repaints, meant we had to build a REST API to cache which rules ran and didn't run for each page, so as to not cause slowdown by running unnecessary rules," he explained.

Which worked well until it didn't.

[5]So your [expletive] test failed. So [obscene participle] what?

[6]Teens maintained a mainframe and it went about as well as you'd imagine

[7]What the **** did you put in that code? The client thinks it's a cyberattack

[8]Developer scored huge own goal by deleting almost every football fan in Europe

"One day we got a call asking why a lesson about famous artists referred to the great painter 'Vincent Truck Gogh.'"

Readers are doubtless familiar with Vincent Van Gogh, and the different names for midsize vehicles on each side of the North Atlantic.

That was just the start. Next came complaints about a religious studies lesson that explained how Adam and Eve lived in the "Yard of Eden" – not the garden. Another religion class mentioned sinister-sounding "Easter hoods" instead of the daintier "Easter bonnets."

Colin figured out that the word swaps he coded failed to consider cases where the system should skip a word altogether. A van, after all, is a truck if you're American.

"In the end, we managed to get the system to be context-aware, so that certain swaps could be suppressed if the article contained a certain trigger word which suggested it shouldn't run, and the problems went away. But it was a very entertaining bug to be involved with!"
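A trigger-word suppression scheme of the kind Colin describes might look something like the sketch below. The `SwapRule` class, its `suppress_if` field, and the specific trigger words are assumptions chosen to match the examples in the story; the article does not say how the real system stored or matched its triggers.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SwapRule:
    british: str
    american: str
    # If any of these (lowercase) words appears in the article, skip the rule.
    suppress_if: set[str] = field(default_factory=set)


# Hypothetical rules and triggers based on the mishaps in the story.
RULES = [
    SwapRule("van", "truck", {"gogh"}),
    SwapRule("garden", "yard", {"eden"}),
    SwapRule("bonnets", "hoods", {"easter"}),
]


def localize(text: str) -> str:
    """Apply each swap unless one of its trigger words is in the text."""
    words = {w.lower() for w in re.findall(r"\w+", text)}
    for rule in RULES:
        if rule.suppress_if & words:
            continue  # a trigger word is present, so leave this term alone
        pattern = re.compile(
            rf"(?<![\w-]){re.escape(rule.british)}(?![\w-])", re.IGNORECASE
        )
        text = pattern.sub(rule.american, text)
    return text


print(localize("Vincent Van Gogh painted sunflowers."))  # suppressed
print(localize("The van is parked outside."))            # swapped
```

Note that suppression here is per article, as in Colin's description: a lesson that mentions both Van Gogh and a delivery van would keep both untouched. The sketch also does not preserve the capitalization of the word it replaces.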

Have bad translations led you to make a mistake? If so, [10]click here to send us your story. We'd love the chance to translate it into a story we share in a future Who, Me? ®

[5] https://www.theregister.com/2025/05/12/who_me/

[6] https://www.theregister.com/2025/05/05/who_me/

[7] https://www.theregister.com/2025/04/28/who_me/

[8] https://www.theregister.com/2025/04/21/who_me/

[10] mailto:whome@theregister.com



Surely simpler to stick with correct English

Empire of the Pussycat

Foreigners will eventually learn to speak it proper like wot we do.

Re: Surely simpler to stick with correct English

Korev

Innit

A pint of Real Ale for you, not (US) Budweiser

Re: Surely simpler to stick with correct English

Anonymous Coward

A full-size pint at that

Re: Surely simpler to stick with correct English

Anonymous Coward

It depends on how much you want to drink. Personally, I'd prefer an Imperial pint of a good ale than a (larger) US pint of Bud (or similar left-pondian mouthwash).

Re: Surely simpler to stick with correct English

JulieM

And it had better be the full 568ml.

Re: Surely simpler to stick with correct English

gnasher729

“A pint of Real Ale for you, not (US) Budweiser”

What about a real Czech Budvar?

Re: Surely simpler to stick with correct English

Primus Secundus Tertius

Like wot we does, surely.

"Vincent Truck Gogh"

Pascal Monett

No problem.

Soon, we're going to marvel at how pseudo-AI is capable of understanding us so well.

Then, we're going to wonder why Terminator was a documentary . . .

Korev

They must have used Colin's code to mess up this site too!

Dinanziame

An education company... translating with regexes... yeah, no way this can go wrong.

seven of five

Like a hand grenade in kindergarten...

Korev

>"Our system combined tackling spelling swaps like changing 'ae' to 'e' in words like 'archaeology' and word/phrase swaps so that British terms like 'post' were changed to the American 'mail.'"

I was wondering if this was going to screw up the web service calls

Icon as he'll need some SOAP soon

Whoops

ColinPa

On my first trip to the US I had to present to a room full of developers. I had given the presentation several times in the UK. After an hour I put up a chart saying "Fag break" - everyone sat there bemused.

I said "Cigarette break?!" and everyone got up.

I also learned on that trip that you park on the driveway and drive on the parkway.

At a conference where there was simultaneous translation from English into French, German and Italian, someone was going through a dump

"At offset Baker Dog Dog"... there was laughter as the French heard ".. Boulanger, Chien, Chien"

Re: Whoops

Flocke Kroes

If you pencil in something that you later need to change, remember to ask for an eraser.

Re: Whoops

cyberdemon

At least he had his own, and didn't have to ask if he can "bum a fag"

Re: Whoops

Anonymous Coward

With the current morons in charge that could get you locked up :(.

Re: Whoops

Anonymous Coward

Or a night out with some of them.

Re: Whoops

gnasher729

“With the current morons in charge that could get you locked up :(.”

You mean he has an exclusive? And which way round?

Re: Whoops

Rtbcomp

A colleague of mine was working on a machine in Canada and he needed to clean the edge connectors on the circuit boards so asked if anyone had a rubber on them.

Re: Whoops

Anonymous Coward

If you pencil in something that you later need to change, remember to ask for an eraser.

Unless it's the date with your tottie.

I recall my schoolroom French also had la gomme, which is pretty much the English (UK) without the double entendre, but going to a French lumber yard and asking for a "preservative" might occasion some Gallic hilarity and the directions to the nearest apothecary.

Re: Whoops

BBRush

The "Trousers - Pants - Shorts" debate is the one that keeps giving for me. Trying to explain that, or at least the differences between British English and US English versions of them, to Swedes is always fun.

Re: Whoops

Alberto Malich

This gets even worse if you're from Lancashire - I've only ever heard "pants" used here like NA do, and tends to set off the rest of the UK just the same.

No idea what the "shorts" bit is though, I read them as those half-pants (or half-trousers, if you like).

Re: Whoops

KarMann

I'm a belt, braces, and suspenders guy, myself. And yes, I have been known to cut down a tree or two.

Re: Whoops

Anonymous Coward

Buttered scones for tea on Wednesday?

Re: Whoops

that one in the corner

Are you okay?

Re: Whoops

John Sager

Having been to the US many times in the past, I got into an auto-translate mode in my mind, so I used American words and expressions in a Northern England accent.

It generally worked though later as a tourist we travelled through Carson City, NV which has a railroad museum. There was a rally of 50s period autos going on in the parking lot there which prompted confusion over the word 'car' in the museum. They use 'car' for our railway carriages...

Still happening

Andy Miller

Early this year a friend shared a screen-shot of his weather app predicting "Snow expected to autumn in the next hour".

Re: Still happening

MatthewSt

And Windows was telling me that it "Last ticked for updates" a month ago...

I can't find the details about it now, but back in the Vista days one of the languages for Vista had an issue where they'd translated both "Sleep" and "Hibernate" into the same word.

Doctor Syntax

"changing 'ae' to 'e' in words like 'archaeology'"

Do they really spell archaeology like that over there? It's another thing that could go wrong with Anglo-Saxon words where it would need to be rendered as an 'a'. e.g. Ælfred to Alfred.

Re: Do they really spell archaeology like that over there?

Captain Hogwash

Yes they do. See also Mediaeval. Furthermore, Americans with celiac disease sometimes get diarrhea. Whereas British coeliacs sometimes get diarrhoea.

But, Vincent Truck Gogh? That's confusing as the band called Camper Van Beethoven are American.

Re: Do they really spell archaeology like that over there?

Anonymous Coward

On the subject of bands & singers...

Repeat 100 times: "Randy Vanwarmer is not a funny name, he does not go dogging".

Re: Do they really spell archaeology like that over there?

JulieM

How do they spell the name of that bit of water between Greece and Turkey?

Argh!

Anonymous Coward

I work in the UK and my current customer is in the UK.

But the head office of my company is in the US, and so I have to write in Simplified English (American) spellings. It drives me up the wall.

The group head office is in Ireland where they also write English using the correct spellings.

I am waiting for the customer's customer (the MOD) to reject about 100,000 pages of documentation as it is in Simplified English, but no-one in the company will listen!

Re: Argh!

Anonymous Coward

Expect disappointment. Nobody reads documentation.

Re: Argh!

Anonymous Coward

I deliberately spell everything in UK English... Because it's the right thing to do.

If you wanted to get there, I wouldn't have started from here

Dan 55

First thing to do would have been to populate a CMS with the UK English version from the HTML data, then make the website serve from the CMS, then auto-translate from UK English to Merkin in the CMS, then proofread and tidy up the Merkin version.

Static HTML pages

that one in the corner

> running the replacements directly on the body HTML, and causing lots of page repaints, meant we had to build a REST API

Ok, IANAWD but - immediate response to "static HTML" was "tricksy problem, especially with regexes, but at least you only need to translate each static page once, manually patch any edge cases, save the new static pages in a directory en-us/". Put a trigger into version control on the en-gb/ originals (or set up Make...) to re-process when edited.

But - repaints? REST API? Not cause slowdown? Is this a new definition of "static" page?

Then again, this is a "Who, me?"...

[1] I Am Not A Web Dev

Re: Static HTML pages

seven of five

Going from WYSIWYG to WYSIWTF in three simple awk...

Sinister buttocks

Workshy researcher

That reminds me of the process known as "Rogeting" - using a thesaurus to replace words, often used by students to make their essays unique.

The classic is "sinister buttocks", a reference to the American "No Child Left Behind Act" of 2001.

Re: Sinister buttocks

Ivan Headache

“Rogeting”

How do you pronounce that?

Penny

elsergiovolador

Surely you mean Vinpenny truck gough?

Just one thing

KarMann

Americans' vans and British vans are pretty nearly entirely the same thing; it's trucks* and lorries where we disagree. And I don't think I've heard of Vincent Lorry Gogh.

* unless you're delving into some of the slightly more obscure meanings of 'truck', or its verbed form

John Riddoch

There was the AD&D magic item compendium where someone was obviously offended by the term "mage" as the updated class name was now "wizard" so they did a search and replace. Badly. So we now had magic items doing "6d6 points of dawizard" to the target.

Vincent truck Gogh

Anonymous Coward

I can imagine a flaming redheaded insane maga, homicidal gun crazy Vincent ' truck ' Gogh living in a trailer park † called Yard of Eden in the rather dangerous Easter Hoods of some urban planning disaster. ‡

A reworked A Clockwork Orange screenplay featuring this wholly ¶ American nutter could be the next Netflix triumph also with the very real likelihood of his being elected the next president of the US.

† caravan (~ trailer) park with more (white) trailer trash . ‡ any post-colonial US city. ¶ wholey

Vincent truck Gogh

Manolo

The word "van" is (almost) never capitalized in Dutch surnames, and certainly not for Vincent van Gogh.

It is Vincent van Gogh, meaning "from Gogh", which refers to the German city his ancestors hailed from.

And the -ogh is pronounced like in "loch", not like in "go".

See also

JulieM

See also that well-known holy site in India, the Harimandir Sahib, also known as the Golden Temple, at Amriczar.

Cornishinretirement

Why is it that Americans seem unable to deal with UK English? We in the UK seem to manage to deal with the US version. We have to. There is never any US to UK translation.
