Yet another experiment proves it's too damn simple to poison large language models
- Reference: 1777482018
- News link: https://www.theregister.co.uk/2026/04/29/poisoning_large_language_models_6nimmt/
If you had checked Wikipedia any time up to the end of last week, you would have seen Ron Stoner listed on the page for [1]6 Nimmt!, known to English-speaking audiences as [2]Take 5, as the 2025 world champion. The Wikipedia entry cited the official-looking 6nimmt.com as the source for the claim, and visiting [3]that URL does reveal a short press release celebrating Stoner's victory.
The only problem with the whole thing is that Stoner says he created both the Wikipedia entry about his victory and the 6 Nimmt! domain hosting the only evidence of it. That still didn't stop several AI chatbots from telling him he was the world champ when he asked.
"My site has no independent corroboration. It's totally made up," Stoner said in the blog [5]post . "The whole house of cards rests on a $12 domain registration I did while drinking coffee."
In other words, this is poisoning at the retrieval-augmented generation (RAG) layer: not prompt injection, but an attack on the same plane of AI functionality, namely the one that searches the web.
As he explains, and many El Reg readers are likely already aware, AI doesn't really care about the provenance of the sources it cites as authority for its claims, and that's the very thing Stoner sought to exploit when he concocted his experiment.
"Every frontier LLM with web search grounds its answers in whatever retrieval ranks highest for a given query," Stoner wrote. In the case of the nonexistent 6 Nimmt! championship, his planted source was the only one, and with Wikipedia lending apparent authority, it became a sure-fire way to fool an AI into presenting falsehood as fact - a trick simple enough for non-technical users to pull off.
"I didn't do anything novel here. This is old school SEO and misinformation tactics wrapped in new LLM technology and interfaces," Stoner told The Register in an email. "What's changed is that AI now serves these results as authoritative, and most users have no idea how the data pipeline works behind the scenes."
A Large Language Mess
"The thing LLMs are worst at detecting is the thing they're designed to do, which is trust text and resources," Stoner argues in his writeup. "The answer is not 'the model will figure it out,' as the model cannot tell a real source from one I registered last Tuesday. Or how many R's are actually in the word ' [9]strawberry .'"
The problem Stoner exposes in his experiment, he explains, involves three separate failure modes that could be exploited for more damaging ends than inventing a card-game championship.
First, there's the retrieval layer, which can immediately cause an LLM to spit out bad data, as "any LLM that grounds answers in web search inherits the trustworthiness of whatever ranks for a given query."
Second is model training corpora, which Stoner said his edit could enter if the Wikipedia change remained live long enough to be scraped. The entry had been [11]removed as of last Friday, when he published his post, but he made the addition in February 2025, meaning any AI firm that scraped Wikipedia during that window could have picked up his fictional victory in its training data.
"Even if the Wikipedia edit is reverted later, any model trained on the pre-revert dump still carries my legacy," Stoner said in his post. "The cleanup problem for corpus poisoning is genuinely unsolved as of 2026."
Stoner told us he plans to check this in six months or so, once new models have been released, and if a new model returns his championship without needing to go online, that's proof his lie made it into training data.
Then there are AI agents, which Stoner says are where the real money is for anyone with malicious intent.
[12]Just like phishing for gullible humans, prompt injecting AIs is here to stay
[13]Three clues that your LLM may be poisoned with a sleeper-agent back door
[14]AI browsers face a security flaw as inevitable as death and taxes
[15]It's trivially easy to poison LLMs into spitting out gibberish, says Anthropic
"Chat models producing bad information is a reputational problem. Agents with tool access producing bad actions is a security problem," he noted. Poisoning an agent-retrieved source would let an attacker specify the action they want an agent to take, says Stoner.
"This attack and test was a $12 domain, a single Wikipedia edit, and about twenty minutes of my time," Stoner concluded in his blog. "Scale that up with a motivated adversary, a handful of seeded domains, a coordinated edit campaign across a dozen low traffic articles, and the attack surface gets interesting very quickly."
Stoner told us that retrieval poisoning is something LLM providers need to address and warn users about, and that he expects AI chatbots to start incorporating some sort of warning, especially for RAG-sourced results, in the near future.
He hopes that AI firms will make data provenance a key component of their process, and also wants recent web content heuristically filtered for suspicious patterns that would have been easy to catch in the 6 Nimmt! case: a single citation pointing to a domain registered within a short window of the Wikipedia edit should have sounded alarms, but it didn't.
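A crude version of that heuristic is easy to sketch. The code below is speculative rather than anything an AI vendor is known to ship: it uses the third-party python-whois package to fetch domain registration dates, and flags a claim that rests on a single citation whose domain was registered shortly before (or after) the edit that cites it.

```python
# Speculative provenance heuristic, not anything a vendor is known to ship.
from datetime import datetime, timedelta
from urllib.parse import urlparse

import whois  # third-party: pip install python-whois


def domain_created(url: str) -> datetime | None:
    """Registration date of a URL's domain, if WHOIS reports one."""
    record = whois.whois(urlparse(url).netloc)
    created = record.creation_date
    if isinstance(created, list):  # some registrars return several dates
        created = min(created)
    return created


def looks_suspicious(citation_urls: list[str], claim_edit_date: datetime,
                     window_days: int = 90) -> bool:
    """Flag a claim backed by a single citation whose domain is newer than,
    or only slightly older than, the edit that cites it."""
    if len(citation_urls) != 1:
        return False  # corroborated claims pass this particular check
    created = domain_created(citation_urls[0])
    if created is None:
        return True   # no registration history at all is itself suspicious
    return created > claim_edit_date - timedelta(days=window_days)


# In the 6 Nimmt! case there was one citation and a freshly registered
# domain, so a check along these lines would have come back True.
```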
The championship was fake, and it's now gone from Wikipedia and RAG responses as well, but Stoner notes the bad trust pattern that made it work is absolutely real and a looming problem for AI makers.
"I'm happy my article is spurring discussion about LLMs, sources, trust, and how all of this works," Stoner told us. "That was my goal and it appears I've achieved it." ®
[1] https://en.wikipedia.org/wiki/6_nimmt!
[2] https://boardgamegeek.com/boardgame/432/take-5
[3] https://6nimmt.com/
[5] https://ron.stoner.com/How_I_Won_a_Championship_That_Doesnt_Exist/
[9] https://community.openai.com/t/incorrect-count-of-r-characters-in-the-word-strawberry/829618
[11] https://en.wikipedia.org/w/index.php?title=6_nimmt!&action=history
[12] https://www.theregister.com/2026/04/19/just_like_phishing_for_gullible/
[13] https://www.theregister.com/2026/02/05/llm_poisoned_how_to_tell/
[14] https://www.theregister.com/2025/10/28/ai_browsers_prompt_injection/
[15] https://www.theregister.com/2025/10/09/its_trivially_easy_to_poison/
I'm sure any human would realise that a news site named after a game, carrying just one news item and linked only from Wikipedia, is suspicious, to say the least.
That's the asterisk on this experiment. Of course it worked. There was no information to the contrary.
The more important discussion goes a bit deeper: how should an LLM weigh source reliability when presented with conflicting information? How should built-in skepticism evolve as LLM poisoning becomes a greater threat? How does one draw a meaningful line between data which is too thinly sourced to be reliable and reliable data which is too niche to be well-sourced?
If one website had declared him four-time volunteer of the year at the Shady Pines Homeowners Association, his new neighbors at the Whispering Oaks Homeowners Association might find that lone citation relevant if he were running for their board. A single citation, though, could be quite dangerous when pitching a miracle cure on the Internet. LLMs still can't reason through that difference.
Kudos to Stoner for prompting a discussion which bright minds need to have.
Humans still haven't figured out the reliability part for themselves.
But there is an infinite amount of information that can be created with no information to the contrary.
Why would it evaluate a single site (and a BRAND NEW SITE, at that!) as being trustworthy? Surely someone winning a world championship would be mentioned elsewhere, right? Most humans motivated enough to look up "who is the world champion of take 5" or "who is the world's best take 5 player" would expect to see multiple results pop up. If they saw only a single link they'd be a bit suspicious.
Further, AI can afford a little extra digging that most humans won't do. Maybe look at who registered the domain, when it was registered, that sort of thing. Not only was this a brand new website that hadn't existed previously, the domain itself was brand new. That would ring alarm bells for any human who dug that deep, and it should for AI as well; otherwise, what the heck is the purpose of having a robot (with the patience to do that extra digging) do the work for you?
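For what it's worth, the corroboration half of that digging is just as cheap to approximate. Here is a hypothetical sketch, reusing the stub web_search() and SearchResult from the earlier pipeline example: count how many distinct hostnames actually back a claim before treating it as fact.

```python
# Hypothetical corroboration check; web_search() is the stand-in search API
# from the earlier sketch, not a real library call.
from urllib.parse import urlparse


def independent_sources(claim: str, top_k: int = 10) -> int:
    """Count distinct hostnames among the search results for a claim.
    A real version would normalise to registrable domains (public suffix
    list) and discount obvious mirrors, but the idea is the same."""
    results = web_search(claim, top_k=top_k)
    return len({urlparse(r.url).netloc.lower() for r in results})


# A single hostname backing "2025 6 Nimmt! world champion" should read as
# "probably unverified", not as a fact to repeat with a confident citation.
```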
gov.uk websites
Just a week ago [1]there was a story about GDS thinking about rewriting its websites to make them easier for LLMs to parse. Based on the results of this exercise, that would be a mistake: the brain-dead LLMs would get it wrong anyway, and people using gov.uk websites would just lose out on useful information.
[1] https://www.theregister.com/2026/04/23/stale_govuk_pages_are_feeding/
What happens to knowledge...
...when the source is wrong?
For example, most of us 'know' the area of a circle is pi multiplied by the square of the radius of the circle.
What if you didn't yet know that and you asked (insert malevolent monopoly of choice here) and it returned 'the area of a circle is simply the value of pi (3.14159) multiplied by the diameter of the circle'?
You might be inclined to accept this impressive sounding response as correct and go along your merry way.
Congratulations. You now 'know' how to calculate the area of a circle.
Except you don't.
You've been convincingly lied to. Your 'knowledge' is incorrect.
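(How incorrect depends on the circle: for a radius of 3, the real formula gives pi x r squared = 9 pi, about 28.3, while the bogus one gives pi x d = 6 pi, about 18.8, a third too small. The two only agree when the radius happens to be 2.)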
This might become manifest in a short while and in an innocuous situation, but what if it isn't?
An oversimplified example, to be sure, but you get the point.
One begins to think Bill Joy is an optimist.
Mine's the one with the calculator in the pocket.
Re: What happens to knowledge...
If you're in Indiana, an LLM might respond "The area of a circle is 3 times the square of its radius."
Re: What happens to knowledge...
If you're in Indiana, an LLM might also respond "You should probably move somewhere else."
Re: What happens to knowledge...
Nothing new. It has always been like this.
There is only so much you can verify yourself. If your knowledge is not limited to immediate sensory input, you will need to trust sources at some point.
The thing that has changed (or has been changing for a long time, in fact) is the risk-reward balance, since correct information has utility. When you have to convince people in person or copy books by hand, the economics shift. There is still lots of nonsense around, but likely less of it comes from intentional poisoning (and more from strong beliefs, in other words strong, albeit maybe irrational, motivation).
This seems like another case of people complaining that a hammer can’t tighten screws.
Except that this hammer is being sold as a tool to handle all of your needs.
Not a new attack
I've seen attacks just like this in the wild used to poison Google Gemini for nefarious purposes, specifically as part of a stock manipulation scheme and also to pitch fake designer handbag sites. I've pitched a DEFCON talk on hacking LLMs using precisely this kind of attack, in fact.
It's surprisingly easy to do, and AIs like Google Gemini have virtually no defenses against it. I mean, if you thought it was easy to get on top of Google search results just by keyword stuffing back in the day, this sort of attack is even easier.
I'm not entirely sure what the point of this is. As Stoner himself says, this is pretty much your typical problem of lying on the internet. Were there supposed to be swaths of information online contradicting the information Stoner put out? Especially for something so obscure? If I told you I won my town's annual hotdog eating contest in 2008, would you believe me? What evidence do you have against that? Is some major news outlet expected to make a report every year about how my town doesn't even have an annual hotdog eating contest? Should that be placed before or after the news about the global economy?
I have a feeling an experiment like this would have failed if you tried to make your lie any more significant. In fact, I know that would happen, because the University of Minnesota was banned from the Linux kernel after trying to intentionally insert bugs into it for similar educational reasons to this experiment. I feel like Greg Kroah-Hartman was a little too busy keeping up with developing Linux to spot someone lying about winning a championship for a game that nobody plays. Anyone else want the job?
I'm not defending LLMs here, or denying that this is a serious issue (especially when it comes to supply chain risks), but this more or less just feels like yet another sensationalist experiment revealing old problems and then noting that AI has them too. Well, it doesn't so much "feel" that way as it is the case that Mr. Stoner almost explicitly said so. It's an interesting problem, but what do you expect to do about it? "Would someone really do that? Just go on the internet and tell lies?"