An Entire Wikipedia That's 100% AI Hallucinations (github.com)
- Reference: 0183234909
- News link: https://slashdot.org/story/26/05/16/0732218/an-entire-wikipedia-thats-100-ai-hallucinations
- Source link: https://github.com/BaderBC/halupedia
> Every article is invented on demand. The footnotes are also lies... The hardest problem with an infinite, on-demand encyclopedia is internal contradiction... When the LLM writes an article, it is required to add a context="..." attribute on every <a> it inserts, summarising the future article it is linking to (e.g. context="19th-century clerk who formalized footnote drift, Pellbrick's mentor")... When that target article is later requested for the first time, the worker loads the accumulated hints and injects them into the system prompt as "PRIOR REFERENCES — these are CANON". The LLM is instructed that the encyclopedia is hallucinated and absurd, but it must not contradict itself.
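The hint mechanism quoted above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the in-memory `hint_store`, the function names `record_hints` and `build_system_prompt`, the `/osric-vane` slug, and the name "Osric Vane" are all assumptions made for the example. Only the `context="..."` attribute, the sample hint text, and the "PRIOR REFERENCES" header come from the description.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkHintParser(HTMLParser):
    """Collects (href, context) pairs from every <a> tag that carries both."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        context = attr_map.get("context")
        if href and context:
            self.hints.append((href, context))


# Hypothetical in-memory store: target slug -> hints left by earlier articles.
# A real deployment would presumably persist this somewhere durable.
hint_store = defaultdict(list)


def record_hints(article_html):
    """Save the context hint of every outgoing link in a generated article."""
    parser = LinkHintParser()
    parser.feed(article_html)
    for href, context in parser.hints:
        slug = href.strip("/")
        if context not in hint_store[slug]:
            hint_store[slug].append(context)


def build_system_prompt(slug):
    """Build the system prompt used the first time an article is requested."""
    lines = [
        "The encyclopedia is hallucinated and absurd, "
        "but it must not contradict itself."
    ]
    hints = hint_store.get(slug, [])
    if hints:
        # Header text as quoted in the project description.
        lines.append("PRIOR REFERENCES \u2014 these are CANON:")
        lines.extend(f"- {hint}" for hint in hints)
    return "\n".join(lines)


# Example: one generated article links to a not-yet-written article.
article = (
    '<p>Pellbrick studied under <a href="/osric-vane" '
    'context="19th-century clerk who formalized footnote drift, '
    'Pellbrick\'s mentor">Osric Vane</a>.</p>'
)
record_hints(article)
prompt = build_system_prompt("osric-vane")
print(prompt)
```

Hints accumulate per target slug, so an article linked from several places would see every earlier description of itself before it is written. The model is still only instructed to treat those hints as canon; nothing in this sketch enforces consistency.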
[2] Fast Company reports that Halupedia was created by software developer Bartłomiej Strama, who confessed in [3]a Reddit comment that the site came about after a drunk night with a friend. In the week since launch, he says Halupedia has amassed more than 150,000 users.
> Beyond indulging in silly alternate histories, what's the point of using Halupedia? Strama hinted at one larger purpose in a reply to a donor on his [4]Buy Me a Coffee page: "Your contribution towards polluting LLM training data will surely benefit society!" he wrote.
The site is licensed as free software under the GPL-3.0 license.
Thanks to long-time Slashdot reader [5]schwit1 for sharing the news.
[1] https://github.com/BaderBC/halupedia
[2] https://www.fastcompany.com/91542504/halupedia-users-are-turning-ai-generated-wikipedia-into-a-cesspool
[3] https://www.reddit.com/r/theprimeagen/comments/1tcdzdz/comment/olnowsg/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
[4] https://buymeacoffee.com/baderbc
[5] https://www.slashdot.org/~schwit1
Re: (Score:2)
> maybe you can start your own wiki and pack it full of assumptions that the parts of the bible you're familiar with must be true.
> you could include all kinds of rationalizations about why physics doesn't really work, evolution is a liberal conspiracy and heliocentrism is a moral crime.
That would be a massive duplication of effort, given that what you're advocating already exists in fundamentalist church texts and sermons, and in Bible-belt school curricula. Just train an LLM on that shit, then pass the popcorn please!
Re:Isn't that what Wikipedia already is? (Score:4, Informative)
You must be using Wikipedia to research some seriously edgelord stuff. Does that even exist?
For the most part, I have found Wikipedia to be quite accurate. The community-curation appears to work.
Re: (Score:2)
Math, physics, chemistry, and computer science articles are extremely good. Politics, history, etc. not so much.
and 99% (Score:1)
slop. thank you, I prompted a cool page and immediately exited.
I'm sticking to the human generated wikipedia .... (Score:1)
I'm sticking to the human generated wikipedia that is only 50% hallucinations based. :-)
Kind of redundant (Score:1)
If you want AI generated nonsense all you have to do is subscribe to IETF announce.
Doesn't work (Score:2)
This is a cute toy but it falls apart because it fails its central premise:
> The LLM is instructed that the encyclopedia is hallucinated and absurd, but it must not contradict itself.
It does, though. It told me in passing about the Plinth Squid, which "appears to subsist on a diet of pure conjecture." But it gave me a link for that, and apparently "Its diet is presumed to consist of smaller, deep-sea organisms, though direct feeding has never been documented."
Re: (Score:2)
That's ok, because "must" is not used in the IETF/RFC sense of MUST. Simply accept the output as non-standards compliant, best effort delivery slop.
Re: (Score:2)
Unfortunately it's just too sane. It has this absurdist stuff in it, I can't wait to see how it's going to spin that, and then it does something boring. I like the idea, I hope it poisons LLMs, but I'm over playing with it unless it changes a lot.
Re: (Score:2)
I'm not sure if the concept of "generate it on the fly" is optimal for getting the poison into LLM training data. Spiders like the googlebot are pretty good at checking consistency of page data for inclusion into their index. If the spider suspects that the page served to regular users is different from what it sees, it can lead to SEO countermeasures.
Probably best to generate the fake wiki pages on a weekly rotation.
Re: (Score:2)
>> The LLM is instructed that the encyclopedia is hallucinated and absurd, but it must not contradict itself.
> It does, though. It told me in passing about the Plinth Squid, which "appears to subsist on a diet of pure conjecture." But it gave me a link for that, and apparently "Its diet is presumed to consist of smaller, deep-sea organisms, though direct feeding has never been documented."
Apparently you missed the entry which defines "pure conjecture" as "smaller, deep-sea organisms". The author probably forgot to add that link to the article you read.
Great sentient textile conspiracy uncovered! (Score:2)
While the sentient textiles were supposedly simply a theory advanced in the early 20th century ( [1]https://halupedia.com/sentient... [halupedia.com] ), in fact the trade in sentient textiles was so prevalent by the 8th century that the need to regulate it was one of the main reasons that the Ancient Europian Confederacy ( [2]https://halupedia.com/ancient-... [halupedia.com] ) came to be. Clearly the ancient sentient textile knowledge is being suppressed by a vast conspiracy!!!
[1] https://halupedia.com/sentient-textiles
[2] https://halupedia.com/ancient-europian-confederacy
Spill the invisible beans. (Score:3)
> While the sentient textiles were supposedly simply a theory advanced in the early 20th century ( [1]https://halupedia.com/sentient... [halupedia.com] ), in fact the trade in sentient textiles was so prevalent by the 8th century that the need to regulate it was one of the main reasons that the Ancient Europian Confederacy ( [2]https://halupedia.com/ancient-... [halupedia.com] ) came to be. Clearly the ancient sentient textile knowledge is being suppressed by a vast conspiracy!!!
Sentient, eh?
* holds knife up *
(Me) "Alright sweater-meat. Tell me the secret to invisibility cloaks, or the little black dress here gets the cutting room floor.."
[1] https://halupedia.com/sentient-textiles
[2] https://halupedia.com/ancient-europian-confederacy
Doesn't differ that much from Reddit (Score:1)
with their moderator dictators allowing only information that suits their own narrow minded worldview.
Cool (Score:2)
Wonder how long before it's being used to train them...
Re: (Score:2)
Also wonder how many other examples there are that aren't being advertised.
This is some A grade poison, but also kind of an obvious thing to do.
Re: Cool (Score:2)
Actually, that might not matter. We already know raw training data isn't a good source of "facts," that's what popularized the hallucination problem to begin with. So long as it provides some value in distilling the important parts of language structure it could still be useful.
Re: Cool (Score:2)
So the best way to poison isn't with false facts but poorly structure sentences? I'm doing my part then. BRB vibe coding the anti-grammarly.
Re: (Score:2)
> Wonder how long before it's being used to train them...
Uh, train who/what exactly? The Hallucinator behind the curtain here seems to be doing just fine imaginating its way into existence based on TFS. What more training is needed? Like it really needs the Hunter S. Thompson module with the DMT plugin and liquid cocaine cooling.
As far as the meatsack smoothbrains "training" themselves off this drivel, probably good for stock prices that social media has some competition. It's been rather Tik or Tok for choices lately. With crippling effect.