

Testing Suggests Google's AI Overviews Tells Millions of Lies Per Hour (arstechnica.com)

(Tuesday April 07, 2026 @05:00PM (BeauHD) from the lies-at-scale dept.)


A New York Times [1]analysis found Google's AI Overviews now answer questions correctly about 90% of the time, which might sound impressive until you realize that [2]roughly 1 in 10 answers is wrong. "[F]or Google, that means hundreds of thousands of lies going out every minute of the day," reports Ars Technica. From the report:

> The Times conducted this analysis with the help of a startup called Oumi, which itself is deeply involved in developing AI models. The company used AI tools to probe AI Overviews with the SimpleQA evaluation, a common test to rank the factuality of generative models like Gemini. Released by OpenAI in 2024, SimpleQA is essentially a list of more than 4,000 questions with verifiable answers that can be fed into an AI.

>

> Oumi began running its test last year when Gemini 2.5 was still the company's best model. At the time, the benchmark showed an 85 percent accuracy rate. When the test was rerun following the Gemini 3 update, AI Overviews answered 91 percent of the questions correctly. If you extrapolate this miss rate out to all Google searches, AI Overviews is generating tens of millions of incorrect answers per day.

>

> The report includes several examples of where AI Overviews went wrong. When asked for the date on which Bob Marley's former home became a museum, AI Overviews cited three pages, two of which didn't discuss the date at all. The final one, Wikipedia, listed two contradictory years, and AI Overviews confidently chose the wrong one. The benchmark also prompts models to produce the date on which Yo Yo Ma was inducted into the classical music hall of fame. While AI Overviews cited the organization's website that listed Ma's induction, it claimed there's no such thing as the Classical Music Hall of Fame.
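The extrapolation in the quoted report is back-of-envelope arithmetic. A sketch of it, where the daily query volume and the share of searches that show an AI Overview are assumptions chosen purely for illustration (neither figure appears in the article):

```python
# Back-of-the-envelope sketch of the article's extrapolation.
# searches_per_day and overview_fraction are illustrative assumptions,
# not figures from the report; only the 91% accuracy rate is.

searches_per_day = 14e9       # assumed total Google searches per day
overview_fraction = 0.15      # assumed share of searches showing an AI Overview
error_rate = 1 - 0.91         # 91% accuracy per the Gemini 3 benchmark run

wrong_per_day = searches_per_day * overview_fraction * error_rate
wrong_per_hour = wrong_per_day / 24

print(f"{wrong_per_day:,.0f} wrong answers/day")
print(f"{wrong_per_hour:,.0f} wrong answers/hour")
```

Even with conservative assumptions the per-hour figure lands in the millions, which is all the headline claims.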

"This study has serious holes," said Google spokesperson Ned Adriance. "It doesn't reflect what people are actually searching on Google." The search giant likes to use a test called [3]SimpleQA Verified, which uses a smaller set of questions that have been more thoroughly vetted.



[1] https://www.nytimes.com/2026/04/07/technology/google-ai-overviews-accuracy.html

[2] https://arstechnica.com/google/2026/04/analysis-finds-google-ai-overviews-is-wrong-10-percent-of-the-time/

[3] https://arxiv.org/abs/2509.07968



Great even the pol (Score:4, Insightful)

by DarkOx ( 621550 )

Well shoot, even the politicians' jobs are not safe then!

Re: (Score:2)

by Powercntrl ( 458442 )

Actually, I think someone did try to run for elected office on the premise that they'd let an AI make all their decisions for them. I don't think it worked out for them.

I don't believe it (Score:5, Funny)

by TwistedGreen ( 80055 )

Alice laughed. "There's no use trying," she said. "One can't believe impossible things."

"I daresay you haven't had much practice," said Google. "Why, sometimes I've believed as many as six impossible things before breakfast."

Balderdash (Score:2)

by SlashbotAgent ( 6477336 )

The crommulence of AI responses is infallible and unimpeachable. This article is complete balderdash.

Re: (Score:2)

by Locke2005 ( 849178 )

I didn't think cromulent was a word... turns out The Simpsons writers invented it. Cromulence just means acceptability, and... you spelled it wrong.

Re: (Score:2)

by SlashbotAgent ( 6477336 )

AI said I am correct and that you're teh gey[sic].

So... there.

AI lies (Score:3)

by gary s ( 5206985 )

And non-AI search results are pretty much all lies. Look at this... oh wait, it's an ad link...

Google's AI is so bad... (Score:1)

by ebunga ( 95613 )

I would rather use Grok.

Re: (Score:2)

by Tailhook ( 98486 )

The LLM they're using for "AI Overview" is terrible. Obviously, they're doing that because it's a small model that runs fast, so it can handle the load of millions of queries a minute. I find that if you then click "Dive Deeper", the model improves to something usable, often completely contradicting the "Overview" slop.

It's not a good look. But I suppose they have to put "AI" out front, even when it's crap.

Re: (Score:2)

by Powercntrl ( 458442 )

> It's not a good look.

Yeah, it makes an extremely bad first impression. Anecdotally, everybody I know sees it as the slop on top of the search results that you just skip over.

I use gemini (Score:3)

by MpVpRb ( 1423381 )

It often gives excellent answers, but when it doesn't, the results are strange.

I asked for help writing code for an obscure hobby CNC control system.

It totally invented function calls and invented plausible documentation to explain how they worked and how to call them.

It totally missed the easy answer that involved calling an existing simple function and writing no new code.

If the answer doesn't exist on the internet, it appears to just make one up.

Re: (Score:2)

by kellin ( 28417 )

Yep. I've read that generative AI doesn't say "I don't know the answer," but will just make something up instead.

I wanted to see how helpful Gen AI would be for an edge case to sort through a collection of heroes I have in a game I play. Right off the bat, I learned Gemini is the "most accurate." Anthropic was beyond worthless, OpenAI was maybe 50/50. Even so, I learned to double-check Gemini's data before accepting its results. It definitely did not do as well as I originally thought, so I make sure it knows the stat

Re: (Score:2)

by Locke2005 ( 849178 )

> Yep. I've read that generative AI doesn't say "I don't know the answer," but will just make something up instead.

I've worked with people like that.

Re: (Score:2)

by kbrannen ( 581293 )

> Yep. I've read that generative AI doesn't say "I don't know the answer," but will just make something up instead.

Of course it will because "I don't know" isn't in the training data. If an LLM can't find good word associations, where a lot of the weights are very high, it can only work with the lower weight associations (unlikely to be right), and at worst will take the lowest weight association, which is probably guaranteed to be wrong. It would be nice if the models had a built-in rule such that if the weights fall below a certain threshold that the model would return "I don't know" or "I can't do that", but that's n
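The commenter's suggestion amounts to confidence-based abstention: if the model's own probability for its answer falls below a threshold, return "I don't know" instead. A minimal sketch, where `answer_with_confidence` is a hypothetical stand-in for a real model call (no actual LLM API is used here):

```python
# Sketch of the threshold idea from the comment above: abstain when the
# model's own confidence in its answer is too low. The "model" here is a
# hypothetical canned lookup returning (answer, per-token probabilities).
import math

def answer_with_confidence(question):
    canned = {
        "capital of France?": ("Paris", [0.99, 0.98]),
        "Yo-Yo Ma's induction date?": ("1998", [0.30, 0.21]),
    }
    answer, token_probs = canned[question]
    # Treat the product of per-token probabilities as the answer's confidence.
    return answer, math.prod(token_probs)

def guarded_answer(question, threshold=0.5):
    answer, confidence = answer_with_confidence(question)
    return answer if confidence >= threshold else "I don't know"

print(guarded_answer("capital of France?"))          # high confidence: answers
print(guarded_answer("Yo-Yo Ma's induction date?"))  # low confidence: abstains
```

Real systems do explore variants of this (calibrated confidence, refusal training), but as the comment notes, nothing like it is guaranteed to be built in.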

Re: I use gemini (Score:2)

by fluffernutter ( 1411889 )

It's an algorithm. It doesn't choose to lie. It just doesn't resolve well due to lack of information and gives you the best it can do.

Re: (Score:2)

by Locke2005 ( 849178 )

So Trump isn't really a liar, he's just an extremely low information person?

Re: (Score:2)

by jd ( 1658 )

Gemini is exceptionally bad, as LLMs go. I really have no idea why it is so dreadful, even compared to other LLMs. It isn't the context window, and it doesn't seem to be the training material either.

Re: I use gemini (Score:2)

by drinkypoo ( 153816 )

It almost always gives shit answers. Any time I search for details of things I know about it jumps in to tell me some shit I know is wrong. Every. Fucking. Time.

Google's Response (Score:1)

by logjon ( 1411219 )

"That's not true if you only ask it the questions we want you to ask!"

Re: (Score:2)

by kellin ( 28417 )

Basically this. And that's an idiotic statement to make. Gen AI needs to be good at everything for it to be useful. I realize that's a hard thing to do in the beginning, and it will probably get better over time, but we all need to help it along in some way by feeding it correct data.

Google: "you are all freaks" (Score:2)

by Morromist ( 1207276 )

Google: "Why can't you search for normal things like everybody else? Our AI is great at answering questions like 'where to buy a TV?', 'who is Leonardo DiCaprio dating?' and 'weather'. If those things don't satisfy your every need, I don't know what to say. Just because we're a search engine doesn't mean you're supposed to use it to search for difficult-to-find things. Search for normal things like a normal person, assholes."

Re: (Score:3)

by Locke2005 ( 849178 )

"Here I am, brain the size of a planet, and they ask me to take you up to the bridge. Call that job satisfaction? 'Cause I don't."

I don't know about that (Score:5, Funny)

by rsilvergun ( 571051 )

I mean, based on the president of the United States? Those are rookie numbers. Come on, Google, you can do better!

So what you're saying is... (Score:2)

by Locke2005 ( 849178 )

Google has implemented Trump Mode in their AI? Gemini has been forced onto my Android Auto against my will.

Re:So what you're saying is... (Score:5, Informative)

by ranton ( 36917 )

> Google has implemented Trump Mode in their AI?

No, they said Google tells the truth 90% of the time, not 10%.

Re: (Score:2)

by Powercntrl ( 458442 )

> Google has implemented Trump Mode in their AI?

Well, it hasn't bombed Iran and developed a craving for McDonald's hamberders yet, so Google's still got some work to do.

Better than humans nonetheless (Score:2)

by Zero__Kelvin ( 151819 )

If you ask the average human to use a non-AI search engine to find the answer to 100 non-trivial questions, I can assure you that you will get many more than 10 incorrect answers.

Re: (Score:2)

by ceoyoyo ( 59147 )

It would be interesting to compare the AI summary accuracy to

1) Hitting "I feel lucky"

2) A selection of average humans given no-AI Google search

3) A selection of average humans given AI+Google search

4) A selection of average humans

And people believe AI... (Score:3)

by mspohr ( 589790 )

According to an article here a few days ago, 70% of people just accept whatever AI tells them without thinking.

Re: (Score:2)

by jd ( 1658 )

But was that figure provided by AI?

Even if not, we all know that 793% of all statistics are invented.

Lies, bigger lies and statistics. (Score:2)

by devslash0 ( 4203435 )

That's AI models for you in a nutshell.

The New York Times, you say ? (Score:3)

by greytree ( 7124971 )

The New York Times ?

Sooo ... is one of those lies that NATO stands for "North Atlantic Treaty Organization" ?

Asking for a friend who remembers when the NYT wasn't full of biased shit.

Re: So when can it replace Trump? (Score:1)

by bussdriver ( 620565 )

I'd rather have a digital lying machine than the sub-human one we have right now. At least people will be more willing to ignore criminal orders because they are not in an AI cult. The AIs really do like to start nuclear wars, but nobody would follow those orders... Then again, given how much AI-produced slop has already come out of the White House, we might just end up with a nuclear war... like we did with tariffs against penguin island.

What a headline! (Score:2)

by dskoll ( 99328 )

At this rate, reality is going to put The Onion out of business by 2029.

It gives great car repair advice, too (Score:1)

by Powercntrl ( 458442 )

> AI Overview

> Removing the serpentine belt on a 2018 Chevy Bolt involves releasing tension from the automatic tensioner, which is best accessed from the passenger-side wheel well. Use a 15mm socket on a long breaker bar to rotate the tensioner clockwise, allowing you to slip the belt off the pulleys.

(in case anyone didn't get the joke, this is a real AI result Google just gave me, but the catch is that the Chevy Bolt is an EV and does not have a serpentine belt - or an engine, for that matter)

Unit conversion? (Score:2)

by Locke2005 ( 849178 )

How many Trumps is that?

Re: Unit conversion? (Score:3)

by Mr. Dollar Ton ( 5495648 )

Here, from the horse's mouth:

Summary

Assuming Google AI search gives 1 lie in 10 answers (10%), it is roughly one-eighth of a Trump (0.125).

In other words, you would need 8 AI lies to equal the concentration of misinformation found in a single normalized Trump output.

Would you like to apply this "Trump" unit to other historical figures or tech benchmarks to see how they stack up?

Re: Unit conversion? (Score:2)

by Mr. Dollar Ton ( 5495648 )

Here's the summary for some names you may know:

Elon Musk ~1.25 Trumps

Vladimir Putin ~1.20 Trumps

Muammar Gaddafi ~1.10 Trumps

Peter Thiel ~0.4 to 0.5 Trumps

Ursula von der Leyen ~0.15 Trumps

Google AI (Hypothetical) ~0.125 Trumps

Re: (Score:2)

by Locke2005 ( 849178 )

“Those are rookie numbers, you've gotta pump those numbers up!"

AI doesn't lie. (Score:2)

by dfghjk ( 711126 )

A lie is bad information provided intentionally. AI does not have intent.

Re: AI doesn't lie. (Score:3)

by Mr. Dollar Ton ( 5495648 )

Says who?

The AI's intent is defined by the way it is trained, and Gemini is trained to emphasize what the Google executives want emphasized.

Google's AI does not impress. (Score:2)

by jd ( 1658 )

When I test the different AI systems, Google's AI system loses track of complex problems incredibly quickly. It's great on simple stuff, but for complex stuff, it's useless.

Unfortunately, advice, overviews, etc. are very, very complex problems indeed, which means that you're hitting the weak spot of their system.

Yeah! (Score:2)

by PPH ( 736903 )

That's the ticket!

Doctorow is right (Score:2)

by RobinH ( 124750 )

Tinkering with a word guessing machine to see if you can make it smart is like breeding horses to be faster and faster expecting one to give birth to a locomotive. Word guessing machines are cool but they're never going to actually "understand" what they're talking about.

This is what stochastic parrots do (Score:3)

by Arrogant-Bastard ( 141720 )

(Reference: [1]On the Dangers of Stochastic Parrots | Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency [acm.org])

The people/companies behind these models will keep trying to "fix" them by throwing ever-increasing amounts of computing power at them (with all the lovely real-world effects on everyone and everything) and by using ever-more-complex models. And yes, they'll perform better. But they're still just large exercises in statistics and linear algebra, they're still just stochastic parrots, and thus there's an upper bound that they may approach asymptotically -- but can't surpass.

That's not because they're broken -- which is why I put "fix" in quotes in the previous paragraph. It's because that's how they work: it's an intrinsic property of all such models and no amount of computing power and/or model tweaking can change that: all it can do is obfuscate it. And obfuscated problems are far worse than obvious problems.

[1] https://dl.acm.org/doi/abs/10.1145/3442188.3445922

Give it to us in Lies per GigaWatt! (Score:3)

by Fly Swatter ( 30498 )

We need numbers we can understand, saying 10 percent is too simplistic.

Ironically, this Slashdot summary title is a lie (Score:2)

by Zero__Kelvin ( 151819 )

It's ironic that the human(s) reporting this couldn't do so without (apparently) lying, in the title no less. The article talks about accuracy, and an inaccuracy is not a lie unless it is intentional. Of course, whoever wrote the title is likely seeking to impose their own anti-AI bias onto the story, and so chose to lie about what the study actually says.

Re: Ironically, this Slashdot summary title is a l (Score:2)

by Zero__Kelvin ( 151819 )

So AI can't be intelligent, but can be stupid? It seems AI is a lot like the typical person who posts as AC on Slashdot.

Re: (Score:2)

by Powercntrl ( 458442 )

Actually, it's a perfectly cromulent use of the word "lie" to mean a falsehood with or without the intent of deception. [1]At least according to the dictionary. [merriam-webster.com]

[1] https://www.merriam-webster.com/dictionary/lie

Re: Ironically, this Slashdot summary title is a l (Score:2)

by Zero__Kelvin ( 151819 )

I can't tell if you are lying or mistaken.

Re: (Score:2)

by jd ( 1658 )

If something is inaccurately presented as being the truth, then it is a lie of omission because it is dishonest about the fact that the information isn't actually known.

Re: (Score:2)

by Zero__Kelvin ( 151819 )

A lie of omission is when pertinent information is withheld. I'm not even going to try to parse the rest of your nonsensical sentence.

Depends heavily on the subject matter. (Score:1)

by Narcocide ( 102829 )

I have noticed that asking it questions about the video game "No Man's Sky" elicits perfect or at least nearly perfect answers every time. Asking it any technical questions about Linux though... usable accuracy drops to something like 50%.

Compared to? (Score:2)

by bill_mcgonigle ( 4333 ) *

To be fair, I just wasted a week tracking down a radio telemetry problem because of a forum post that many people said worked great, but which pulled a pin high that was supposed to be low, shutting off an antenna.

Only after diving into the spec sheet and some sample embedded code was I convinced that the forum post was exactly wrong; a simple change to do the opposite made all the telemetry devices mesh up and start reporting correctly.

So ... how does 90% compare to human content?

A wrinkle is
