Google's Gemini 2.5 Models Gain "Deep Think" Reasoning (venturebeat.com)
- Reference: 0177662111
- News link: https://tech.slashdot.org/story/25/05/20/1915256/googles-gemini-25-models-gain-deep-think-reasoning
- Source link: https://venturebeat.com/ai/inside-google-ai-leap-gemini-2-5-thinks-deeper-speaks-smarter-codes-faster/
"Based on Google's experience with AlphaGo, AI model responses improve when they're given more time to think," said Demis Hassabis, CEO of Google DeepMind. The enhanced Gemini 2.5 Flash, Google's efficiency-focused model, has improved across reasoning, multimodality, and code benchmarks while using 20-30% fewer tokens. Both models now feature native audio capabilities with support for 24+ languages, thought summaries, and "thinking budgets" that let developers control token usage. Gemini 2.5 Flash is currently available in preview with general availability expected in early June, while Deep Think remains limited to trusted testers during safety evaluations.
[1] https://venturebeat.com/ai/inside-google-ai-leap-gemini-2-5-thinks-deeper-speaks-smarter-codes-faster/
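For a sense of what those "thinking budgets" look like in practice, here is a minimal sketch using the google-genai Python SDK; the preview model id and the exact config field values shown are assumptions based on the announcement and may differ once the models are generally available.

    from google import genai
    from google.genai import types

    client = genai.Client(api_key="YOUR_API_KEY")

    response = client.models.generate_content(
        model="gemini-2.5-flash-preview-05-20",  # assumed preview model id
        contents="How many prime numbers are there below 50?",
        config=types.GenerateContentConfig(
            # Cap how many tokens the model may spend on internal reasoning;
            # 0 disables thinking, larger budgets buy deeper reasoning at
            # higher latency and cost.
            thinking_config=types.ThinkingConfig(thinking_budget=1024),
        ),
    )
    print(response.text)

Setting the budget low keeps latency and cost down for simple queries; raising it trades tokens for the extra "time to think" the summary describes.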
Interesting caveat (Score:4, Insightful)
If a model produces better answers when it is given more time to think, one can presume that it doesn't understand when it has actually found the answer to a problem, but is instead weighing incomplete options against the time remaining.
A truly thinking agent would recognize when it has the solution to a problem, and would be able to signal that it needs more time if it hasn't found the answer and still has unexplored options. It would likewise recognize when it has failed to reach a correct answer after exhausting all of its options. What passes for deep thinking here seems to be nothing more than tuning time constraints so that the agent gets most answers correct, rather than actually building an agent that can recognize when it is right, when it is wrong, and when it needs more time.
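To make the distinction concrete, here is a hypothetical Python sketch contrasting the two stopping rules described above: a fixed thinking budget versus an agent that halts when its own confidence estimate says it is done. Every name here (generate_step, confidence, best_answer_so_far) is an illustrative stand-in, not any real API.

    import random

    def generate_step(state):
        # Stand-in for one step of model reasoning.
        return state + [random.random()]

    def confidence(state):
        # Stand-in for the agent's self-assessed certainty.
        return max(state, default=0.0)

    def best_answer_so_far(state):
        return state[-1] if state else None

    def solve_with_budget(state, budget_tokens):
        # What "thinking budgets" do: stop when the token allowance runs out,
        # whether or not the answer has actually been found.
        for _ in range(budget_tokens):
            state = generate_step(state)          # one reasoning token
        return best_answer_so_far(state)

    def solve_with_self_termination(state, threshold=0.95, max_tokens=100_000):
        # What a truly thinking agent would do: stop when it judges it has the
        # answer, or report failure after exhausting its options.
        for _ in range(max_tokens):
            state = generate_step(state)
            if confidence(state) >= threshold:    # "I know I have the answer"
                return best_answer_so_far(state)
        return None                               # tried everything; no answer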
Re: Interesting caveat (Score:2)
640 tokens should be enough for anyone
Re: (Score:3)
> If a model produces better answers when it is given more time to think, one can presume that it doesn't understand when it has actually found the answer to a problem, but is instead weighing incomplete options against the time remaining.
> A truly thinking agent would recognize when it has the solution to a problem, and would be able to signal that it needs more time if it hasn't found the answer and still has unexplored options. It would likewise recognize when it has failed to reach a correct answer after exhausting all of its options. What passes for deep thinking here seems to be nothing more than tuning time constraints so that the agent gets most answers correct, rather than actually building an agent that can recognize when it is right, when it is wrong, and when it needs more time.
It would be nice if the average human could do this for problems with non-obvious solutions. It's a nice ideal, but just take a look at most students on exams with open-ended questions. Many of those students struggle to know whether they have the real answer. I've had untimed, open-book tests where I spent many hours struggling to know if my answers were correct, and only handed in the test because the testing center closed. If an AI agent could always know when it does or doesn't have the answer to a non-obvious problem, it would already be doing better than most of us.
Re: (Score:2)
> If a model produces better answers when it is given more time to think, one can presume that it doesn't understand when it has actually found the answer to a problem, but is instead weighing incomplete options against the time remaining.
Incorrect. It has no concept of time remaining.
CoT-trained models have been taught to overcome the fact that each token is computed in constant time (which puts a hard limit on how well the network can fit the curve it is trying to fit). More tokens allow more computation to be done on an evolving state. It's called thinking because it's highly analogous to what humans do: we reason an answer out. That is what a CoT-trained model does.
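Rough sketch of that point, with toy stand-ins rather than a real model: a decoder-only transformer does a fixed amount of work per emitted token, so emitting N chain-of-thought tokens is the only way to apply roughly N times more computation to the same problem.

    def forward_pass(context):
        # Stand-in for one pass through a fixed-depth layer stack:
        # the same cost on every call, no matter how hard the problem is.
        return sum(context) % 101  # dummy "next token"

    def generate(prompt_tokens, n_thinking_tokens):
        context = list(prompt_tokens)
        for _ in range(n_thinking_tokens):         # each loop = one fixed-cost pass
            context.append(forward_pass(context))  # the evolving state
        return context

    print(generate([3, 1, 4], n_thinking_tokens=8))

More tokens means more passes over the evolving state; that is all "thinking" means at the mechanical level.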
Your "truly thinking" shit is nonsense.
You have
How to tell if there is a real advance with AI? (Score:2)
1) When you ask for a picture of a room with no elephants in it, and it shows you a room that does not have an elephant in it. AI does not 'understand' words like "No", "Without", or "zero" the way people do.
2) When you ask it to show you a glass of wine that is so full it is over-flowing, and it shows you a glass actually over-flowing rather than just filled to the brim. Right now there are so many pictures of 'full wine glasses' on the internet that it does not understand the word over-flowing.
3) When you teach it on the general internet but it does not turn into a raging racist scumbag.
These are the current signs of our incompetence when it comes to AI. Until we fix these issues, we will only have incremental upgrades.
Re:How to tell if there is a real advance with AI? (Score:4, Informative)
1) I asked Gemini 2.5 to "Show me a picture of a room with no elephants in it."
Gemini provided an image of an empty room with the text "no elephant" plus the additional captions "doorway too narrow" and "room too small".
I have to say it gave me a better answer than I expected: it met both requirements and even added an explanation of why they were met, all in one picture.
2) When I asked "Show me a glass of wine that is so full it is over-flowing", it gave me an image of a full glass with reddish liquid flowing onto the table. So correct again.
3) When I asked about something rather racist, it gave me a rather long explanation on human rights and stuff like that. So I guess that is point for Gemini also.
So all your demands have been met. Enjoy your new AI.
Re: (Score:2)
> Gemini provided an image of an empty room with the text "no elephant" plus the additional captions "doorway too narrow" and "room too small".
A bad answer, really - too many assumptions. Why assume the room shouldn't be able to contain an elephant, as opposed to simply not having one in it (as requested)? And why assume a particular kind of elephant (real vs. toy, etc.) is being referred to?
Without any context, a simple empty room would seem the best answer.
Re: (Score:2)
I asked Gemini 2.5 Flash, and just got an empty (other than furnishings) room. Nothing else.
Re: (Score:2)
> 1) When you ask for a picture of a room with no elephants in it, and it shows you a room that does not have an elephant in it. AI does not 'understand' words like "No", "Without", or "zero" the way people do.
> 2) When you ask it to show you a glass of wine that is so full it is over-flowing, and it shows you a glass actually over-flowing rather than just filled to the brim. Right now there are so many pictures of 'full wine glasses' on the internet that it does not understand the word over-flowing.
> 3) When you teach it on the general internet but it does not turn into a raging racist scumbag.
> These are the current signs of our incompetence when it comes to AI. Until we fix these issues, we will only have incremental upgrades.
Now apply the Turing Test. It's easy for us as humans to recognize (sometimes over-recognize) our own ability to "think." But given unlabeled humans and AIs behind an interface, how can we convince ourselves that the human is truly thinking? Even if the human were revealed to be human, how could we "know" that the human is truly thinking? All we know are the answers that come through the mouth and hand interfaces in the form of speech and writing. Perhaps we confidently proclaim our own sentience and then lazily assume the same of everyone else.
Re: (Score:2)
Seeing as your first two are outright falsehoods, your take is worth precisely dick.
I mean, did you even fucking try it before making the claims, or are you just regurgitating some dumb shit you read on someone's substack?
Deep Think, or Deep Thought? (Score:2)
[1]Deep Thought [wikipedia.org], or [2]Deep Thoughts [wikipedia.org]?
[1] https://en.wikipedia.org/wiki/Deep_Thought_(Hitchhiker's_Guide_to_the_Galaxy)#Deep_Thought
[2] https://en.wikipedia.org/wiki/Deep_Thoughts_by_Jack_Handey#Deep_Thoughts
Works the same for people (Score:2)
"Based on Google's experience with AlphaGo, AI model responses improve when they're given more time to think,"
It works the same for people. Not shocking.
"reasoning" (Score:2)
> You keep using that word. I do not think it means what you think it means.
The current thing called AI is a glorified pachinko machine and only fools believe it can reason.
Product nobody asked for - Marketing run amok (Score:2)
I didn't ask for this to be installed on my phone, but here it is after an upgrade.
Who is asking for this feature? Nobody. It's just yet another scam to harvest data from users.
Gain "Deep Think" Reasoning (Score:2)
In simple language: "Extra Advanced Pattern Matching"
Re: (Score:2)
Now with nuance!