Advanced Version of Gemini With Deep Think Officially Achieves Gold-Medal Standard at the International Mathematical Olympiad (deepmind.google)
- Reference: 0178432740
- News link: https://science.slashdot.org/story/25/07/21/198231/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad
- Source link: https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/
> The International Mathematical Olympiad is the world's most prestigious competition for young mathematicians, and has been held annually since 1959. Each country taking part is represented by six elite, pre-university mathematicians who compete to solve six exceptionally difficult problems in algebra, combinatorics, geometry, and number theory. Medals are awarded to the top half of contestants, with approximately 8% receiving a prestigious gold medal.
>
> Recently, the IMO has also become an aspirational challenge for AI systems as a test of their advanced mathematical problem-solving and reasoning capabilities. Last year, Google DeepMind's combined AlphaProof and AlphaGeometry 2 systems achieved the silver-medal standard, solving four out of the six problems and scoring 28 points. Making use of specialist formal languages, this breakthrough demonstrated that AI was beginning to approach elite human mathematical reasoning.
>
> This year, we were amongst an inaugural cohort to have our model results officially graded and certified by IMO coordinators using the same criteria as for student solutions. Recognizing the significant accomplishments of this year's student-participants, we're now excited to share the news of Gemini's breakthrough performance. An advanced version of Gemini Deep Think solved five out of the six IMO problems perfectly, earning 35 total points, and achieving gold-medal level performance.
[1] https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/
AI Training (Score:2, Insightful)
I have to wonder. Did "Gemini Deep Think" solve the problems or simply regurgitate the answer from the billions of sucked up webpages, math research papers, etc. used to train the model? Actual competitors don't have the complete history of [1]https://math.stackexchange.com... [stackexchange.com] at their fingertips.
[1] https://math.stackexchange.com/
Re:AI Training (Score:4, Informative)
I'm a mathematician, so I may have some expertise here: humans spend years training for the IMO, so they are functionally sucking all of that up too. And the IMO problems themselves vary a lot. Even extremely bright people, including professional mathematicians, would have trouble with some IMO problems. Mere regurgitation is insufficient to solve them.
Re: AI Training (Score:1)
Thank you
Re: (Score:3)
Yes, they are high school students, but the students who get gold medals are students who frequently started studying for the IMO in 9th or 10th grade, and sometimes even earlier. And yes, mathematicians aren't always going to see the trick that a given problem relies on. And it is true that IMO problems often involve tricks or approaches that one can study for. The problems are not research mathematics, and a lot of very good mathematicians never did well at the IMO as a young person. At the same time, so
Re: (Score:2)
But what about putting them into a Computer Algebra system? We have had these for decades now. In fact, I used one when I started my CS studies 35 years ago.
The very point of such a competition is to have a human do it, not a machine.
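For contrast, it may help to see what a classical CAS actually does: deterministic symbolic manipulation (solving equations, simplifying expressions), not the open-ended proof-writing that IMO problems demand. A minimal sketch, assuming the third-party sympy library is installed:

```python
# Minimal illustration of what a computer algebra system (CAS) does:
# mechanical symbolic manipulation, not proof-writing.
# Assumes sympy is available (pip install sympy).
from sympy import symbols, solve, simplify

x = symbols('x')

# A CAS excels at solving equations symbolically...
roots = solve(x**2 - 5*x + 6, x)        # roots of x^2 - 5x + 6

# ...and at simplifying expressions.
expr = simplify((x**2 - 1) / (x - 1))   # reduces to x + 1

print(roots, expr)
```

Note the gap: none of this produces the kind of multi-page argued proof that IMO graders score, which is the part the Gemini result is about.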
Re: (Score:3)
You could ask the same question about the contestants. They solved these problems because they trained. They recognized techniques and ideas similar to other problems they encountered, and they applied them to a new problem.
A computer that can do math? (Score:3)
What will they think of next?
Re: (Score:2)
Extremely unlikely. The Riemann Hypothesis has been thought about by more mathematicians than almost any other serious open problem. LLMs have some limited "creativity" in the sense that they can try to combine existing techniques, but they aren't really capable of inventing entire new techniques out of whole cloth, and it seems pretty clear that fundamentally new insights are needed to resolve RH.
And other AI engines? (Score:1)
I have in mind the latest Grok 4 Heavy: was it tested? I didn't see any information about other engines taking this test for comparison.
So? (Score:2)
I have no doubt that Maple or any other decent Computer Algebra system could have done the same ... 30 years ago. Or Wolfram Alpha.
This is a completely meaningless stunt. The only purpose is to deceive the stupid about what these systems can do, or rather cannot do.
NOTHING OFFICIAL AT ALL (Score:1)
The IMO is pretty pissed off at Google because the results are embargoed from publication until the 28th of July.
Apparently Google did their own grading and claimed the gold medal. The IMO is not happy.
[1]https://arstechnica.com/ai/202... [arstechnica.com]
But hey, good on Goog for demonstrating some level of success, whether IMO gives them a gold medal or not.
Shame on them for violating the publication embargo, but I haven't read that contract or those T&Cs, so I can only judge by what the IMO says and what Goog says.
[1] https://arstechnica.com/ai/2025/07/openai-jumps-gun-on-international-math-olympiad-gold-medal-announcement/
Re: (Score:3)
Oh, so you must have found it easy when you got your gold medal?
Re: (Score:2)
The point is we have a myriad of tests that are hard for humans but don't necessarily translate to anything vaguely useful. In academics, a lot of tests are only demanding of reasoning ability because humans have limited memory. Computers short on actual "reasoning" largely make up for it by having mind-boggling amounts of something more akin to recall than reasoning (it's actually something a bit weirder, but as analogies go, recall is the closest).
It's kind of like bragging that your RC boat could ge