News: 0179328864


Gemini AI Solves Coding Problem That Stumped 139 Human Teams At ICPC World Finals (arstechnica.com)

(Wednesday September 17, 2025 @05:20PM (BeauHD) from the artificial-brain-power dept.)


An anonymous reader quotes a report from Ars Technica:

> Like the rest of its Big Tech cadre, Google has spent lavishly on developing generative AI models. Google's AI can clean up your text messages and summarize the web, but the company is constantly looking to prove that its generative AI has true intelligence. The International Collegiate Programming Contest (ICPC) helps make the point. Google says Gemini 2.5 participated in the 2025 ICPC World Finals, [1]turning in a gold medal performance. According to Google, this marks "a significant step on our path toward artificial general intelligence."

>

> Every year, thousands of college-level coders participate in the ICPC event, facing a dozen deviously complex coding and algorithmic puzzles over five grueling hours. This is the largest and longest-running competition of its type. To compete in the ICPC, Google connected Gemini 2.5 Deep Think to a remote online environment approved by the ICPC. The human competitors were given a head start of 10 minutes before Gemini began "thinking."

>

> According to Google, it did not create a freshly trained model for the ICPC like it did for the similar International Mathematical Olympiad (IMO) earlier this year. The Gemini 2.5 AI that participated in the ICPC is the same general model that we see in other Gemini applications. However, it was "enhanced" to churn through thinking tokens for the five-hour duration of the competition in search of solutions. At the end of the time limit, Gemini managed to get correct answers for 10 of the 12 problems, which earned it a gold medal. Only four of 139 human teams managed the same feat. "The ICPC has always been about setting the highest standards in problem-solving," said ICPC director Bill Poucher. "Gemini successfully joining this arena, and achieving gold-level results, marks a key moment in defining the AI tools and academic standards needed for the next generation."

Gemini's solutions are [2]available on GitHub.



[1] https://arstechnica.com/google/2025/09/google-gemini-earns-gold-medal-in-icpc-world-finals-coding-competition/

[2] https://github.com/google-deepmind/gemini_icpc2025



It's great at solving small hard problems. (Score:1)

by Seven Spirals ( 4924941 )

It's terrible at creating big ugly applications. Again, I'd assert this guy is right. [1]Where's the Shovelware? Why AI Coding Claims Don't Add Up [substack.com].

[1] https://mikelovesrobots.substack.com/p/wheres-the-shovelware-why-ai-coding

Re: (Score:3)

by ndsurvivor ( 891239 )

My conclusion from reading the headline is that most young coders don't know bits from bytes and couldn't add 0110 to 0010 in their heads if their lives depended on it.
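(For the record, that sum is 1000 in binary, i.e. 6 + 2 = 8. A one-line sanity check in Python, purely illustrative:)

    # Binary addition: 0110 (6) + 0010 (2) = 1000 (8)
    a, b = 0b0110, 0b0010
    print(f"{a:04b} + {b:04b} = {a + b:04b}")  # prints: 0110 + 0010 = 1000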

Re: (Score:3)

by Mr. Barky ( 152560 )

0120 :)

Re: (Score:2)

by ndsurvivor ( 891239 )

:-) 420? *giggles*

Yeah right (Score:4, Funny)

by backslashdot ( 95548 )

Give it UI problems.

Computers are fast. News at 11. (Score:2)

by SpinyNorman ( 33776 )

> However, it was "enhanced" to churn through thinking tokens for the five-hour duration of the competition in search of solutions.

If you read the comments on the linked story, one is from a competitor in a prior year's competition who notes that the competition always has a "time sink" problem that smart humans will steer clear of unless they have solved everything else.

Apparently it took Gemini 30 minutes to solve this one time-sink problem, "C". The article doesn't say what hardware Gemini was running on.

A lot of training here - still impressive (Score:4, Insightful)

by TheMiddleRoad ( 1153113 )

The general model has been thoroughly trained on these types of problems. Then they tweaked it for the specific challenge. Then they ran it with tons of processing power, more than any normal person gets. And all of this was for very, very, very specific types of coding problems.

[1]https://worldfinals.icpc.globa... [worldfinals.icpc.global]

It's not intelligence. It's processing.

[1] https://worldfinals.icpc.global/problems/2025/finals/problems/A-askewedreasoning.pdf

Sounds like an advert (Score:2)

by wakeboarder ( 2695839 )

to me.

Impressive. (Score:2)

by Gravis Zero ( 934156 )

The coding part of this isn't of any particular interest; what is interesting is that it solved a complex logic optimization problem: [1]take a look [worldfinals.icpc.global]

That said, this seems more like the kind of problem you would throw at mathematicians. While most real-world applications are unlikely to have neat and tidy solutions, optimization problems like this really do exist. Being able to get quick solutions to these kinds of complex optimization problems would radically reduce the number of people needed to solve such a problem.

[1] https://worldfinals.icpc.global/problems/2025/finals/problems/C-brideofpipestream.pdf

Re: (Score:2)

by viperidaenz ( 2515578 )

Looks like all you need to do to solve it is build a simple model, run all 600 million combinations of input through it, and the answer is the one with the highest percentage.

Perfect job for a computer with massive compute resources.

The other contestants probably don't have the computing power to do that, so they would have to solve it by figuring out a mathematical proof. Or skip it and solve the other problems instead.
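A minimal sketch of that brute-force idea, assuming the problem really does reduce to scoring every combination of discrete inputs against a fixed model. The domains and the toy objective below are invented stand-ins, not the actual ICPC problem:

    import itertools

    def brute_force(domains, evaluate):
        # Enumerate every combination of discrete settings, score each
        # one with the model, and keep the best result seen so far.
        best_score, best_settings = float("-inf"), None
        for settings in itertools.product(*domains):
            score = evaluate(settings)
            if score > best_score:
                best_score, best_settings = score, settings
        return best_score, best_settings

    # Toy stand-in: 10^4 combinations and an arbitrary objective, since
    # the real model and its ~600 million inputs are not reproduced here.
    score, settings = brute_force([range(10)] * 4, lambda s: sum(s) - max(s))
    print(score, settings)

At 600 million evaluations, that inner loop is trivial for a datacenter and hopeless by hand, which is the point.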

glorified chess computer (Score:2)

by flibbidyfloo ( 451053 )

Computers have been able to beat 99% of the population at chess for quite a while.

Beating a bunch of college-level coders at coding isn't any more a sign of "general intelligence" than was my Amiga 500 being able to checkmate me 35 years ago.

Correction (Score:1)

by greytree ( 7124971 )

Headline: "Gemini AI Solves Coding Problem That Stumped 139 Human Teams"

Story: "After 677 minutes, Gemini 2.5 Deep Think had 10 correct answers, securing a second-place finish among the university teams."

Corrected headline: "Gemini AI Comes Second To Human Team In Coding Problems"

Re: (Score:2)

by evanh ( 627108 )

You should've led with the last sentence.

Re: (Score:2)

by jythie ( 914043 )

So it had 10 right answers... how many wrong, and could it tell the difference?

Re: (Score:2)

by Pascoea ( 968200 )

> No one can explain how any given LLM works, not in any detail. For all practical purposes, it's a black box that generates weird quasi-intelligent results.

So closer to humans than we think?

Burned through tokens at an enhanced rate? (Score:2)

by glowworm ( 880177 )

I wonder if the person driving it had a PhD to steer it, guide it, and fix up code hallucinations as it "burned through tokens" for five hours to achieve this college-level programming task.

I also wonder just how many tokens were burned? That is a cost thing after all.

And if it was not actively vibe-steered by an experienced coder supervising as it burnt tokens for 5 hours at an enhanced rate, I would be impressed only if I saw the complete pre-prompting, and probably how many kWh of power and litres of water it also consumed.
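As a back-of-envelope on the token question, with every number below loudly invented (Google has published neither the throughput nor the pricing for this run):

    # All figures are assumptions for illustration only; none come from
    # Google's announcement or the Ars Technica article.
    hours = 5
    tokens_per_second = 1_000        # assumed sustained generation rate
    usd_per_million_tokens = 10.0    # assumed list-style price
    tokens = hours * 3600 * tokens_per_second       # 18,000,000 tokens
    cost = tokens / 1e6 * usd_per_million_tokens    # about $180
    print(f"{tokens:,} tokens, ~${cost:,.0f}")

The real internal compute cost of a parallel "Deep Think" run is presumably far higher; the point is only that whatever the token count was, it has a price tag.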
