

Google Claims Gemma 3 Reaches 98% of DeepSeek's Accuracy Using Only One GPU

(Wednesday March 12, 2025 @11:30PM (BeauHD) from the two-can-play-that-game dept.)


Google says its new open-source AI model, Gemma 3, [1]achieves nearly the same performance as DeepSeek AI's R1 while using just one Nvidia H100 GPU, compared to an estimated 32 for R1. ZDNet reports:

> Using "Elo" scores, a common measurement system used to rank chess players and athletes, Google claims Gemma 3 comes within 98% of the score of DeepSeek's R1: 1338 versus R1's 1363. That means R1 is superior to Gemma 3. However, based on Google's estimate, the search giant claims that it would take 32 of Nvidia's mainstream "H100" GPU chips to achieve R1's score, whereas Gemma 3 uses only one H100 GPU.

>

> Google's balance of compute and Elo score is a "sweet spot," the company claims. In a [2]blog post , Google bills the new program as "the most capable model you can run on a single GPU or TPU," referring to the company's custom AI chip, the "tensor processing unit." "Gemma 3 delivers state-of-the-art performance for its size, outperforming Llama-405B, DeepSeek-V3, and o3-mini in preliminary human preference evaluations on LMArena's leaderboard," the blog post relates, referring to the Elo scores. "This helps you to create engaging user experiences that can fit on a single GPU or TPU host."

>

> Google's model also tops Meta's Llama 3's Elo score, which it estimates would require 16 GPUs. (Note that the numbers of H100 chips used by the competition are Google's estimate; DeepSeek AI has only disclosed an example of using 1,814 of Nvidia's less-powerful H800 GPUs to serve answers with R1.) More detailed information is provided in a developer [3]blog post on HuggingFace, where the Gemma 3 repository is offered.
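A quick sanity check of the figures in the summary, sketched in Python. Dividing the Elo scores reproduces the quoted "98%", and the standard Elo expected-score formula gives a feel for how small a 25-point gap is in practice. The two scores come from the article; the rest is just illustrative arithmetic:

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of player A against player B."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

gemma3, r1 = 1338, 1363

# The "98%" in the headline is a straight ratio of the two scores.
print(f"score ratio: {gemma3 / r1:.1%}")  # 98.2%

# Read as a win probability, the 25-point gap is modest.
print(f"R1 expected score vs Gemma 3: {expected_score(r1, gemma3):.3f}")  # 0.536
```

In other words, the ratio framing makes the gap look tiny, while the Elo formula says R1 would be expected to "win" a head-to-head preference comparison only about 53.6% of the time.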



[1] https://www.zdnet.com/article/google-claims-gemma-3-reaches-98-of-deepseeks-accuracy-using-only-one-gpu/

[2] https://blog.google/technology/developers/gemma-3/

[3] https://huggingface.co/blog/gemma3



I was getting all excited (Score:1)

by LondoMollari ( 172563 )

I was getting all excited when I thought the article was talking about Gemma Chan… turns out it's just another generic AI bot.

Lol, Google has AI (Score:1)

by ebunga ( 95613 )

Please clap.

I think.. impressive (Score:2)

by ndsurvivor ( 891239 )

We all saw the AI revolution coming. It will get more and more efficient. However, I would prefer that they do not suck all of the juice out of our electrical system.

Re: (Score:2)

by ihadafivedigituid ( 8391795 )

Get a Mac and run it at home. My Macbook Pro consumes about 60 watts when doing inference, which is way less than the usual GPUs.

That's amazing! (Score:2)

by Kiliani ( 816330 )

Does this mean AI only needs a single brain cell??

Re:That's amazing! [The third attempt is better!] (Score:2)

by shanen ( 462549 )

I think you're going for funny, and on that basis it deserved to be FP. However, I think the significance of the story is pretty close to null. LOTS of room for optimization, though the claim of the second-system effect is that the biggest improvement comes in the second round.

personal AI (Score:2)

by ZipNada ( 10152669 )

AI at that level these days has generally been something on the cloud that you pay fees to access. And it presumably has the entire history of your interaction with it, which is troubling. This improvement in efficiency (assuming true) makes it a lot easier for a modest-size corporation to contemplate owning the physical AI. It will result in faster proliferation of these machines. Let's hope we survive it.

Re: (Score:2)

by ndsurvivor ( 891239 )

I agree. Google is the example of a utility that spies on you. I would like an AI that is just mine. I think we are there.

Re: (Score:2)

by DamnOregonian ( 963763 )

We are there, but it's still a bit pricey, mostly due to VRAM requirements.

There are several models that run well on machines with 128GB of VRAM, which is a budding but existing market.

Re: (Score:2)

by DamnOregonian ( 963763 )

Better. An H100 has 80GB of VRAM.

Today, you can do that on a Mac, and soon you'll be able to do it with a Strix Halo. Probably not long from now, an Intel option too (assuming they can read the room).

This means it's not just corporations: a person can do it.

Re: (Score:2)

by ndsurvivor ( 891239 )

A sucker? Reminder that DuckDuckGo has anonymous links to AIs.

It comes within 98% of their score? (Score:1)

by outsider007 ( 115534 )

That's really not much of a claim, is it?

Come back when it's running on my iPhone6 (Score:2)

by thesjaakspoiler ( 4782965 )

Who has an H100 lying around?

Re: (Score:2)

by ndsurvivor ( 891239 )

I remember a time when people said: "who has a Z80 microprocessor lying around?"... well, not really, but you get the point.

Re: (Score:2)

by DamnOregonian ( 963763 )

/me glances over at his calculator

/me slowly raises hand

Learned my first assembly language on that bad boy!

Re: (Score:2)

by ndsurvivor ( 891239 )

Of course. My respect to you.

Re: (Score:2)

by DamnOregonian ( 963763 )

Ya, H100 is still a steep ask. However, there are machines with more than an H100 worth of VRAM you can get your hands on.

M2 Max (Mac Studio, MacBook Pro), M2 Ultra (Mac Studio), M3 Max (Mac Studio, MacBook Pro), M3 Ultra (Mac Studio), M4 Max (MacBook Pro).

Soon Strix Halo will be available if AMD is your thing.

On my M4 Max, I'm getting ~9t/s at FP16 and ~16t/s at Q8_0. Nice and usable.

At lower quantizations (Q4, etc.) you could run it on top-of-the-line discrete cards (24GB VRAM, etc.).
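For a rough sense of why the quantization levels above map to those machines, here's a back-of-the-envelope sketch. It assumes Gemma 3's largest 27B-parameter size and llama.cpp-style Q8_0/Q4_0 block formats; the bytes-per-weight figures are approximations, and real usage adds KV cache and runtime overhead on top of the weights:

```python
PARAMS = 27e9  # Gemma 3 27B (weights only)

bytes_per_weight = {
    "FP16": 2.0,      # half precision
    "Q8_0": 34 / 32,  # llama.cpp Q8_0: 32 weights in a 34-byte block
    "Q4_0": 18 / 32,  # llama.cpp Q4_0: 32 weights in an 18-byte block
}

for fmt, bpw in bytes_per_weight.items():
    gib = PARAMS * bpw / 2**30
    print(f"{fmt}: ~{gib:.0f} GiB")  # FP16 ~50, Q8_0 ~27, Q4_0 ~14
```

So FP16 weights (~50 GiB) need an H100 or a high-memory Mac, Q8_0 (~27 GiB) fits comfortably in 32GB+ of unified memory, and Q4_0 (~14 GiB) squeezes onto a 24GB discrete card, consistent with the comment above.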

Re: (Score:2)

by ndsurvivor ( 891239 )

It does not seem to be the RAM that makes AI work. It seems to be the trillions of integer calculations the chip can do in a second: a statistical calculation as to what the next token should be.
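The "statistical calculation as to what the next token should be" can be sketched in a few lines: the model emits one score (logit) per vocabulary token, softmax turns those scores into probabilities, and a token is sampled. The four-word vocabulary and logits below are made-up toy values, not real model output:

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["the", "cat", "sat", "mat"]       # hypothetical tiny vocabulary
logits = [2.0, 1.0, 0.5, -1.0]             # hypothetical model scores

probs = softmax(logits)
next_token = random.choices(vocab, weights=probs)[0]  # sample one token
print(next_token)
```

The expensive part in a real model is computing the logits, which at generation time is dominated by streaming billions of weights through the chip, so memory capacity and bandwidth matter alongside raw arithmetic throughput.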
