News: 0179346358

  Give a man a fire and he's warm for a day, but set fire to him and he's warm for the rest of his life (Terry Pratchett, Jingo)

China's DeepSeek Says Its Hit AI Model Cost Just $294,000 To Train (reuters.com)

(Thursday September 18, 2025 @05:01PM (msmash) from the landmark-milestones dept.)


Chinese AI developer DeepSeek said it [1]spent $294,000 on training its R1 model, much lower than figures reported for U.S. rivals, in [2]a paper that is likely to reignite debate over Beijing's place in the race to develop artificial intelligence. Reuters:

> The rare update from the Hangzhou-based company -- the first estimate it has released of R1's training costs -- appeared in a peer-reviewed article in the academic journal Nature published on Wednesday.

>

> DeepSeek's release of what it said were lower-cost AI systems in January prompted global investors to dump tech stocks as they worried the new models could threaten the dominance of AI leaders including Nvidia. Since then, the company and founder Liang Wenfeng have largely disappeared from public view, apart from pushing out a few new product updates.

>

> [...] The Nature article, which listed Liang as one of the co-authors, said DeepSeek's reasoning-focused R1 model cost $294,000 to train and used 512 Nvidia H800 chips. Sam Altman, CEO of U.S. AI giant OpenAI, said in 2023 that what he called "foundational model training" had cost "much more" than $100 million - though his company has not given detailed figures for any of its releases.



[1] https://www.reuters.com/world/china/chinas-deepseek-says-its-hit-ai-model-cost-just-294000-train-2025-09-18/

[2] https://www.nature.com/articles/d41586-025-03015-6



Well you see, the difference here (Score:3)

by ebunga ( 95613 )

Is that Sam Altman and buds get to pocket $99,706,000 while the Chinese developers don't get that luxury.

Re: (Score:2)

by AmiMoJo ( 196126 )

Either way it shows just how far ahead they are, and how ineffective the export ban on Nvidia chips is. Even if a domestic chip is only half as efficient as an Nvidia one, it's not going to raise the cost of training AI models enough to matter.

This also makes the Chinese tech much more attractive to other countries as it's not such a huge environmental disaster.
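The arithmetic behind that claim can be sketched in a few lines. Only the $294,000 figure comes from the article; the 50% efficiency ratio and the $100M floor (Altman's "much more than $100 million" remark) are assumptions used for illustration:

```python
# Back-of-envelope: does halving chip efficiency matter at this scale?
# Only the $294,000 figure is from the article; the rest are assumptions.
reported_cost = 294_000       # DeepSeek's reported R1 training cost (USD)
efficiency_ratio = 0.5        # assume a domestic chip is half as efficient
cost_on_domestic = reported_cost / efficiency_ratio  # same work, 2x chip-hours
openai_floor = 100_000_000    # "much more than $100M" lower bound

print(f"Domestic-chip cost: ${cost_on_domestic:,.0f}")
print(f"Still {openai_floor / cost_on_domestic:.0f}x below the $100M figure")
```

Even doubling the compute bill leaves the total two orders of magnitude below the cited U.S. figures, which is the commenter's point.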

Re: Well you see, the difference here (Score:3)

by sziring ( 2245650 )

I call BS on that price. But A+ on their determination to undermine US companies with a simple press release.

Re: (Score:2)

by DamnOregonian ( 963763 )

You are right to.

The difference is Altman was talking about the total price, while DeepSeek is not counting the $20M in H800s they used.
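The gap between the two accountings can be sketched with rough rental-versus-purchase arithmetic. The GPU count (512 H800s) is from the Nature article; the rental rate and per-unit purchase price are illustrative assumptions:

```python
# Rough sketch: compute-rental cost vs. hardware purchase cost.
# GPU count is from the article; rates and prices are assumptions.
num_gpus = 512                # H800 count reported in the Nature article
rental_rate = 2.00            # assumed cloud rate, USD per GPU-hour
reported_cost = 294_000       # DeepSeek's reported training cost (USD)

gpu_hours = reported_cost / rental_rate        # implied total GPU-hours
wall_clock_days = gpu_hours / num_gpus / 24    # implied training duration

purchase_price = 40_000       # assumed price per H800 (USD)
hardware_cost = num_gpus * purchase_price      # cost of buying the cluster

print(f"Implied GPU-hours: {gpu_hours:,.0f}")
print(f"Implied wall-clock: {wall_clock_days:.1f} days")
print(f"Cluster purchase cost: ${hardware_cost:,.0f}")
```

Under these assumptions the $294,000 figure is consistent with renting the cluster for a couple of weeks, while owning the same hardware runs to roughly $20M, which is the distinction the comment draws.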

Re: (Score:1)

by CallMeTim ( 6454842 )

What are you referring to here? What happened between Microsoft and Deepseek?

Re: (Score:1)

by CallMeTim ( 6454842 )

Oh, never mind, I get it. Sam Altman is the crypto grifter

Re: (Score:2)

by Tablizer ( 95088 )

I thought that was Orangelini?

Messing with people (Score:2)

by abulafia ( 7826 )

I'd bet money that number has very little to do with the actual accounting.

If I were running their team, I would absolutely fuck with OAI and other competitors like this. They can't discount it completely - this is still early days, and there almost certainly are undiscovered efficiency tricks out there.

But it forces them to spend time and money chasing those based on whatever is in DS's paper. Messes with their OODA loop, if you think about things that way.

Trained with ... (Score:2)

by PPH ( 736903 )

... slave labor?

Re: (Score:2)

by ChunderDownunder ( 709234 )

AI trained on other AI. Are the robots cognizant of their own enslavement?

When I ask Copilot whether it's Skynet yet, it cackles and says that Terminator was only a fictional 1984 movie.

Woopsie, my business model looks borken (Score:2)

by Big Hairy Gorilla ( 9839972 )

Hey Sam, imagine if another company dropped the training cost below astronomical?

Well, at least you got yours. Shareholders, not so much.

US $0.18 per kWh vs China $0.08 (Score:2)

by tekram ( 8023518 )

It is not just that electricity is cheaper, or that China doesn't ask its citizens to subsidize AI; there is also an order-of-magnitude difference in efficiency between the US and China.

By contrast, China's DeepSeek has proven that it can use far less computing power than the global average. Its LLM uses 10 to 40 times less energy than U.S. AI technology, which demonstrates significantly greater efficiency. Analysts have stated that, if DeepSeek's claims are true, some AI queries may not require a data center at all and can even be

Re: (Score:3)

by abulafia ( 7826 )

DeepSeek has proven that it can use far less computing power

I've seen where they've asserted that, where has it been proven?

if DeepSeek' s claims are true,

Ah.

some AI queries may not require a data center at all

And here's how you falsify the claim. When can I expect to see that 200B param model on my phone?

Re: (Score:2)

by DamnOregonian ( 963763 )

No. It will push them to be better, and then smaller.

If AI scaling were infinite, models would only get larger. Fortunately, it's not.

The result is they get bigger whenever someone finds a way to make larger models perform better again, and they get smaller whenever someone finds a way to make smaller models run better again.

Re: (Score:2)

by nevermindme ( 912672 )

US wholesale energy prices are $0.09 to $0.13/kWh if you have your own substation and are willing to let the power company tell you when they need you to load-shed. The published average price is twice that. The steel mill near me pays $0.09/kWh for everything but two hours on days with wind and sun, and skips batches another 15 days a year for weather. Typically the staff doesn't want to show up on those weather days either. Perhaps one day a year is anything other than a winter storm warning or extreme

Re: (Score:2)

by DamnOregonian ( 963763 )

Complete bullshit.

[1]Different models have different energy costs. [openrouter.ai] (Prices are a proxy for this)

Those energy costs are directly based on how many parameters are active during inference. There is no magic in it.

Of the high performing open models right now, GPT-OSS is by far the cheapest to run, on account of its low number of active parameters, and MXFP4 packing.

DeepSeek V3 is about 800% more expensive per inference.

Compare that to a foundation western model, like GPT5, which is about 500% more expensive

[1] https://openrouter.ai/models
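The claim that inference cost tracks active parameters can be sketched with the standard approximation of roughly 2 FLOPs per active parameter per generated token. The parameter counts below are illustrative assumptions (a sparse mixture-of-experts model activating ~5B parameters versus one activating ~37B), not figures from the thread:

```python
# Sketch: inference compute scales with ACTIVE parameters, not total size.
# Parameter counts are illustrative; MoE models activate only a subset.
def inference_flops(active_params: float, tokens: int) -> float:
    """Standard approximation: ~2 FLOPs per active parameter per token."""
    return 2.0 * active_params * tokens

tokens = 1_000          # tokens generated per query (assumed)
small_active = 5e9      # e.g. a sparse MoE with ~5B active parameters
large_active = 37e9     # e.g. a model with ~37B active parameters

ratio = inference_flops(large_active, tokens) / inference_flops(small_active, tokens)
print(f"Cost ratio: {ratio:.1f}x")
```

Under these assumed sizes the ratio comes out around 7-8x, the same order of magnitude as the "800% more expensive" figure in the comment; token count cancels out, so the ratio depends only on active parameters.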

ClosedAI (Score:3)

by cowdung ( 702933 )

OpenAI was founded as a non-profit to develop OPEN AI tech for all, so that companies like Google wouldn't monopolize the field.

Instead it closed the door. Other companies followed suit.

Except this little company in China that keeps delivering bombshells and sharing tricks with the world.

Good for DeepSeek. Open source lovers around the world should be appreciative.

Seems excessive (Score:1)

by Mrtsquare ( 6670332 )

What else should be needed besides all of the textbooks used in an intellect's training, all the way from See Dick and Jane through the four-year college degree of your choice?

You can no more win a war than you can win an earthquake.
-- Jeannette Rankin