Dodgy Huawei chips nearly sank DeepSeek's next-gen R2 model
- Reference: 1755201240
- News link: https://www.theregister.co.uk/2025/08/14/dodgy_huawei_deepseek/
- Source link:
Following the industry-rattling launch of DeepSeek R1 earlier this year, the Chinese AI darling faced pressure from government authorities to train the model's successor on Huawei's homegrown silicon, three unnamed sources have [1]told the Financial Times.
But after months of work and the help of an entire team of Huawei engineers, unstable chips, glacial interconnects, and immature software proved insurmountable for DeepSeek, which was apparently unable to complete a single successful training run.
The failure, along with data-labeling challenges, ultimately delayed the release of DeepSeek R2 as the company started anew, using Nvidia's H20 GPUs instead. The company has reportedly relegated Huawei's Ascend accelerators to inference duty.
The Register reached out to DeepSeek for comment but hadn't heard back at the time of publication.
Huawei's Ascend accelerators, in particular the Ascend 910C that powers the IT giant's CloudMatrix rack-scale compute [5]platform, have garnered considerable attention in recent months.
While we don't know exactly which revision of Huawei's chips DeepSeek was playing with, at least on paper, the Ascend 910C should have delivered better performance in training than Nvidia's H20.
The Ascend 910C offers more vRAM and more than twice the BF16 floating point performance, while falling slightly behind in memory bandwidth, something that's less of an issue for training than inference.
Despite this, training is a more complex endeavor than any single chip's spec sheet suggests: it involves distributing some of humankind's most computationally intensive workloads across tens of thousands of chips. If any one component fails, you have to start over from the last checkpoint.
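To illustrate what that rollback costs, here's a minimal, hypothetical PyTorch-style sketch, not DeepSeek's actual stack; the model, checkpoint path, and intervals are invented. The point is simply that any work done since the last periodic save is lost when a chip or node falls over, and the job resumes from that save.

```python
# Hypothetical sketch: periodic checkpointing in a training loop.
# A hardware fault anywhere in the cluster means rolling back to the
# last saved checkpoint and repeating everything since.
import os
import torch

CKPT = "checkpoint.pt"  # made-up path for illustration
model = torch.nn.Linear(1024, 1024)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

start_step = 0
if os.path.exists(CKPT):  # resume from the last good checkpoint
    state = torch.load(CKPT)
    model.load_state_dict(state["model"])
    opt.load_state_dict(state["optimizer"])
    start_step = state["step"] + 1

for step in range(start_step, 10_000):
    x = torch.randn(32, 1024)
    loss = model(x).pow(2).mean()  # stand-in for the real objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 1_000 == 0:  # periodic save: the only rollback point
        torch.save({"model": model.state_dict(),
                    "optimizer": opt.state_dict(),
                    "step": step}, CKPT)
    # If a chip or node dies here, everything since the last torch.save
    # is discarded and the whole job restarts from that checkpoint.
```

At the scale of tens of thousands of accelerators, the probability that something fails between checkpoints climbs quickly, which is why flaky chips and interconnects can stall a run entirely.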
For this reason, it is not uncommon for newcomers to the AI chip space to focus on inference, where the blast radius is far smaller, while they work out the kinks required to scale the tech up. Huawei is clearly moving in that direction with its CloudMatrix rack systems, which aim to simplify the deployment of large-scale training clusters based on its chips.
Given how heavily DeepSeek optimized its training stack around Nvidia's hardware, going so far as to train much of the original V3 model on which R1 was based at FP8, a switch to Ascend would have required some heavy retooling. On top of requiring a completely different software stack, Huawei's Ascend accelerators don't support FP8, so DeepSeek would have had to fall back on more memory-intensive 16-bit data types.
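A back-of-envelope sketch of that penalty: taking DeepSeek V3's published 671 billion total parameters, moving weights from an 8-bit to a 16-bit type roughly doubles their memory footprint before activations, gradients, or optimizer state are even counted. The figures below are only about the weights themselves and are purely illustrative.

```python
# Rough arithmetic: weight memory at 1 byte/param (FP8) vs 2 bytes/param (BF16).
# 671e9 is DeepSeek V3's published total parameter count; nothing else here
# reflects DeepSeek's actual training setup.
PARAMS = 671e9

def weight_gib(params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, ignoring activations and optimizer state."""
    return params * bytes_per_param / 2**30

print(f"FP8  weights: {weight_gib(PARAMS, 1):,.0f} GiB")   # ~625 GiB
print(f"BF16 weights: {weight_gib(PARAMS, 2):,.0f} GiB")   # ~1,250 GiB
```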
Even considering the geopolitical statement that training a frontier model on homegrown Chinese silicon made, this strikes us as an odd concession.
One possibility is that DeepSeek was specifically trying to use Huawei's Ascend accelerators for the reinforcement learning phase of the model's training, which requires inferencing large quantities of tokens to imbue an existing base model with "reasoning" capabilities.
This might explain why the Financial Times article specifically mentions R2 rather than a V4 model. As we mentioned earlier, R1 was based on DeepSeek's earlier V3 model. We don't know yet which model R2 will be based on.
In any case, the news comes just days after Bloomberg reported that Chinese authorities had begun [12]discouraging model devs from using Nvidia's H20 accelerators, especially for sensitive government workloads. ®
[1] https://www.ft.com/content/eb984646-6320-4bfe-a78d-a1da2274b092
[5] https://www.theregister.com/2025/07/29/huawei_rackscale_boogeyman/
[12] https://www.theregister.com/2025/08/12/china_nvidia_h20/