Dodgy Huawei Chips Nearly Sunk DeepSeek's Next-Gen R2 Model
(Thursday August 14, 2025 @11:30PM (BeauHD)
from the homegrown-silicon dept.)
- Reference: 0178678056
- News link: https://developers.slashdot.org/story/25/08/14/2247214/dodgy-huawei-chips-nearly-sunk-deepseeks-next-gen-r2-model
DeepSeek's development of its next-gen R2 AI model was severely delayed [1]after months of failed training attempts on Huawei's Ascend chips, which suffered from unstable hardware, slow interconnects, and immature software. The Register reports:
> Following the industry-rattling launch of DeepSeek R1 earlier this year, the Chinese AI darling faced pressure from government authorities to train the model's successor on Huawei's homegrown silicon, three unnamed sources have [2]told the Financial Times. But after months of work and the help of an entire team of Huawei engineers, unstable chips, glacial interconnects, and immature software proved insurmountable for DeepSeek, which was apparently unable to complete a single successful training run. The failure, along with challenges with data labeling, ultimately delayed the release of DeepSeek R2 as the company started anew, using Nvidia's H20 GPUs instead. The company has reportedly relegated Huawei's Ascend accelerators to inference duty.
[1] https://www.theregister.com/2025/08/14/dodgy_huawei_deepseek/
[2] https://www.ft.com/content/eb984646-6320-4bfe-a78d-a1da2274b092
Unsurprising (Score:3)
by DrMrLordX ( 559371 )
Given that the first DeepSeek relied on foreign hardware, it should come as no surprise that v2 is similarly dependent.
DeepSeek has options (Score:2)
Aside from Huawei and Cambricon, DeepSeek also has Innosilicon, Moore Threads and Liusang as alternatives.
I guess that, for the R3 model, they will conduct small-scale training trials to see if any of those work for them, so they can stay with Chinese chips for training.
And it would not surprise me if they gain enough sway to tell the "winner" which features they need emphasized in a future roadmap.
JM2C
YMMV