DeepSeek's new V3.1 release points to potent new Chinese chips coming soon
(2025/08/22)
- Reference: 1755826146
- News link: https://www.theregister.co.uk/2025/08/22/deepseek_v31_chinese_chip_hints/
- Source link:
Chinese AI darling DeepSeek unveiled an update to its flagship large language model that the company claims is already optimized for use with a new generation of homegrown silicon.
According to DeepSeek, it [1]trained the new V3.1 model using the UE8M0 data type, a scaling variant of the FP8 format that's already [2]supported by the likes of Nvidia.
In a WeChat [3]comment, the org clarified that the change was made in anticipation of a new generation of silicon. "UE8M0 FP8 is designed for the next generation of domestically produced chips to be released soon," the company wrote.
Lower-precision data types offer several benefits, including reduced memory consumption and higher throughput for both inference and training. However, it's worth noting DeepSeek was already using FP8, specifically the E4M3 type. As such, the switch to UE8M0 appears to be more about compatibility than efficiency.
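To make the compatibility point concrete, here is a minimal sketch of how the two formats differ, following the OCP floating-point specs: E4M3 packs a sign, a 4-bit exponent, and a 3-bit mantissa into one byte, while UE8M0 is an unsigned, exponent-only byte whose every value is an exact power of two, intended as a block scale factor. The decoders below are illustrative, not DeepSeek's code.

```python
# E4M3 (OCP OFP8): 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits.
# UE8M0 (OCP MX scale format): unsigned, 8 exponent bits (bias 127), no
# mantissa -- it can only represent powers of two.

def decode_e4m3(byte: int) -> float:
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return float("nan")            # E4M3 reserves only this code for NaN
    if exp == 0:
        return sign * man * 2.0 ** -9  # subnormal: (man/8) * 2^-6
    return sign * (1 + man / 8) * 2.0 ** (exp - 7)

def decode_ue8m0(byte: int) -> float:
    if byte == 0xFF:
        return float("nan")
    return 2.0 ** (byte - 127)         # pure power-of-two scale

print(decode_e4m3(0x7E))   # 448.0, the E4M3 maximum
print(decode_ue8m0(127))   # 1.0, the unit scale
```

The practical upshot: hardware that only has to handle UE8M0 scales can get away with shift-style arithmetic instead of full floating-point multiplies, which is one plausible reason a chip designer would standardize on it.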
DeepSeek hasn't named the source of the chips its new model can use, but the AI startup has reportedly been working closely with Huawei on training and inference using its Ascend family of neural processing units (NPUs).
Huawei's Ascend 910C, which powers its CloudMatrix rack systems we [7]looked at last month, doesn't support FP8 natively, suggesting the IT giant may have even more powerful accelerators on the way.
Last week, it was reported that DeepSeek had [9]attempted to train its next-gen R2 model on Huawei's Ascend accelerators but struggled to make them work and reverted to using Nvidia H20 accelerators. DeepSeek is now said to be evaluating Huawei's accelerators for inference duty.
It's not clear whether the so-called R2 is the V3.1 model released this week or a forthcoming model.
Not really so new
DeepSeek V3.1 isn't really a new model. It was trained from an earlier V3 checkpoint.
Despite this, the LLM does promise notable improvements. With V3.1, DeepSeek is no longer differentiating between its "thinking" and "non-thinking" models. V3.1 supports both paradigms in a single model and uses a pair of chat templates to toggle between the two. As such, the company’s chatbot interface now omits any reference to R1.
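The two-template mechanism can be sketched in a few lines. Note the template strings and special tokens below are hypothetical placeholders for illustration, not DeepSeek's actual chat templates: the idea is simply that the "non-thinking" template pre-closes the reasoning section so the model skips straight to its answer.

```python
# Illustrative only: one model, two prompt formats. In the "thinking"
# template the reasoning block is left open for the model to fill; in
# the "non-thinking" template it is closed up front.
THINKING_TEMPLATE = "<|user|>{prompt}<|assistant|><think>"
NON_THINKING_TEMPLATE = "<|user|>{prompt}<|assistant|><think></think>"

def build_prompt(prompt: str, thinking: bool) -> str:
    template = THINKING_TEMPLATE if thinking else NON_THINKING_TEMPLATE
    return template.format(prompt=prompt)

print(build_prompt("What is 2+2?", thinking=True))
print(build_prompt("What is 2+2?", thinking=False))
```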
The idea of a unified model capable of reasoning and non-reasoning outputs isn't new. Alibaba [10]attempted something like this earlier this year but abandoned the idea after finding the functionality degraded the quality of its Qwen 3 models.
At least in benchmarking, DeepSeek's V3.1 appears to have avoided that problem. Compared to V3, the point release's non-thinking model achieved significant gains across the board.
[12]
Here's how DeepSeek says its new hybrid reasoning model compares to R1 - Click to enlarge
With thinking enabled, the model's gains were more modest. However, that doesn't quite tell the full story, as DeepSeek notes that the model now requires far fewer thinking tokens to arrive at an answer than before, which should help to cut costs associated with serving the model.
Speaking of tokens, DeepSeek has boosted the number of tokens in its context window, which you can think of as its short-term memory, from 65,536 to 131,072. While a significant improvement, that still trails other Chinese models like Qwen3, which can handle million-token contexts.
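The numbers above are round in binary terms, and the jump matters for serving costs because a request's KV cache grows roughly linearly with context length:

```python
# Back-of-the-envelope on the context-window bump: both sizes are
# powers of two, and the new ceiling is exactly double the old one.
old_ctx, new_ctx = 65_536, 131_072
print(new_ctx // old_ctx)    # 2: the window has doubled
print(new_ctx == 2 ** 17)    # True: 128K tokens in binary terms
```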
[13]Dodgy Huawei chips nearly sunk DeepSeek's next-gen R2 model
[14]How OpenAI used a new data type to cut inference costs by 75%
[15]Stacking up Huawei's rack-scale boogeyman against Nvidia's best
[16]A billion dollars' worth of Nvidia chips fell off a truck and found their way to China, report says
DeepSeek also boasted of significant gains in tool and function calling capabilities crucial for agentic AI workloads where external tools and data must be retrieved on the fly.
For example, in BrowseComp, a benchmark aimed at autonomous browser-use tasks, DeepSeek V3.1 achieved a score of 30, versus 8.9 for the May refresh of R1.
Along with access via its chatbot service and API endpoint, DeepSeek has also made the model weights for both the base and instruct-tuned models available for download on [17]Hugging Face and [18]ModelScope. ®
[1] https://api-docs.deepseek.com/news/news250821
[2] https://docs.nvidia.com/cuda/parallel-thread-execution/#alternate-floating-point-data-formats
[3] https://mp.weixin.qq.com/s/WUbmBSapVyvxZe6HobD5Qw
[7] https://www.theregister.com/2025/07/29/huawei_rackscale_boogeyman/
[9] https://www.theregister.com/2025/08/14/dodgy_huawei_deepseek/
[10] https://www.theregister.com/2025/07/31/alibaba_qwen3_hybrid_thinking/
[12] https://regmedia.co.uk/2025/08/21/deepseek_v3_1_perf.jpg
[13] https://www.theregister.com/2025/08/14/dodgy_huawei_deepseek/
[14] https://www.theregister.com/2025/08/10/openai_mxfp4/
[15] https://www.theregister.com/2025/07/29/huawei_rackscale_boogeyman/
[16] https://www.theregister.com/2025/07/24/nvidia_chips_china_whoops/
[17] https://huggingface.co/deepseek-ai
[18] https://modelscope.cn/organization/deepseek-ai