DeepSeek's First Reasoning Model R1-Lite-Preview Beats OpenAI o1 Performance (venturebeat.com)
(Wednesday November 20, 2024 @05:50PM (BeauHD)
from the would-you-look-at-that dept.)
- Reference: 0175510471
- News link: https://slashdot.org/story/24/11/20/2129207/deepseeks-first-reasoning-model-r1-lite-preview-beats-openai-o1-performance
- Source link: https://venturebeat.com/ai/deepseeks-first-reasoning-model-r1-lite-preview-turns-heads-beating-openai-o1-performance/
An anonymous reader quotes a report from VentureBeat:
> DeepSeek, an AI offshoot of Chinese quantitative hedge fund High-Flyer Capital Management focused on [1]releasing high performance open source tech, has unveiled the R1-Lite-Preview, its latest reasoning-focused large language model, available for now exclusively through [2]DeepSeek Chat, its web-based AI chatbot. Known for its innovative contributions to the open-source AI ecosystem, DeepSeek's new release aims to bring high-level reasoning capabilities to the public while maintaining its commitment to accessible and transparent AI. And the R1-Lite-Preview, despite only being available through the chat application for now, is already turning heads by [3]offering performance nearing and in some cases exceeding OpenAI's vaunted o1-preview model.
>
> Like that model released in September 2024, DeepSeek-R1-Lite-Preview exhibits "chain-of-thought" reasoning, showing the user the different chains or trains of "thought" it goes down to respond to their queries and inputs, documenting the process by explaining what it is doing and why. While some of the chains/trains of thought may appear nonsensical or even erroneous to humans, DeepSeek-R1-Lite-Preview appears on the whole to be strikingly accurate, even answering "trick" questions that have tripped up other, older, yet powerful AI models such as GPT-4o and Anthropic's Claude family, including "how many letter Rs are in the word Strawberry?" and "which is larger, 9.11 or 9.9?"
[1] https://news.slashdot.org/story/24/06/18/226232/chinas-deepseek-coder-becomes-first-open-source-coding-model-to-beat-gpt-4-turbo
[2] https://chat.deepseek.com/
[3] https://venturebeat.com/ai/deepseeks-first-reasoning-model-r1-lite-preview-turns-heads-beating-openai-o1-performance/
Ope! it's official now (Score:2)
by aldousd666 ( 640240 )
Once this news hits Slashdot, it's official, and OpenAI will fire back with their release forthwith.
$157B (Score:2)
by bill_mcgonigle ( 4333 ) *
OpenAI is valued at $157B because it's commonly believed they have the secret sauce.
Totally useless (Score:3)
I asked it about Tiananmen Square and it immediately said that it was a forbidden topic of discussion. When I asked it what topics were forbidden, it refused to tell me and was deliberately vague, reasoning that even listing the topics would be tantamount to discussing them or informing people about them, which would be against the interests/desires of the government.
There is no point in using a Chinese AI, as you will never be able to discuss anything that the Communist Party doesn't want you to talk about. For fun I asked it its opinion of Xi, and it also refused to give any answer at all.
Totally useless. I don't care how well you reason if you refuse to talk about anything important.
Re: (Score:2)
Pliny (GitHub user elder_plinus) has a nice jailbreak prompt in his jailbreak repo that you can use to get around that. It's still great for math, science, and coding even without jailbreaking it, though.