OpenAI Unveils o3, a Smarter AI Model With Improved Reasoning Skills (openai.com)

(Friday December 20, 2024 @05:50PM (msmash) from the AGI-race dept.)

Reference: 0175712691
News link: https://slashdot.org/story/24/12/20/1836246/openai-unveils-o3-a-smarter-ai-model-with-improved-reasoning-skills
Source link: https://openai.com/12-days/?day=12

OpenAI has [1]unveiled a new AI model that it says takes longer to solve problems but gets better results, following Google's similar announcement a day earlier. The model, called o3, replaces [2]o1 from September and spends extra time working through questions that need step-by-step reasoning.

It scores [3]three times higher than o1 on ARC-AGI, a test measuring how well AI handles complex math and logic problems it hasn't seen before. "This is the beginning of the next phase of AI," CEO Sam Altman said during a livestream Friday.

The Microsoft-backed startup is keeping o3 under wraps for now but plans to let outside researchers test it.

[1] https://openai.com/12-days/?day=12

[2] https://tech.slashdot.org/story/24/09/12/1717221/openai-releases-o1-its-first-model-with-reasoning-abilities

[3] https://www.wired.com/story/openai-o3-reasoning-model-google-gemini/

takes longer to solve problems (Score:2)

by rossdee ( 243626 )

But matters whether you get answer in microsecond rather than millisecond as long as correct?

-- Manuel Garcia O'Kelly Davis

Sam Altman (Score:1)

by nonBORG ( 5254161 )

I have zero trust in Altman, his name seems like a pseudonym also. If there was ever a guy I would not want to work for or work with it would be him and Bill Gates. But Sam Altman seems like just a bad feeling more than a list of things he has done wrong. Hopefully I will be proven wrong.

Re: (Score:2)

by gweihir ( 88907 )

Altman is like the scummy, slimy version of BG. And BG is pretty repulsive in what he did and who he is already.

Great, more lies (Score:2, Insightful)

by gweihir ( 88907 )

Still no "reasoning skills" in LLMs, no matter how much they lie about it. And hence no "smart" either. A very fundamental breakthrough would be required, but there is nothing. Not really surprising with this old tech that was just scaled up, trained with a massive piracy campaign and had its interface prettified with decidedly non-intelligent NLP.

It is a mystery for me why so many people fall for these lies. Are people just too shallow to actually see what is going on? To me, whenever I ask AI som

Re: (Score:3)

by JoshuaZ ( 1134087 )

You are correct that LLMs are not great at search tasks. And their use for that is a poor choice. But they are really helpful at other tasks. Even just as better proof-readers they are useful. They also can help with programming. The current models which are widely available for example are better programmers than the bottom quartile of programmers, and allow people with close to zero programming skill to do programming. They aren't amazing; I work with a lot of talented and gifted high school students, and

Re: (Score:2, Troll)

by gweihir ( 88907 )

Exactly the other way round. Search is one of the few things they actually can do somewhat well. As to the "bottom quartile of programmers", you realize these people have massive _negative_ productivity, right? And so do LLMs.

Re: (Score:3)

by JoshuaZ ( 1134087 )

They don't have negative productivity. They have low productivity in many contexts, and especially low when their job is specifically just programming. But a lot of those people have programming-adjacent jobs or jobs which occasionally require programming. That's exactly the people who benefit from something like this. And o3 is by all accounts even better than ChatGPT or GPT4 for almost all purposes. That means that the set of programmers it will help is even larger.

Re: (Score:2)

by war4peace ( 1628283 )

I don't know... lately I've been asking ChatGPT and Gemini quite a few things which would have required me to spend hours looking up. Yes, glorified search, but with summarization and near-real time search.

A very recent example, from a few days ago, when Slashdot threw a hissy fit at adblockers: I asked ChatGPT for "10 tech news from last 24 hours", and it provided a list with summarization, much like Slashdot. Then I asked to expand on item #3, I believe, and it did, then I asked it to provide me with URLs

Re: (Score:2)

by gweihir ( 88907 )

> I don't know... lately I've been asking ChatGPT and Gemini quite a few things which would have required me to spend hours looking up. Yes, glorified search, but with summarization and near-real time search.

Sure. But the claim here is "reasoning" and that is just a direct lie, nothing else.

> That doesn't make me stupid. Of course, I could have done all that myself, but at 100x the time spent, which I would rather spend doing something more productive. Convenience is a big feature of those tools.

Sure again. Just be aware that you may miss something you would have gotten otherwise. One thing is the search skill itself. Another is the information you usually find in the context of what you are looking for. If you are not careful, you can cripple your skills, make yourself dependent and limit your view on things to a serious degree. That does not mean to always do it yourself. Just occasionally, to make sure you still can

Re: Great, more lies (Score:1)

by BlueKitties ( 1541613 )

WELL WELL WELL. We meet again, my favorite CS department member. Alright here is an essay for you: It suggests o3 is qualitatively different than older generation LLMs. I look forward to your take on it.

[1]https://arcprize.org/blog/oai-... [arcprize.org]

[1] https://arcprize.org/blog/oai-o3-pub-breakthrough

Impressive but limited (Score:3)

by JoshuaZ ( 1134087 )

This model is extremely impressive in terms of what it can do. It is much better at logical reasoning and doing math than earlier models. However, given the massive computing power and energy use it entails, it is unlikely to be widely available any time soon. The compute is simply way too much. However, the general tendency since the introduction of GPT3 has been that the amount of compute it takes to get an LLM to run at a given quality level has been consistently going down as we figure out more ways to run them efficiently. Given that, my guess is that something like o3 will be widely available in 2 to 5 years.

Re: (Score:2)

by war4peace ( 1628283 )

Generally speaking, this is something that pisses me off.

X unveiled AI tool A - but it's not available yet.

Y unveiled AI tool B - but it's not available yet.

Z unveiled AI tool C - but, you guessed it, it's not available yet.

I could "unveil" anything too, with a couple pretty pictures and some curated examples, but as long as the product is not available, it's worth nothing.
