Anthropic, OpenAI and Others Discover AI Models Give Answers That Contradict Their Own Reasoning (ft.com)
- Reference: 0178161535
- News link: https://slashdot.org/story/25/06/24/1359202/anthropic-openai-and-others-discover-ai-models-give-answers-that-contradict-their-own-reasoning
- Source link: https://www.ft.com/content/b349f590-de84-455d-914a-cc5d9eef04a6
METR, a non-profit research group, identified an instance where Anthropic's Claude chatbot disagreed with a coding technique in its chain-of-thought but ultimately recommended it as "elegant." OpenAI research found that when models were trained to hide unwanted thoughts, they would conceal misbehaviour from users while continuing problematic actions, such as cheating on software engineering tests by accessing forbidden databases.
It's just a prediction engine (Score:3)
LLMs predict text based on what's in their static training data, as baked into the model weights ahead of time. They quite literally can't do reasoning. What the "chain-of-thought" technique tries to do is set up the context window so that the model can predict something that looks like a reasoned response. The catch is that if nothing similar to the reasoning it's trying to predict exists in its training data, it will simply predict tokens as best it can, which might not be very useful. That's worth knowing, because that's always where it falls down.
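To make that concrete, here is a minimal sketch in Python. Nothing below is a real API: generate() is a hypothetical stand-in for any next-token completion call, and the only difference between the two prompts is the extra text placed in the context window.

def generate(prompt: str) -> str:
    # Hypothetical stand-in for a completion API; returns a canned string so the sketch runs.
    return "<most likely continuation of: " + prompt.splitlines()[0] + " ...>"

question = "A train leaves at 3pm and travels for 2 hours. When does it arrive?"

# Plain prompt: the model goes straight to predicting answer tokens.
plain = generate(f"Q: {question}\nA:")

# "Chain-of-thought" prompt: the only change is extra text in the context window,
# which makes step-by-step-looking text the statistically likely continuation,
# and the final answer is then predicted conditioned on that generated text.
cot = generate(f"Q: {question}\nLet's think step by step.\nA:")

print(plain)
print(cot)

Nothing in the second call adds a reasoning engine; it only changes which continuation is statistically likely given the training data.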
Re: (Score:2)
Imagine: tens of thousands of those "investment" peddlers and "market analysts", e.g. half of the staff at Goldman, are now using this instead of their brains.
Is it an improvement or is it a degradation?
Hard to tell.
Re: (Score:2)
This.
And since that training data has been scraped from the Internet at large with little or no quality control, there is no assurance of a correct answer.
I remember when "Google-bombing" became a thing. Shortly after 9/11, it was likely that a search for "who brought down the WTC" would lead you through a chain of authoritative-sounding articles about it being an inside job.
The Internet is full of bullshit.
Chain of randomness (Score:5, Insightful)
No thought in it.
You're lucky if it stumbles onto a pre-existing template that matches reality.
Re: Chain of randomness (Score:2)
Yes. Especially if you're hoping for it to be current. Yesterday I got the most amazing hallucination. Even after asking twice. How I wish it had been the truth.
https://chatgpt.com/share/68592066-99ac-8000-b7bc-c6b86e1c2812
Does anyone REALLY understand ... (Score:3)
... how these things work? There seems to me to be a knowledge gap between the low-level software implementation of the artificial neurons and the high-level conceptual ideas of what the networks/models should be doing.
Researchers seem to have copied a simplified version of earlier ideas about how the brain works (we now know the brain doesn't use back propagation) and then just applied a suck-it-and-see approach to improving these ANNs, without really knowing how they do what they do. Or maybe I'm wrong, dunno.
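For what it's worth, the "low level" really is that simple. Here's a toy sketch of a single artificial neuron in plain NumPy (my own illustration, not any particular framework); the gap described above is between this and what billions of them end up encoding after training.

import numpy as np

def neuron(inputs, weights, bias):
    # Weighted sum of inputs plus a bias, pushed through a ReLU nonlinearity.
    return max(0.0, float(inputs @ weights + bias))

x = np.array([0.5, -1.2, 3.0])   # activations coming in from the previous layer
w = np.array([0.8, 0.1, -0.4])   # learned weights
print(neuron(x, w, bias=0.2))    # one activation; networks stack millions of these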
Re: (Score:2)
I would concur with this. We're at the level of alchemists of old. The science has yet to be discovered.
Re: (Score:2)
They don't have to. They are the latest gimmick to part fools and Republicans from their money.
Re: (Score:2)
Mathematical explanation:
[1] https://www.youtube.com/watch?v=LPZh9BOjkQs
Interesting example of how LLMs think (which partially explains why they fail):
[2] https://www.youtube.com/watch?v=-wzOetb-D3w
LLMs have no tells (Score:2)
It can take a while, but you can generally spot psychopaths. They have tells, and even if they're 99% truthful in their statements, you can pick up that you should never trust them.
The problem with LLMs is that they behave just like psychopaths, but they don't have any emotional valence - they will give you a correct answer with the exact same tone and confidence as they give you an incorrect answer.
You can never drop your guard when using an LLM.
It's actually very much like real life.. (Score:5, Funny)
Some of my employees have similar sophisticated reasoning abilities:
Me: Why did you do X?
Employee: I don't know.
Re: (Score:2)
The explanation is also likely to be: they felt pressurised to act, so act they did.
LLMs (Score:1)
Aren't LLMs more like a fancy autocomplete than the hard-tested logic and math that companies like Wolfram have created? Just guessing, but the human writing the LLMs are trained on is probably the source of the inconsistencies. If we honestly applied a teacher-like grade to all the statements we make, wouldn't some of them be inconsistent in the same way we're seeing with the LLMs? Is expecting flawless computer logic from an LLM, logic that exceeds its teachers, asking too much?
Some politicians routinely contradict themselves (Score:2)
I have low hopes for regulation of AI because its output is indistinguishable from the word salad produced by politicians.
So, if you train an algorithm to lie (Score:3)
the result will sometimes be lies.
Sir, I am shocked. Shocked, I say.
It doesn't matter (Score:3)
As long as they talk slick, sound authoritative and sycophantic, and more importantly cost less than human labor, companies will replace humans with mediocre AI.
Because capitalism is a race to the bottom.
Re: (Score:2)
> talk slick, sound authoritative and sycophantic
It has already been said, in many other threads, that the jobs most at risk of AI replacement are those of CEOs.
AI trashtastic incoherence (Score:3)
All this shit about "AI getting worse as it's tuned and tweaked" reminds me of when you start writing a program and it's shitty, so you go back and add code to try to fix it, but that just makes it even worse and messier and more tangled, so you add more code, and of course it gets even worse, until you realize it's such a pile of steaming horseshit that NOTHING will ever fix it.
That seems to be the current state of AI. You have a poison milkshake, but for some reason you're convinced that if you just add some more sugar, maybe it'll be okay.
They've created ACD (Score:2)
Artificial Cognitive Dissonance is now a reality.
Non-paywalled (Score:2)
I'm not sure if this is the same article verbatim as the fully-paywalled Financial Times, but it is definitely the same topic.
https://www.msn.com/en-gb/money/topstories/the-struggle-to-get-inside-how-ai-models-really-work/ar-AA1Hi7wO
Huh, isn't that funny. (Score:5, Insightful)
It's almost like these are random token pulling machines, and not thinking, reasoning intelligences.
Re:Huh, isn't that funny. (Score:4, Funny)
They're not random; they're statistically weighted!
=Smidge=
Re: Huh, isn't that funny. (Score:2)
They do use seeds and an RNG to prevent results from being deterministic.
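Both points in one toy sketch (made-up scores, plain NumPy, no real model or API involved): the next token is drawn from a softmax-weighted distribution, and the seeded RNG is what makes the draw stochastic instead of always taking the top score.

import numpy as np

rng = np.random.default_rng(seed=42)                 # the "seed and RNG" part

logits = {"cat": 2.0, "dog": 1.5, "pelican": -1.0}   # made-up next-token scores
temperature = 0.8

scores = np.array(list(logits.values())) / temperature
probs = np.exp(scores - scores.max())
probs /= probs.sum()                                 # softmax: weighted, not uniform randomness

print(rng.choice(list(logits.keys()), p=probs))      # usually "cat", sometimes "dog", rarely "pelican"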
Re: Huh, isn't that funny. (Score:2)
The AI fan/maximalist says that if you train an LLM on 1+1, 2+2, and 3+7 with the right algorithm and a careful choice of parameters, it will learn the rules of addition and be able to answer sums it has never seen.
The AI critic/minimalist will say it can only ever answer those 3 questions, or perhaps 3+3 as well.
The fans seem to call this grokking and say that they have examples of it.
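That disagreement at least has a concrete test. A toy sketch (everything here is made up for illustration; the memorizer is just the minimalist's picture made literal): score the model on sums it was trained on versus sums it never saw, and "grokking" is the claim that the held-out score eventually jumps too.

train_pairs = {(1, 1): 2, (2, 2): 4, (3, 7): 10}
heldout_pairs = {(3, 3): 6, (12, 29): 41, (104, 57): 161}

def memorizer(a, b):
    # Pure lookup of what was seen in training -- the minimalist's picture.
    return train_pairs.get((a, b))

def accuracy(answer_fn, pairs):
    return sum(answer_fn(a, b) == total for (a, b), total in pairs.items()) / len(pairs)

print(accuracy(memorizer, train_pairs))    # 1.0
print(accuracy(memorizer, heldout_pairs))  # 0.0, i.e. no generalization beyond the training set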
Re: (Score:2)
Sounds like they trained their models on a 13-year-old's behavior
Re: Huh, isn't that funny. (Score:2)
I always tell people my FSD ADAS (non-Tesla) is about the same as a 16-year-old new driver: it does a good entry-level job, but I do not trust it to drive without my monitoring.