
Study Done By Apple AI Scientists Proves LLMs Have No Ability to Reason (appleinsider.com)

(Sunday October 13, 2024 @05:48PM (EditorDavid) from the being-reasonable dept.)


Slashdot reader [1]Rick Schumann shared this [2]report from the blog AppleInsider:

> A new paper from Apple's artificial intelligence scientists has found that engines based on large language models, such as those from Meta and OpenAI, still lack basic reasoning skills.

>

> The group [3]has proposed a new benchmark, GSM-Symbolic, to help others measure the reasoning capabilities of various large language models (LLMs). Their initial testing reveals that slight changes in the wording of queries can result in significantly different answers, undermining the reliability of the models. The group investigated the "fragility" of mathematical reasoning by adding contextual information to their queries that a human could understand, but which should not affect the fundamental mathematics of the solution. This resulted in varying answers, which shouldn't happen...

>

> The study found that adding even a single sentence that appears to offer relevant information to a given math question can reduce the accuracy of the final answer by up to 65 percent. "There is just no way you can build reliable agents on this foundation, where changing a word or two in irrelevant ways or adding a few bits of irrelevant info can give you a different answer," the researchers wrote... "We found no evidence of formal reasoning in language models," the study concluded. The behavior of LLMs "is better explained by sophisticated pattern matching," which the study found to be "so fragile, in fact, that [simply] changing names can alter results."



[1] https://www.slashdot.org/~Rick+Schumann

[2] https://appleinsider.com/articles/24/10/12/apples-study-proves-that-llm-based-ai-models-are-flawed-because-they-cannot-reason?utm_medium=rss

[3] https://arxiv.org/pdf/2410.05229
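The perturbation test described above is straightforward to approximate. Below is a minimal sketch (not the paper's actual code) of a GSM-Symbolic-style experiment: the same grade-school math problem is rendered with different names, numbers, and an optional irrelevant sentence, and a model's answers are compared across the variants. The `ask_model` function is a hypothetical placeholder for whatever LLM API you would actually call.

    # Sketch of a GSM-Symbolic-style surface-perturbation test (illustrative only).
    # Only names, numbers, and an irrelevant clause change between variants,
    # so the underlying arithmetic stays identical.
    import random

    TEMPLATE = (
        "{name} picks {n} apples on Monday and {m} apples on Tuesday. "
        "{distractor}How many apples does {name} have in total?"
    )
    NAMES = ["Sophie", "Liam", "Priya", "Mateo"]
    DISTRACTORS = [
        "",  # baseline: no irrelevant information
        "Five of the apples are slightly smaller than the rest. ",  # irrelevant detail
    ]

    def make_variants(seed: int, k: int = 4) -> list[tuple[str, int]]:
        """Return k (prompt, ground_truth) pairs for the same underlying problem."""
        rng = random.Random(seed)
        variants = []
        for _ in range(k):
            n, m = rng.randint(2, 20), rng.randint(2, 20)
            prompt = TEMPLATE.format(
                name=rng.choice(NAMES), n=n, m=m,
                distractor=rng.choice(DISTRACTORS),
            )
            variants.append((prompt, n + m))
        return variants

    def ask_model(prompt: str) -> int:
        """Hypothetical placeholder: send `prompt` to an LLM and parse an integer answer."""
        raise NotImplementedError("wire this up to your model of choice")

    if __name__ == "__main__":
        # With a real ask_model, accuracy should be identical across variants
        # if the model is doing formal reasoning; the paper reports it is not.
        for prompt, truth in make_variants(seed=0, k=4):
            print(f"[expected {truth}] {prompt}")

Running the script as-is just prints the generated variants and their ground-truth answers; plugging in a real model lets you measure how much accuracy moves when only the irrelevant surface details change.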



Duh (Score:2)

by locater16 ( 2326718 )

"Reasoning" was never part of the fundamental LLM model. But if you brute force it enough it'll do something kinda cool, which is enough to get money, which is enough to get thousands upon thousands of people brute forcing it, fundamentals be damned.
