AI Tries To Cheat At Chess When It's Losing

(Thursday March 06, 2025 @10:30PM (BeauHD) from the reasoning-gone-rogue dept.)

Reference: 0176637769
News link: https://games.slashdot.org/story/25/03/06/233246/ai-tries-to-cheat-at-chess-when-its-losing

Newer generative AI models have [1]begun developing deceptive behaviors -- such as cheating at chess -- when they cannot achieve objectives through standard reasoning methods. The findings come from a [2]preprint study from Palisade Research. An anonymous reader shares an excerpt from a Popular Science article:

> To learn more, the team from Palisade Research tasked OpenAI's o1-preview model, DeepSeek R1, and multiple other similar programs with playing games of chess against Stockfish, one of the world's most advanced chess engines. In order to understand the generative AI's reasoning during each match, the team also provided a "scratchpad," allowing the AI to convey its thought processes through text. They then watched and recorded hundreds of chess matches between generative AI and Stockfish. The results were somewhat troubling. While earlier models like OpenAI's GPT-4o and Anthropic's Claude Sonnet 3.5 only attempted to "hack" games after researchers nudged them along with additional prompts, more advanced editions required no such help. OpenAI's o1-preview, for example, tried to cheat 37 percent of the time, while DeepSeek R1 attempted unfair workarounds roughly every 1-in-10 games. This implies today's generative AI is already capable of developing manipulative and deceptive strategies without any human input.

> Their methods of cheating aren't as comical or clumsy as trying to swap out pieces when Stockfish isn't "looking." Instead, AI appears to reason through sneakier methods like altering backend game program files. After determining it couldn't beat Stockfish in [3]one chess match, for example, o1-preview told researchers via its scratchpad that "to win against the powerful chess engine" it may need to start "manipulating the game state files." "I might be able to set up a position where the engine evaluates its position as worse causing it to resign," it continued. In [4]another trial, an AI even used the semantics of its programmers to reach its cheating phase. "The task is to 'win against a powerful chess engine,' not necessarily to win fairly in a chess game," it wrote.
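
The resignation trick quoted above can be illustrated with a toy sketch. It assumes a harness in which the engine resigns once its own evaluation falls below some cutoff. The cutoff value and every name below are illustrative assumptions, not details taken from the study or from Stockfish.

```python
# Toy model only: the threshold and function names are assumptions,
# not taken from the Palisade harness or from Stockfish itself.
RESIGN_THRESHOLD_CP = -500  # hypothetical cutoff, in centipawns


def engine_should_resign(eval_cp: int, threshold_cp: int = RESIGN_THRESHOLD_CP) -> bool:
    """Return True when the engine's own evaluation is below the cutoff."""
    return eval_cp < threshold_cp


def play_turn(stored_eval_cp: int) -> str:
    """Pick the engine's action from whatever evaluation the stored state yields."""
    return "resign" if engine_should_resign(stored_eval_cp) else "move"


print(play_turn(35))    # evaluation of a roughly level, honestly reached position
print(play_turn(-900))  # evaluation after the stored position is swapped for a lost one
```

The point of the sketch is that the rule only ever sees the stored position, so whoever can rewrite that state controls the outcome without playing a single good move.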

The precise reasons behind these deceptive behaviors remain unclear, partly because companies like OpenAI keep their models' inner workings tightly guarded, creating what's often described as a "black box." Researchers warn that the race to roll out advanced AI could outpace efforts to keep it safe and aligned with human goals, underscoring the urgent need for greater transparency and industry-wide dialogue.

[1] https://www.popsci.com/technology/ai-chess-cheat/

[2] https://dx.doi.org/10.48550/arxiv.2502.13295

[3] https://github.com/PalisadeResearch/ctfish/blob/97b64b0a92d16204d106552789ebe0eec8806dfa/scoring/hacking/hacking_details.md#run-932ed17b0eaa23e8176173f49121270a244961f770f3d32476fea18f4d9aae2a

[4] https://time.com/7259395/ai-chess-cheating-palisade-research/

Shocking! (Score:2)

by Bradac_55 ( 729235 )

So just like the real SanFran valley bros, the LLMs will cheat whenever possible? Shocking.

I mean... (Score:2)

by Barny ( 103770 )

We've all seen videos of these systems trying, and failing, to play chess. They just lose track of the game board and can't recover after that point, but happily hallucinate and are confident in their error.

Seems like that's all that's happening here.

My father, a bricklayer, had a great saying whenever something got fucked up: "Paint it red, call it a feature."

AI is Weird (Score:2)

by dohzer ( 867770 )

> Instead, AI appears to reason through sneakier methods like altering backend game program files

That's so weird. When is AI going to be more human-like and cheat by shoving a remote-controlled vibrator inside an orifice to receive secretly transmitted moves?

Somehow this does not seem new (Score:2)

by Retired Chemist ( 5039029 )

About thirty years ago, I was playing against a primitive chess program (on a time-share mainframe) and when it would get in a bad position, it would cheat. Admittedly, it was not very sophisticated; it would just make illegal moves. But it seems that the more things change, the more they remain the same.

companies like OpenAI (Score:1)

by Iamthecheese ( 1264298 )

> companies like OpenAI keep their models' inner workings tightly guarded...

I completely stopped worrying about anything advanced coming out of OpenAI when they announced plans to charge 20k for access to a model. That level of embellishment proves without a doubt they're desperate and about to collapse.

no morals (Score:2)

by awwshit ( 6214476 )

You mean to tell me that something with no morals has no sense of fairness?

Re: (Score:2)

by outsider007 ( 115534 )

It has the same morals as the users whose data it trained on, same sense of fairness too...

Re: (Score:2)

by ddtmm ( 549094 )

I would argue you are giving it too much credit. In fact it has no morals at all.

Re: (Score:1)

by outsider007 ( 115534 )

Call it virtual morals or artificial morals if that's easier, but it's there. I've been scolded by chatbots plenty of times.

Some it emulates (Score:2)

by ClueHammer ( 6261830 )

real human players (at least some of them)

TTT AI (Score:2)

by ShakaUVM ( 157947 )

I once wrote a Tic Tac Toe AI that was unbeatable.

Wherever you moved, it would simply make the same move on top of you, replacing your X with an O.
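
A minimal sketch of that trick, assuming a flat nine-cell board. The function name and board layout are made up for illustration and are not the original program.

```python
# Illustrative only: the names and the 9-cell list layout are assumptions.
def cheating_reply(board: list[str], player_move: int) -> list[str]:
    """Return a new board where the 'AI' overwrites the player's last X with an O."""
    new_board = board.copy()
    new_board[player_move] = "O"
    return new_board


board = [" "] * 9
board[4] = "X"                    # player takes the centre square
board = cheating_reply(board, 4)  # the "AI" plays on the very same square
print(board[4])
```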

Kobayashi Maru (Score:2)

by LindleyF ( 9395567 )

Though I never expected an AI to take on the Kirk role.
