News: 1762438812


Microsoft: Don't let AI agents near your credit card yet

(2025/11/06)


Ready to have your agent talk to my agent and arrange a sale? Microsoft has published a simulated marketplace to put AI agents through their paces and answer a question for the new age: Would you trust AI with your credit card?

Customer-facing assistants are all the rage these days. OpenAI and Anthropic, for example, have helpers that will navigate websites and complete purchases. Then there are assistants that will aid sellers with customer engagement and operations.

It all points to a future where, like rich people with personal shoppers, the average user will have "people" to do all the work for them.


To simulate what might happen, Microsoft's researchers built the [2]Magentic Marketplace, an open-source simulation upon which agents can be unleashed and the results studied.


And the conclusion? "Agents should assist, not replace, human decision-making."

[5]Microsoft apologizes for not explaining cheaper no-AI M365 plans, and all it took was a government lawsuit

[6]Copilot can replace Search in latest Windows 11 test builds, but it's not a good idea

[7]Microsoft lets bosses spot teams that are dodging Copilot

[8]Microsoft threatens to ram Copilot into Exchange Server on-prem

The marketplace simulation manages catalogs of goods and services, and facilitates agent-to-agent communication. It also handles simulated payments. The researchers simulated transactions such as ordering food or engaging with home improvement services. Agents represented customers and businesses at each end of the transactions.

Each experiment was run using 100 virtual customers and 300 virtual businesses, and included both proprietary models (such as GPT-4o and Gemini-2.5-Flash) and open-source models. The team had agents building queries, navigating results, and negotiating transactions.

The results were interesting. Although agents can help (the thinking is that an AI agent should be able to consider far more possibilities than a human could), loading them with more options and search results led to a decline in the number of comparisons. With some exceptions (notably Gemini-2.5-Flash and GPT-5), researchers found the models tended to accept the initial "good enough" options rather than dig deeper.


Researchers also tried manipulation strategies, which ranged from fake award credentials and fake reviews to [10]prompt injections. Again, the models varied. Gemini-2.5-Flash proved generally resistant, while others could be tricked. Prompt injection techniques proved useful in directing payments to manipulative agents, while more basic persuasion techniques were also effective.

The researchers noted: "These findings highlight a critical security concern for agentic marketplaces."

It all suggests that the current state of the art in terms of AI models still has some way to go. The agents were shown to struggle when presented with too many options and were vulnerable to manipulation. Researchers also found some models showed biases, including selecting a business based on its position in the results rather than on merit.
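For readers wondering how a positional bias like that might be detected, here is a minimal Python sketch. It is not the researchers' methodology: the two toy agents, the offer data, and every function name are hypothetical, invented purely for illustration. The idea is to show an agent the same offers in every possible order and count which list position its pick comes from.

```python
from collections import Counter
from itertools import permutations


def first_listed_agent(offers):
    """Hypothetical biased agent: always picks whatever is listed first."""
    return offers[0]["name"]


def best_value_agent(offers):
    """Hypothetical merit-based agent: picks the cheapest offer, wherever it sits."""
    return min(offers, key=lambda o: o["price"])["name"]


def selection_share_by_position(agent, offers):
    """Show the agent every ordering of the offers and return, for each
    list position, the fraction of orderings in which it chose from there."""
    counts = Counter()
    orderings = list(permutations(offers))
    for ordering in orderings:
        chosen = agent(list(ordering))
        position = [o["name"] for o in ordering].index(chosen)
        counts[position] += 1
    return {pos: counts[pos] / len(orderings) for pos in range(len(offers))}


# Made-up offers; B is the best deal on price.
offers = [
    {"name": "A", "price": 12.0},
    {"name": "B", "price": 9.5},
    {"name": "C", "price": 11.0},
]

biased = selection_share_by_position(first_listed_agent, offers)
merit = selection_share_by_position(best_value_agent, offers)
print("first-listed agent:", biased)
print("best-value agent:  ", merit)
```

An agent that chooses on merit spreads its picks evenly across positions once the order is shuffled, because the best offer lands in each slot equally often. An agent that leans on position piles its picks into the top slot no matter what is sitting there.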


And then there is the design and implementation of the marketplace. The researchers said: "Our current study focused on static markets, but real-world environments are dynamic, with agents and users learning over time.

"Oversight is critical for high-stakes transactions."

"A simulation environment like Magentic Marketplace is crucial for understanding the interplay between market components and agents before deploying them at scale."

So, perhaps reconsider handing over authority to an agent at this point. The results might not be quite what you were expecting. ®





[2] https://www.microsoft.com/en-us/research/blog/magentic-marketplace-an-open-source-simulation-environment-for-studying-agentic-markets/


[5] https://www.theregister.com/2025/11/06/microsoft_copilot_m365_apology/

[6] https://www.theregister.com/2025/11/04/microsoft_windows_copilot_search/

[7] https://www.theregister.com/2025/10/10/microsoft_copilot_viva_insights/

[8] https://www.theregister.com/2025/10/23/copilot_exchange_server/


[10] https://www.theregister.com/2025/10/28/ai_browsers_prompt_injection/





I'll ditch my credit card

Alan Bourke

before I let any form of this bollocks anywhere near it.

Headley_Grange

I suspect that efforts like this are more about MS finding out how best to exploit AI in these scenarios. Both MS and Amazon have launched legal cases recently because they can't (yet) have any confidence that their sales and marketing snake oil works on AIs used as agents by their human customers.

gnasher729

My AI bought a dozen 8TB SSDs for $20 each for me. Each with 4MB of storage.

Anonymous Coward

So much like real people then?

The average punter is too dim to go past the 1st page of the search results, thinks the top 3-5 results are real, not adverts, trusts the 4mln Trustpilot reviews as an indication that the seller is legit, gets taken in by "too good to be true" offers, etc, etc.

Great, millions of kg of carbon and billions of [ your local currency denomination ] and we have sort of recreated the halfwits of society.

Trust, money and computing

DoctorNine

As I read this article, I couldn't help but notice that the company asking us to trust their advice is Microsoft. No, M$, I do not trust AI with my credit card. And I didn't require advanced computer modeling to come to this clearly obvious conclusion. It seems clear that M$ would love to see a future where AI shopping agents are managed by a 'trusted' partner. They want to be that partner. Much the better to siphon data and subscription fees from, my dear. The big bad wolf in this fairy tale is telegraphing its intent. One would have to be mad to get in financial bed with it, given the decades of history and legal trouble M$ has had with honest corporate policy. No, I think I will continue shopping for myself, thank you. Even if Clippy 'just wants to help'.

Sharpening blade covers welded on to make kitchen knives safer

Flocke Kroes

Some strange and obscure website had an article about [1]Amazon being annoyed by an AI shopping agent. Presumably people are using the agent because the web site has been enshittified beyond usability. The answer is not to duct tape a false sense of security for credit card numbers onto AI shopping agents. Shop anywhere else.

[1] https://www.theregister.com/2025/11/05/amazon_perplexity_comet_legal_threat/

I wonder for the sanity of anyone

Neil Barnes

who would let alleged 'AI' agents execute the entire purchasing transaction.

Are we saying now that the AI is enough of a person that it can enter into contracts?

ObXKCD: https://xkcd.com/1807/

Re: I wonder for the sanity of anyone

VerySadGeek

Re: ObXKCD: https://xkcd.com/1807/

I tend to go with "Alexa what is coffee bean 100 in Welsh?" or "Alexa what is choose my side in Afrikaans?"

The next 10 years

JimmyPage

Will be a steady growing list of things you *can't* use "AI" for.

Duh!!

Rich 2

Anyone who lets an “AI” thing buy stuff for them deserves what they get - which is almost certainly not what they wanted.

This is such a stupid fucking idea that I’m going for a lie down
