
How Anthropic's Claude Helped Mozilla to Improve Firefox's Security (yahoo.com)

(Saturday March 07, 2026 @05:16PM (EditorDavid) from the better-browsers dept.)


"It took Anthropic's most advanced artificial-intelligence model about 20 minutes to find its first Firefox browser bug during an internal test of its hacking prowess," [1]reports the Wall Street Journal .

> The Anthropic team submitted it, and Firefox's developers quickly wrote back: This bug was serious. Could they get on a call? "What else do you have? Send us more," said Brian Grinstead, an engineer with Mozilla, Firefox's parent organization.

>

> Anthropic did. Over a two-week period in January, Claude Opus 4.6 found more high-severity bugs in Firefox than the rest of the world typically reports in two months, Mozilla said... In the two weeks it was scanning, Claude discovered more than 100 bugs in total, 14 of which were considered "high severity..." Last year, Firefox patched 73 bugs that it rated as either high severity or critical.

A [2]Mozilla blog post calls Firefox "one of the most scrutinized and security-hardened codebases on the web. Open source means our code is visible, reviewable, and continuously stress-tested by a global community." So they're impressed — and also thankful Anthropic provided test cases "that allowed our security team to quickly verify and reproduce each issue."

> Within hours, our platform engineers began landing fixes, and we kicked off a tight collaboration with Anthropic to apply the same technique across the rest of the browser codebase... A number of the lower-severity findings were assertion failures, which overlapped with issues traditionally found through fuzzing, an automated testing technique that feeds software huge numbers of unexpected inputs to trigger crashes and bugs. However, the model also identified distinct classes of logic errors that fuzzers had not previously uncovered...

>

> We view this as clear evidence that large-scale, AI-assisted analysis is a powerful new addition in security engineers' toolbox. Firefox has undergone some of the most extensive fuzzing, static analysis, and regular security review over decades. Despite this, the model was able to reveal many previously unknown bugs. This is analogous to the early days of fuzzing; there is likely a substantial backlog of now-discoverable bugs across widely deployed software.
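The fuzzing technique the Mozilla post describes can be sketched in a few lines: generate random inputs, feed them to a target, and keep any input that crashes. This is a toy illustration only; `parse_header` and its planted bug are invented for this example, and real fuzzers (AFL, libFuzzer) add coverage feedback and input mutation on top of a loop like this.

```python
# Minimal coverage-blind fuzzing loop. The target function and its bug
# are hypothetical, invented purely to illustrate the technique.
import random

def parse_header(data: bytes) -> int:
    """Toy parser with a planted bug: it trusts the length byte and
    reads past the end of the buffer when the length byte lies."""
    if len(data) < 2 or data[0] != 0x7F:
        return 0
    length = data[1]
    return data[2 + length]  # IndexError when length exceeds the buffer

def fuzz(trials: int = 10_000, seed: int = 0) -> list[bytes]:
    """Throw random byte strings at the target; return inputs that crash."""
    rng = random.Random(seed)
    crashers = []
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
        try:
            parse_header(data)
        except Exception:
            crashers.append(data)  # a crashing input worth triaging
    return crashers
```

Each saved input is a reproducible test case, which is exactly what Mozilla says made Anthropic's reports quick to verify: a crashing input pins down the bug regardless of how it was found.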

"In the time it took us to validate and submit this first vulnerability to Firefox, Claude had already discovered fifty more unique crashing inputs" in 6,000 C++ files, Anthropic [3]says in a blog post (which points out they've also used Claude Opus 4.6 to discover vulnerabilities in the Linux kernel).

"Anthropic "also rolled out [4]Claude Code Security , an automated code security testing tool, last month," [5]reports Axios , noting the move briefly [6]rattled cybersecurity stocks ...



[1] https://finance.yahoo.com/news/send-us-more-anthropic-claude-103000002.html

[2] https://blog.mozilla.org/en/firefox/hardening-firefox-anthropic-red-team/

[3] https://www.anthropic.com/news/mozilla-firefox-security

[4] https://www.anthropic.com/news/claude-code-security

[5] https://www.axios.com/2026/03/06/anthropic-mozilla-claude-opus-bug-hunting

[6] https://www.axios.com/2026/02/23/cyber-stocks-anthropic-sell-off



Seems like a good use of this technology (Score:3)

by sarren1901 ( 5415506 )

Given how these AIs are trained on massive amounts of real-world examples, it makes sense that if you give one any given code base, it's going to find all kinds of errors based on what it can compare against. AI seems to be really good at pattern recognition and we should definitely lean into that.

Defensively speaking, we need all the help we can get in finding and eliminating bugs in our software. Given that it's always been easier to break things than to fix them, the rate at which AI can be used to write malicious code likely outpaces our ability to find and fix the code. The never-ending game.

Re: (Score:2)

by Kisai ( 213879 )

It's fine if you want to use AI to find errors.

You MUST NOT let the AI write the code. Whatever you submit must be code YOU wrote. AI "vibe coding" is going to bloat and destroy a lot of products before people get told to stop using it as the first tool in the box instead of the last.

Re: (Score:2)

by AmiMoJo ( 196126 )

They should get it to refactor the codebase. Make it easy to build and adapt, so it can be used as the basis for other browsers like Chromium is. Then document it extensively.

It's the only way to survive. AI isn't going to save them.

Re: (Score:2)

by karmawarrior ( 311177 )

Firefox is already used as the basis of other browsers, some direct clones, some radically different.

And maybe they shouldn't be using an LLM to refactor code. Who exactly is going to maintain this slop? Or do you think Vibe-coded shit is built to last?

There are so many things wrong, on every single level, with what you just said I find it hard to believe I managed to pick just two.

Re: (Score:2)

by gweihir ( 88907 )

Except that this is a meaningless stunt and the actual performance of the LLM does not even remotely resemble the claims made.

Re: (Score:2)

by karmawarrior ( 311177 )

No, it really isn't.

This is a PR puff piece that almost certainly both exaggerates the degree to which Claude helped and leaves out massive pieces of information suggesting that a lot of work was needed to get it to the point where it could actually help.

What's happening in reality is people are using Claude to find "bugs", submitting them to bug bounty programs, and overloading the authors of software like Curl with ridiculous amounts of slop.

It's not that you can't use Claude to find bugs. It's that people wh

Good! (Score:2)

by liqu1d ( 4349325 )

AI is perfect for this sort of thing. This doesn't mean any ahole with a lobsterbot should be submitting to repos though. Perhaps I'm misreading it but they don't state what classification the other 86 bugs were if not high severity. I wonder if a standard static analyser would have found them too.

Cool AI hype post, too bad reality is here. (Score:3)

by derplord ( 7203610 )

"omfg Claude found a ton of bugs! critical! high! buy our credits so you can find bugs from your own software! STOCK VALUE PUMP!"

Reality:

Success Rate: Claude attempted to write exploit code for these bugs. It produced 2 working exploits.

Real-World Viability: Zero. Anthropic’s own Red Team lead (Logan Graham) admitted these exploits only worked on a "test version" of the browser.

Mostly "assertion failures" (code that doesn't follow its own rules and crashes) and "logic errors" that traditional fuzzing (automated random input testing) had missed.

I wish Slashdot still had editors that actually understood what they're copy pasting.

Re: (Score:2)

by derplord ( 7203610 )

Oh yeah and the sandbox was turned off.

Such stocks! much wow!

Re: (Score:2)

by gweihir ( 88907 )

Indeed. And if the editors would recognize meaningless stunts, that would also be nice.

Re: (Score:3)

by ZipNada ( 10152669 )

Maybe it wasn't great at writing exploit code, but so what? Claude "found more high-severity bugs in Firefox than the rest of the world typically reports in two months, Mozilla said."

Good luck with that (Score:2)

by gweihir ( 88907 )

LLMs are absolute shit at spotting vulnerabilities and at proposing fixes. Yes, they find some things. But the more serious the vulnerabilities, the worse they perform. On CVE level they find almost nothing. And that is the level that counts.
