It's a myth that you need Mythos to find bugs: Open source models can do it just as well

(2026/04/24)

Reference: 1777030914
News link: https://www.theregister.co.uk/2026/04/24/ai_bugfinding_futures/

Black Hat Asia Open source models can find bugs as effectively as Anthropic's Mythos, according to Ari Herbert-Voss, CEO of AI-powered security startup RunSybil and OpenAI's first security hire.

Speaking at the Black Hat Asia conference in Singapore today, Herbert-Voss said Mythos excels at finding both "shallow" bugs - well-described flaws that are easy to validate - and more complex vulnerabilities.

In his talk, he attributed this to "supralinear scaling": where researchers assumed LLM capability would improve linearly, evidence now suggests a model trained on twice the data, compute, and time produces something four times more capable.
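
To make the arithmetic concrete, here is a minimal sketch that assumes capability follows a simple power law in resources, capability = resources ** k. That functional form is our assumption for illustration, not something Herbert-Voss stated, and the function names are hypothetical. Under it, the quoted figures (twice the resources, four times the capability) imply k = 2, where linear scaling would be k = 1.

```python
import math


def implied_exponent(resource_multiplier: float, capability_multiplier: float) -> float:
    """Solve capability_multiplier == resource_multiplier ** k for k.

    Assumes a pure power-law relationship, which is an illustrative
    assumption and not a claim made in the talk.
    """
    return math.log(capability_multiplier) / math.log(resource_multiplier)


def projected_capability(resource_multiplier: float, k: float) -> float:
    """Capability multiplier predicted by the assumed power law."""
    return resource_multiplier ** k


# Figures quoted in the article: 2x data/compute/time, 4x capability.
k = implied_exponent(2.0, 4.0)

# The linear assumption the article says researchers started from (k = 1).
linear_at_8x = projected_capability(8.0, 1.0)
supralinear_at_8x = projected_capability(8.0, k)

print(f"implied exponent: {k:.2f}")
print(f"8x resources, linear: {linear_at_8x:.0f}x, supralinear: {supralinear_at_8x:.0f}x")
```

Any exponent above 1 counts as supralinear; the value of 2 follows only from the single data point quoted above.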

He hinted supralinear scaling might produce even better multipliers but could not say more due to a non-disclosure agreement.

Anthropic has kept access to Mythos tightly restricted, citing fears of misuse.

However, Herbert-Voss argues attackers and defenders alike can achieve comparable results with open source models by building "scaffolding" to run several of them in harness. That approach also improves defense in depth, as different models tend to catch different flaws, a useful hedge against any single model's blind spots.
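
The talk did not describe RunSybil's scaffolding in detail, so the following is only a rough sketch of the general idea: run several scanners over the same code and merge their findings, recording which scanner reported what. Everything here is hypothetical, including the `Finding` type, `run_harness`, and the two stand-in "models", which return canned results instead of calling a real LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Finding:
    """One reported flaw; frozen so it can be used as a dict key."""
    path: str
    line: int
    kind: str


# A scanner takes source text and returns the flaws it believes it found.
Scanner = Callable[[str], List[Finding]]


def run_harness(source: str, scanners: Dict[str, Scanner]) -> Dict[Finding, Set[str]]:
    """Run every scanner and merge results, tracking who reported each finding."""
    merged: Dict[Finding, Set[str]] = {}
    for name, scan in scanners.items():
        for finding in scan(source):
            merged.setdefault(finding, set()).add(name)
    return merged


# Hypothetical stand-ins for open source models: each has its own blind spot.
def model_a(source: str) -> List[Finding]:
    return [Finding("auth.c", 42, "buffer-overflow"),
            Finding("auth.c", 97, "use-after-free")]


def model_b(source: str) -> List[Finding]:
    return [Finding("auth.c", 42, "buffer-overflow"),
            Finding("net.c", 10, "integer-overflow")]


report = run_harness("/* source under test */", {"model_a": model_a, "model_b": model_b})

for finding, reporters in sorted(report.items(), key=lambda kv: (kv[0].path, kv[0].line)):
    print(f"{finding.path}:{finding.line} {finding.kind} reported by {sorted(reporters)}")
```

In this toy run the merged report holds three distinct findings, although neither stand-in reports more than two on its own. Findings flagged by more than one scanner are also natural candidates for a human to review first.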

Cost is another driver. Mythos is expensive to build and run, and may never be publicly available, making open source alternatives not just viable but necessary for many organizations.

[5]Weak security means attackers could disable all of a city's public EV chargers

[6]EV charger biz ELECQ zapped by ransomware crooks, customer contact data stolen

[7]Hybrid clouds have two attack surfaces and you’re not paying enough attention to either

[8]Mythos found 271 Firefox flaws – but none a human couldn’t spot

Herbert-Voss feels human expertise is still needed to orchestrate open source models so they together deliver Mythos-grade performance, and to assess the bug reports AI generates.

He then noted that [9]fuzzing, the testing technique that injects random or near-random data into software to see if doing so produces bugs, also creates so many warnings that it can make extra work for humans.
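
For readers who have not met fuzzing, here is a deliberately tiny sketch of the technique and of the noise problem described above. The target `parse_record` is a made-up parser with one planted bug, and `fuzz` is a hypothetical helper; production fuzzers are far more sophisticated (coverage guidance, input mutation, crash deduplication).

```python
import random
from collections import Counter
from typing import Callable, List, Tuple


def parse_record(data: bytes) -> int:
    """Toy target: first byte declares a length, the rest is the payload."""
    if not data:
        return 0
    length = data[0]
    payload = data[1:]
    # Planted bug: the declared length is trusted without a bounds check.
    return payload[length - 1] if length else 0


def fuzz(target: Callable[[bytes], int], runs: int, seed: int = 0) -> List[Tuple[bytes, str]]:
    """Feed random byte strings to the target and record every crash."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    crashes: List[Tuple[bytes, str]] = []
    for _ in range(runs):
        size = rng.randrange(0, 8)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except Exception as exc:
            crashes.append((data, type(exc).__name__))
    return crashes


crashes = fuzz(parse_record, runs=1000)
by_type = Counter(name for _, name in crashes)

print(f"crashing inputs: {len(crashes)}")
print(f"distinct exception types: {dict(by_type)}")
```

Every crash here traces back to the same missing bounds check, yet each crashing input arrives as a separate warning. Someone still has to work out that they share one root cause.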

AI bug-hunters already produce the same problem, and he expects it will persist.

Herbert-Voss therefore thinks infosec workers will have plenty on their plates for the foreseeable future. He also expects the economic incentive to use AI – someone's got to use the services that pay for all those GPUs and datacenters – to act as a forcing function that pushes infosec teams to adopt AI and, as a result, improve their proactive and defensive work. ®

[5] https://www.theregister.com/2026/04/24/rentable_iot_security_flaws/

[6] https://www.theregister.com/2026/03/09/ransomware_crooks_hit_ev_charger/

[7] https://www.theregister.com/2026/04/23/wac_flaws_hybrid_cloud_security/

[8] https://www.theregister.com/2026/04/22/mozilla_firefox_mythos_future_defenders/

[9] https://www.theregister.com/2022/09/08/google_fuzz_rewards/

MonkeyJuice

"evidence now suggests a model trained on twice the data, compute, and time produces something four times more capable." [Citation Needed].

An awful lot of ink has been spilt and yet very little has been said. Must be an LLM startup.

Mostly agree

Anonymous Coward

I like Ari (Ariel) Herbert-Voss' stuff, like her DEFCON 27 (2019) talk [1]Don’t Red Team AI Like a Chump, which has a cool bit on fooling AI surveillance at 14:50, reminiscent of [2]this 2019 ElReg piece. But I'm not quite sure about this "supralinear scaling" of capabilities she hinted at ... and like many I'm rather worried about the cantilevered scaffoldings of antigravity girdle harnesses she speaks of here (openClaws of the RotM).

The AI (so-called) 'scaling laws' look to have ended way faster than Moore's in LLM space and I have to think the same sort of leveling plateau is to emerge within coding space as well -- rather than a singularity of wishful thinking that hopes for cognition to come out of looping ever longer conversational interactions between talkative middlething agents.

But yeah, we've seen since [3]Mohan that if CVEs are available, an Opus and $3,000 can get you some ways to putting together an exploit chain longer than 2 (which looks to have been a limit even for straight-up coding with the likes of Codex-12B in 2021, eg. [4]Figure 11 in 'Evaluating Large Language Models Trained on Code', on which she's co-author) ...

[1] https://cyber.harvard.edu/story/2019-11/dont-red-team-ai-chump

[2] https://www.theregister.com/2019/11/04/tshirt_ai_cameras/

[3] https://www.theregister.com/2026/04/17/claude_opus_wrote_chrome_exploit/

[4] https://arxiv.org/abs/2107.03374
