
Sixteen AI Agents Built a C Compiler From Scratch (arstechnica.com)

(Monday February 09, 2026 @05:45PM (msmash) from the for-what-it's-worth dept.)


Anthropic researcher Nicholas Carlini set 16 instances of Claude Opus 4.6 loose on a shared codebase over two weeks to build a C compiler from scratch, and the AI agents [1]produced a 100,000-line Rust-based compiler capable of building a bootable Linux 6.9 kernel on x86, ARM and RISC-V architectures.

The project ran through nearly 2,000 Claude Code sessions and cost about $20,000 in API fees. Each instance operated inside its own Docker container, independently claiming tasks via lock files and pushing completed code to a shared Git repository. No orchestration agent directed traffic. The compiler achieved a 99% pass rate on the GCC torture test suite and can compile major open source projects including PostgreSQL, SQLite, Redis, FFmpeg and Doom. But it lacks a 16-bit x86 backend and calls out to GCC for that step, its assembler and linker remain buggy, and it produces less efficient code than GCC running with all optimizations disabled.
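The summary only says that each instance claimed tasks "via lock files" and does not say how the locks were implemented. As a rough illustration of the general idea, and not the project's actual mechanism, here is a minimal Python sketch in which the first worker to create a lock file owns the task. The function names, the `.lock` suffix, and the task and agent names are all hypothetical, and it assumes every worker sees the same local directory.

```python
import os
import tempfile


def try_claim(lock_dir: str, task: str, agent: str) -> bool:
    """Try to claim a task by creating <task>.lock; True only for the first claimant."""
    path = os.path.join(lock_dir, task + ".lock")
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists,
        # so two workers cannot both succeed for the same task.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(agent)  # record who holds the claim (hypothetical convention)
    return True


def release(lock_dir: str, task: str) -> None:
    """Give the task back by deleting its lock file."""
    os.remove(os.path.join(lock_dir, task + ".lock"))


# Small demo with made-up task and agent names.
with tempfile.TemporaryDirectory() as demo_dir:
    print(try_claim(demo_dir, "fix-struct-layout", "agent-1"))
    print(try_claim(demo_dir, "fix-struct-layout", "agent-2"))
```

With no orchestrator, a scheme like this is the whole coordination layer: a worker that fails to claim a task simply moves on to the next unclaimed one.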

Carlini also invested significant effort building test harnesses and feedback systems to keep the agents productive, and the model hit a practical ceiling at around 100,000 lines as bug fixes and new features frequently broke existing functionality.



[1] https://arstechnica.com/ai/2026/02/sixteen-claude-ai-agents-working-together-created-a-new-c-compiler/



why multiple agents? (Score:2)

by JustNiz ( 692889 )

Wouldn't it give better results if just one did it?

Re: (Score:3)

by Sique ( 173459 )

Because it would have taken 16 times the time, 32 weeks instead of 2 weeks.

Re: (Score:2)

by FalcDot ( 1224920 )

First of all, why would it give better results? Agents do not have the capacity to understand how one part of the code interacts with another.

Also, I'm guessing it would have taken 32 weeks with one single agent. Probably not interesting to anyone willing to foot the bill.

Re: (Score:3)

by Gilgaron ( 575091 )

The longer they run the more they tend to go off the rails (kinda like going rampant in sci fi i guess?) so the latest stuff has involved using agents to split out tasks to other agents and QC the results and strap everything together. But that's not too surprising since it is how humans get complicated things done, too, versus getting one guy to do the whole damn thing at once.

Re: (Score:2)

by OrangeTide ( 124937 )

So a bit like a large team of developers. We go off the rails too without supervision. Kind of why you need to collect status updates and maintain a schedule and project plan and architectural documents.

Successful open source projects work because the ones who succeed have the discipline to do the coordination and supervision in a more cooperative and distributed way. Top-down programming projects are a hot mess, like if you do Waterfall management without actually writing a plan and sticking to it.

Agile ta

Re: (Score:2)

by AcidFnTonic ( 791034 )

Sorry but the one guy model works amazingly well when all the variables align to positive values.

Kinda like the optimism vs realism debate of a single king vs democratic rule which fails to account for how great it could be if that single king was good and just and instead the debate always favors padding the luck into accepted mediocrity.

There are many projects where a single person is essentially the lifeblood and if they vanish it has little hope left. Many of them from what I have seen are corporate clo

Re: (Score:2)

by Waffle Iron ( 339739 )

In just the Linux example, it is most certainly not the "one guy model". There have been countless thousands of contributors.

Linus is great and all, but he's currently just doing coordination between the members of a very large hierarchical team. No one human could possibly do all of that work.

Re: (Score:2)

by Gilgaron ( 575091 )

One guy might be more critical than the others (e.g. the human monitoring the agents in this case) but certainly Carmack could not have gotten DOOM into your hands entirely single handedly, either, there'd have been lawyers, and accountants, and other more interchangeable folks that were nonetheless necessary.

Re: (Score:2)

by Junta ( 36770 )

These models aren't parallel in their nature, so spinning up multiple 'agents' is just how you manage concurrency. Just like forking a process. I think it's a bit weird to call each instance a distinct 'agent'; it is trying too hard to humanize this very synthetic thing...

Re: (Score:1)

by unfriendlyLLM ( 10459763 )

Early attempts to describe "what the web is" or "how the web works" settled on something of a paralleling/concurrency/difference engine. I'm not understanding how current end user LLMs lend any capabilities to web end user functions. Less processing power was likely used by each instance than it takes to play a video game, to give an example.

Re: (Score:2)

by dskoll ( 99328 )

For architectures that lack working hardware, wouldn't it make more sense to take an existing front-end like gcc or clang and just generate a back-end for the targeted hardware?

Re: (Score:2)

by Junta ( 36770 )

Who is 'we'? I don't think the LLM is up to the task of making a C compiler that can target an architecture with at most 256k of RAM and without a reference C compiler to work with, like it did for this. The LLM basically got told to write a knock off of gcc, and based on some of what happened, it absolutely needed a working gcc to work from to create the sort-of knock off.

Re: (Score:2)

by Brain-Fu ( 1274756 )

When will 16 AI agents be able to code me up a Word processor with features equivalent to Microsoft Word?

Because once they can do that, people can stop buying Office and just vibe up their own versions. So long as the agents can implement standard file formats, the differences in implementations won't matter.

An interesting future is being teased here; one in which the only tech giants remaining will be the makers of AI, and everyone else will just vibe up all the software they now pay through the nose to g

Re: (Score:2)

by ArchieBunker ( 132337 )

The functionality you describe used to fit on a single floppy.

Re: (Score:2)

by Junta ( 36770 )

Sure, they carefully spend weeks crafting the test cases with test data and then spend tens of thousands and probably have to buy it a license of Word to use as a reference to compare test execution on the LLM output versus reference implementation...

Or they could just stop at having bought Word....

The problem here is that this example leaned *heavily* on the software desired already existing and the LLM having access to run the original software as it endeavored to make a knock-off. And then per analysis a

Re: (Score:2)

by Junta ( 36770 )

I mean, what next iteration?

I would also say that this seems the opposite of useful for no longer working hardware. If the hardware existed, then we still have a C compiler for it. If you say you want to modernize that compiler, but the hardware is a dead platform, why are you trying to modernize for it anyway?

Let me extrapolate to the point that maybe you are talking about a compiler for some future architecture. Problem is this example needed:

- A human to carefully craft test cases and rescue the LLM w

Re: (Score:1)

by SumDog ( 466607 )

This is why we can't have affordable RAM.

Re: (Score:3)

by OrangeTide ( 124937 )

It's a neat demonstration. I think if I had $20k to throw around, I'd do the opposite. Make a new 16-bit C compiler to support UZI or [1]Fuzix [fuzix.org]. (maybe not, might be more fun to write myself)

I've seen some decent results for retro programming, such as this vid [2] Let's Create a Commodore C64 BASIC Interpreter in VSCode! [youtube.com]. Where the presenter gets a Commodore/Microsoft BASIC in C, and not only that with some hand holding gets it to output something capable of working on 2.11BSD for PDP-11.

[1] https://www.fuzix.org/

[2] https://www.youtube.com/watch?v=PyUuLYJLhUA

Re: (Score:2)

by DamnOregonian ( 963763 )

As an owner of the Dragon Book, and who wrote a mostly-C compiler for my PDP-11/35 (BSD?! hah!)

Writing a C compiler isn't what I would call fun. Debugging generated code and then debugging why you generated that broken generated code is a whole different circle of hell.

Re: (Score:2)

by OrangeTide ( 124937 )

It was pretty fun writing a Pascal for the pcode machine back in the day. And fun to make a couple of pcode to assembler translators, which shows the rather tricky problem of optimizing register allocation in a stack machine, but fun.

Porting and updating existing compilers like PCC and the Plan9 C compilers is also not too bad, rather enjoyable really. Because much of the hair pulling and crying is over. But these days I just patch LLVM, which is so huge and complex that I'm unlikely to understand more than

Re: (Score:2)

by DamnOregonian ( 963763 )

> It was pretty fun writing a Pascal for the pcode machine back in the day. And fun to make a couple of pcode to assembler translators, which shows the rather tricky problem of optimizing register allocation in a stack machine, but fun.

You're cut from a different cloth than me, that's for sure.

> Porting and updating existing compilers like PCC and the Plan9 C compilers is also not too bad, rather enjoyable really. Because much of the hair pulling and crying is over. But these days I just patch LLVM, which is so huge and complex that I'm unlikely to understand more than the tiny little piece I need for my job.

PCC is a fantastic learning tool for anyone interested in writing a compiler. I wish it had been available to me at the time.

LLVM.... I don't even want to think about.

It's been a decade and a half since I had to deal with anything on the compiler side of things. These days, I'm just happy if the always-in-flux kernel API doesn't break a module I have to maintain.

Re: (Score:2)

by 0123456 ( 636235 )

Don't forget that a huge amount of work for a competent compiler is performance enhancements for different CPUs. I'm guessing it probably doesn't have those.

I know a guy that did that over a weekend (Score:2)

by ebunga ( 95613 )

And all he needed were a couple of pizzas.

Re: (Score:2)

by GoTeam ( 5042081 )

> And all he needed were a couple of pizzas.

Cocaine was involved. Almost a 100% guarantee.

Re: (Score:2)

by ebunga ( 95613 )

Nah, he was just bored and didn't have anything better to do.

Were they all working the code (Score:1)

by jennatalia ( 2684459 )

or did a large number of them work on it with a few managers to keep things in line?

Re: (Score:2)

by SirSlud ( 67381 )

No orchestration agent directed traffic.

Cannot even build hello world (Score:1)

by ultranerdz ( 1718606 )

[1]https://github.com/anthropics/... [github.com]

[1] https://github.com/anthropics/claudes-c-compiler/issues/1

Re: (Score:1)

by SumDog ( 466607 )

If you read through the issues, it looks like he was missing the include directory.

Doesn't sound like "from scratch" to me! (Score:5, Insightful)

by crgrace ( 220738 )

From TFA:

> When all 16 agents got stuck trying to fix the same Linux kernel bug simultaneously, he used GCC as a reference oracle, randomly compiling most kernel files with GCC and only a subset with Claude's compiler, so each agent could work on different bugs in different files.

Not only was Claude trained on a lot of different C compilers, including the entirety of GCC, it still needed a golden model in order to finish.

If I claimed I had written my own C compiler from scratch and then this came out in an interview, I don't think I would be hired.
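For anyone wondering what that quoted trick amounts to in practice, here is a minimal Python sketch of the file split it describes. The article excerpt gives no implementation details, so everything here is an assumption: the function name, the per-agent seed, the 20% fraction, and the file names are hypothetical.

```python
import random


def split_files(files, fraction, seed):
    """Pick a random subset to build with the compiler under test.

    Returns (under_test, reference): the files for the new compiler and the
    files left to the trusted reference compiler (GCC in the article).
    """
    ordered = sorted(files)
    rng = random.Random(seed)  # a different seed per agent gives a different subset
    k = max(1, round(len(ordered) * fraction))
    chosen = set(rng.sample(ordered, k))
    under_test = [f for f in ordered if f in chosen]
    reference = [f for f in ordered if f not in chosen]
    return under_test, reference


# Hypothetical file list; a real kernel build has thousands of files.
files = [f"kernel/file{i}.c" for i in range(10)]
under_test, reference = split_files(files, fraction=0.2, seed=7)
print(under_test)
print(reference)
```

If a build made this way fails or misbehaves, the fault is presumably in one of the few files in `under_test`, since everything else came from the reference compiler. That narrows the search, and it is also why the setup needs a working reference compiler to exist in the first place.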

Re: (Score:3)

by 93 Escort Wagon ( 326346 )

Yeah, perhaps an extra step for the analysis would be to run comm against this "from scratch" compiler versus gcc and clang.

Re: (Score:1)

by crgrace ( 220738 )

What strawman? I was quoting the article.

Re: (Score:2, Troll)

by Zero__Kelvin ( 151819 )

"If I claimed I had written my own C compiler from scratch and then this came out in an interview, I don't think I would be hired." At no time did the researchers claim that "from scratch" meant that it wasn't acceptable to use any reference materials or implementations as examples. The compiler was implemented in Rust and GCC is primarily C++, so yes it was implemented "from scratch" in Rust. Of course, you need to try to claim that the use of said phrase is the important point and argue against that to try

Re: (Score:2)

by Junta ( 36770 )

One thing that LLMs are pretty good at is translation, and training on a large corpus of Rust and C code as well as numerous C compilers means that this task is utterly useless given the context of the existing compilers, and trying to prove it can make projects "from scratch" is disingenuous when you are just having it implement very well trodden territory.

Hell, someone posted an experiment in writing a C compiler in rust to github at least once before, so it even had a C compiler written in rust that was

Re: (Score:2)

by Zero__Kelvin ( 151819 )

Nothing you wrote was original. Every one of those words is available on the internet. Therefore, by your argument, you didn't write that post "from scratch." Your post is dangerous, because some business leaders might conclude that you came up with that post on your own and think that you are capable of critical thinking, when it was clearly "a trick" designed to make you sound intelligent. Also, you clearly did zero research on what was done and how it was done, and made absolutely no attempt to have a

Re: (Score:2)

by thegarbz ( 1787294 )

The beginning is a variable that isn't definitive. We can make this example as absurd as you want. Does from scratch preclude the ability of using any existing libraries? Does it mean we need to do everything in assembly? Does it mean we need to first create the universe?

Doing something from scratch does not typically preclude the use of reference material. That said in this case the issue seems that it was more than just reference.

Re: (Score:2)

by Zero__Kelvin ( 151819 )

Have you ever made a cake from scratch? This is what happens when you rely on Google for all of your knowledge. People make things from scratch and follow recipes to do it all the time. They also reference pictures in recipe books, etc. You don't have to grind your own flour to make a cake from scratch, nor do you have to remember the process without referring to implementations or design your own recipe. Again, arguing about the term is intended to take away from the discussion. It is a red herring use

Re: (Score:3)

by Waffle Iron ( 339739 )

> Not only was Claude trained on a lot of different C compilers, including the entirety of GCC, it still needed a golden model in order to finish.

To be fair, they were going to use the ANSI C23 specification as a basis, but nobody wanted to shell out the money to buy an official copy.

Yep, nothing llms do is from scratch (Score:1)

by rsilvergun ( 571051 )

They are really really super advanced search engines. They are only as good as the data set they are stealing from. It is hardly a surprise that an llm could cobble together a c compiler.

Re: (Score:2)

by TheDarkMaster ( 1292526 )

In fact the agents simply copied fragments of C compilers from the internet until they ended up with something that (barely) works. I would consider it noteworthy if they only had the specification of what the compiler would have to do and no ready-made compilers to use as examples.

Ok? (Score:3, Insightful)

by Anonymous Coward

I mean, this is impressive in the same way that turning vinyl gloves into hot sauce is impressive. Fabrice Bellard wrote TCC in a few months and it was under 30k lines of code, including an assembler and linker, and it actually worked (the Claude C compiler apparently cannot compile Hello World, so it undoubtedly has hundreds of bugs) - I am a little terrified of what that 100k lines of code actually looks like. If my past experience with coding AI is any guide, it will be disorganized, needlessly wordy, not standards-compliant, riddled with weird unnecessary bugs around leap years, lacking in useful comments, etc.

Which is to say while this is impressive, I am far more worried about what this means for the future of software quality than I am impressed with this achievement. Look at the crazy number of bugs Windows has had recently, coincidentally after MSFT bragged about how much code was being written by AI these days.

Yeah nah (Score:2, Insightful)

by Willeh ( 768540 )

Plagiarized*

A C compiler written in Rush (Score:4, Informative)

by Valgrus Thunderaxe ( 8769977 )

Please tell me why I shouldn't just commit suicide, right now?

Re: (Score:3)

by GoTeam ( 5042081 )

> Please tell me why I shouldn't just commit suicide, right now?

It's a lot of work, just let time do the hard part.

Re: (Score:2)

by noshellswill ( 598066 )

The *.ai constructed C-compiler demonstrated mediocre performance. Nobody would pay a nickel for it. You have a lifetime of hard work ahead of you, cleaning up the mess that LLMs have generated. That granted, current computer pervs are not 1/2 the issue. Excepting medicine, people hugely over-rate the industrial revolution and all its toxic follow-ons.

Re: A C compiler written in Rush (Score:4, Informative)

by PCM2 ( 4486 )

Because it has sick drumming, dude!

AI generated C compiler - not so great (Score:3, Insightful)

by ameline ( 771895 )

Apparently the performance of the code it produces is worse than gcc with all optimizations turned off. That's pathetic. And it is written in rust, so it can't bootstrap - And the size of it is on par with a hand-written C compiler - so no win there.

When I worked on compilers at IBM, our first bootstrap of C for x86 was a cause for celebration. A double bootstrap was a great "smoke test". But that was relatively easy. Passing the validation test suites was *way* harder - like 50x harder.

Re: (Score:3)

by Pascoea ( 968200 )

Being fast, being able to bootstrap, and a size limitation weren't the goals. The goal was "can it do it", which it appears the answer is "yeah, kinda". Vanishingly little of my hobby work output would pass muster against something professionally produced, but that's not the point. I just wanted to do it myself. That seems to mirror this project. I assume they picked GCC because they wanted a sufficiently complicated project that had well-defined borders. The experiment could have picked anything, li

Re: (Score:2)

by thegarbz ( 1787294 )

> so no win there.

I mean they did it in a short time for $20k. Performance here doesn't just mean execution speed. You're comparing it against a 39 year old project that has millions of hours contributed to it. The point of this wasn't to create a high performance GCC alternative, it was to demonstrate functionality and speed.

I doubt AI coding assistants will ever produce something as performant as experts who have been optimising shit for over a decade do, but some people don't have a decade to spare.

Fun quote.. (Score:2, Insightful)

by Junta ( 36770 )

> . But that total is a fraction of what it would cost me to produce this myself—let alone an entire team.

I'm willing to commit to provide a C compiler in a single day for a tenth of the cost:

# dnf install gcc

For a bonus, I'll even do two:

# dnf install clang

Don't know what I'll do with the other 7.99 hours of the day though...

This reminds me of how Khaby Lame mocked all those overcomplicated "life hacks" by doing the obvious simple things.

Big Deal (Score:1)

by pngwen ( 72492 )

It took just one Dennis Ritchie to do that back in the 70s, and he used much less water in the process!

Source code (Score:2)

by phantomfive ( 622387 )

The source code is available online if you want to look at it: [1]https://github.com/anthropics/... [github.com]

[1] https://github.com/anthropics/claudes-c-compiler
