Cloudflare Raves About Performance Gains After Rust Rewrite (cloudflare.com)
- Reference: 0179935948
- News link: https://developers.slashdot.org/story/25/11/02/0529258/cloudflare-raves-about-performance-gains-after-rust-rewrite
- Source link: https://blog.cloudflare.com/20-percent-internet-upgrade/
And yes, Rust was involved:
> We write a lot of Rust, and we've gotten pretty good at it... We built FL2 in Rust, on Oxy [Cloudflare's [1]Rust-based next generation proxy framework], and built a strict module framework to structure all the logic in FL2... Built in Rust, [Oxy] eliminates entire classes of bugs that plagued our Nginx/LuaJIT-based FL1, like memory safety issues and data races, while delivering C-level performance. At Cloudflare's scale, those guarantees aren't nice-to-haves, they're essential. Every microsecond saved per request translates into tangible improvements in user experience, and every crash or edge case avoided keeps the Internet running smoothly. Rust's strict compile-time guarantees also pair perfectly with FL2's modular architecture, where we enforce clear contracts between product modules and their inputs and outputs...
>
> It's a big enough distraction from shipping products to customers to rebuild product logic in Rust. Asking all our teams to maintain two versions of their product logic, and reimplement every change a second time until we finished our migration was too much. So, we implemented a layer in our old NGINX and OpenResty based FL which allowed the new modules to be run. Instead of maintaining a parallel implementation, teams could implement their logic in Rust, and replace their old Lua logic with that, without waiting for the full replacement of the old system.
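To make the "clear contracts between product modules and their inputs and outputs" idea concrete, here is a minimal sketch of how such a contract could be expressed with Rust's type system. Every name in it (`Module`, `RequestInfo`, `Verdict`, `BlockEmptyUserAgent`) is hypothetical and chosen for illustration; the blog post does not show FL2's actual module API.

```rust
// Hypothetical sketch: none of these names come from Cloudflare's code.

/// The only request data this example module is allowed to read.
pub struct RequestInfo {
    pub user_agent: Option<String>,
}

/// The only thing this example module is allowed to produce.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Allow,
    Block { reason: &'static str },
}

/// A module's contract: one declared input type and one declared output type.
/// The compiler rejects a module that tries to read or return anything else.
pub trait Module {
    type Input;
    type Output;
    fn run(&self, input: &Self::Input) -> Self::Output;
}

/// Example module: blocks requests with a missing or blank User-Agent.
pub struct BlockEmptyUserAgent;

impl Module for BlockEmptyUserAgent {
    type Input = RequestInfo;
    type Output = Verdict;

    fn run(&self, input: &RequestInfo) -> Verdict {
        match input.user_agent.as_deref() {
            Some(ua) if !ua.trim().is_empty() => Verdict::Allow,
            _ => Verdict::Block { reason: "missing user agent" },
        }
    }
}
```

Because the input and output are associated types, a caller that wires modules together has to supply exactly the declared input, which is one plausible way to get the compile-time checking the post describes.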
Over 100 engineers worked on FL2 — and there was extensive testing, plus a fallback-to-FL1 procedure. But "We started running customer traffic through FL2 early in 2025, and have been progressively increasing the amount of traffic served throughout the year...."
> As we described at the start of this post, FL2 is substantially faster than FL1. The biggest reason for this is simply that FL2 performs less work [thanks to filters controlling whether modules need to run]... Another huge reason for better performance is that FL2 is a single codebase, implemented in a performance focussed language. In comparison, FL1 was based on NGINX (which is written in C), combined with LuaJIT (Lua, and C interface layers), and also contained plenty of Rust modules. In FL1, we spent a lot of time and memory converting data from the representation needed by one language, to the representation needed by another. As a result, our internal measures show that FL2 uses less than half the CPU of FL1, and much less than half the memory. That's a huge bonus — we can spend the CPU on delivering more and more features for our customers!
>
> Using our own tools and independent benchmarks like CDNPerf, we measured the impact of FL2 as we rolled it out across the network. The results are clear: websites are responding 10 ms faster at the median, a 25% performance boost. FL2 is also more secure by design than FL1. No software system is perfect, but the Rust language brings us huge benefits over LuaJIT. Rust has strong compile-time memory checks and a type system that avoids large classes of errors. Combine that with our rigid module system, and we can make most changes with high confidence...
>
> We have long followed a policy that any unexplained crash of our systems needs to be [2]investigated as a high priority. We won't be relaxing that policy, though the main cause of novel crashes in FL2 so far has been due to hardware failure. The massively reduced rates of such crashes will give us time to do a good job of such investigations. We're spending the rest of 2025 completing the migration from FL1 to FL2, and will turn off FL1 in early 2026. We're already seeing the benefits in terms of customer performance and speed of development, and we're looking forward to giving these to all our customers.
>
> After that, when everything is modular, in Rust and tested and scaled, we can really start to optimize...!
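The post credits most of the speedup to FL2 doing less work, with filters controlling whether a module needs to run at all. A rough sketch of that pattern follows, assuming a cheap per-module predicate checked before the module's logic. The `Request`, `Module`, `run_pipeline`, and `example_modules` names are illustrative assumptions, not FL2's real interfaces.

```rust
// Hypothetical sketch of filter-gated modules; not FL2's actual design.

pub struct Request {
    pub path: String,
    pub is_https: bool,
}

pub struct Module {
    pub name: &'static str,
    /// Cheap predicate deciding whether the module needs to run at all.
    pub filter: fn(&Request) -> bool,
    /// The (potentially expensive) module logic.
    pub run: fn(&mut Request),
}

/// Runs only the modules whose filter matches, in order, and returns the
/// names of the modules that actually executed.
pub fn run_pipeline(modules: &[Module], req: &mut Request) -> Vec<&'static str> {
    let mut executed = Vec::new();
    for module in modules {
        if (module.filter)(req) {
            (module.run)(req);
            executed.push(module.name);
        }
    }
    executed
}

/// Two toy modules used to exercise the pipeline.
pub fn example_modules() -> Vec<Module> {
    vec![
        Module {
            name: "api_rewrite",
            filter: |req: &Request| req.path.starts_with("/api/"),
            run: |req: &mut Request| req.path = req.path.replacen("/api/", "/v1/", 1),
        },
        Module {
            name: "https_upgrade",
            filter: |req: &Request| !req.is_https,
            run: |req: &mut Request| req.is_https = true,
        },
    ]
}
```

With these toy modules, an HTTPS request for `/api/users` only triggers `api_rewrite`, and a plain HTTP request for `/index.html` only triggers `https_upgrade`; the skipped module costs one predicate call.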
Thanks to long-time Slashdot reader [3]Beeftopia for sharing the article.
[1] https://blog.cloudflare.com/introducing-oxy/
[2] https://blog.cloudflare.com/however-improbable-the-story-of-a-processor-bug/
[3] https://www.slashdot.org/~Beeftopia
Rust's faster than Lua, what a surprise (Score:3)
I mean, come on. Everybody knows that. They could've implemented the Lua parts in C as well, and then compare performance.
Re: (Score:2)
Yes, but that would have been a silly thing to do. Performance is not the only point here.
Re: (Score:2)
> I mean, come on. Everybody knows that. They could've implemented the Lua parts in C as well, and then compare performance.
According to the article they used LuaJIT. I would not be surprised in their use case to get basically equivalent C performance.
They did state the main reason for the better performance: the new implementation has less logic, and by using Rust cohesively rather than mixing it with C/Lua components there is no need for "translation layers" between languages anymore.
They could have achieved the same by consolidating to C, but of course Rust brings additional important advantages for them.
I wonder if that's why it's been sucking less... (Score:2)
I have noticed it not being quite the latency fiend it used to be. I wonder if it's related or maybe someone just unkinked the Cat5 at the ISP by coincidence.
Re: (Score:3)
Nah, they switched to using [1]these cables [eevblog.com]. That's a real photo, not AI-generated.
[1] https://www.eevblog.com/forum/dodgy-technology/directional-grain-speaker-wires/?action=dlattach;attach=2687741;image
Captchas just got a whole lot faster (Score:2)
and the ubiquitous surveillance of large swathes of the internet as well. Woohoo!
Except when sites are going down (Score:2)
Does "better performance" include inducing major outages or are those conveniently ignored?
Rust is great but... (Score:4, Insightful)
The headline is misleading. Rust has nothing to do with the performance gains. They rewrote an entire system using what they learned from the previous version. Sometimes this is the right thing to do. It's not always easy to predict, but when folks start raving about the rewrite then it was probably a good decision.
Re: Rust is great but... (Score:2)
I agree, most performance issues are due to a misdirected architecture, not language.
Not all architectural performance bottlenecks were a problem initially, but as the system grows they become more and more noticeable.
Re: Rust is great but... (Score:2)
People who say this usually haven't tried to write simultaneously multithreaded and concurrent applications in a systems language. Shit, rust makes it easier to do that than even "easy" higher level languages that were specifically designed for it from the beginning, like go.
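As a small illustration of the kind of help being described here, the sketch below shares a counter across scoped threads from the standard library. The function name `count_in_parallel` is made up for this example, and it only shows the general mechanism (the compiler forces shared mutable state behind a synchronization type), not anything from Cloudflare's code.

```rust
use std::sync::Mutex;
use std::thread;

/// Increments one shared counter from several threads and returns the total.
/// The counter has to sit behind a Mutex (or be an atomic): giving every
/// thread its own `&mut u64` to the same value is rejected at compile time.
pub fn count_in_parallel(threads: usize, per_thread: u64) -> u64 {
    let total = Mutex::new(0u64);
    // Scoped threads may borrow `total` because the scope joins them all
    // before it returns.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    *total.lock().unwrap() += 1;
                }
            });
        }
    });
    total.into_inner().unwrap()
}
```

Removing the `Mutex` and mutating a plain `u64` from each spawned closure fails to compile, which is the data-race protection the article's quote refers to.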
Re: Rust is great but... (Score:2)
Ok so they didn't know how to make the changes to make it go faster without rust. That just means they are inept, not that rust made it go faster.
Re: (Score:2)
You're that idiot who can't even get python right despite years of experience with it. The only person in the world you have any room to call inept is angelosphere. Any idiot can write a multithreaded application, but having them actually work at scale is a whole other matter. You wouldn't understand the reason for that because, as you already openly admitted to, you rely entirely on fixing bugs only after your ship has already sunk.
Re: Rust is great but... (Score:2)
I just don't go out of my way to be overly pedantic with Python because that takes away the major advantage of Python, that it can be done more quickly than anything else. And maybe you won't believe me, but my stuff works fine and without bugs. This is why I like Python, because it can get straight to where you are going without running in circles.
Re:Rust is great but... (Score:4, Interesting)
> The headline is misleading.
At a minimum, the headline is certainly (and perhaps intentionally?) ambiguous. The "summary" - which probably includes the entire blog post - does make it pretty obvious that rust was not the reason for the speedup, although their choice of rust certainly makes prima facie sense.
I'm a little surprised that replacing a bunch of old disparate software that's basically hacked together with a scripting language ([1]obligatory xkcd [xkcd.com]) with a new custom compiled job only resulted in a 25% speed-up.
[1] https://xkcd.com/224/
Re: (Score:2)
Came here to say the same thing. There have been several claims of the superiority of language or methodology X which, on closer examination, turned out to be caused by a rewrite of some old cobbled-together mess accumulated over a 20-year period with a new, properly-designed replacement. You could have replaced it with something new written in Visual Basic and seen an improvement.