
eBPF. It doesn't stand for anything. But it might mean bank

(2025/03/09)


Meta says it has managed to reduce the CPU cycles of its top services by 20 percent through its Strobelight profiling orchestration suite, which relies on the open source eBPF project.

Fans of the mass audience monetization biz may be delighted to learn that this translates to a 10 to 20 percent reduction in the number of servers required for Facebooking, Instagramming, WhatsApping, and whatever it is that people do with headsets and avatars.

eBPF doesn't stand for anything anymore. It used to be an acronym for extended Berkeley Packet Filter but its remit has expanded to the point that the alphabet jumble is no longer moored to a limiting identity.


The open source software, which has its own [2]foundation, provides a way to run sandboxed programs within the operating system kernel, for Linux and, as [3]a work-in-progress, for Windows. The idea is to run software relatively safely within a privileged kernel context without having to build and insert kernel modules, package the software as a driver, or recompile the kernel to include the desired functionality.


Running within the kernel is useful for service optimization, particularly at scale where small bottlenecks and inefficiencies can be amplified to great detriment. Collecting data across a diverse set of systems without degrading performance, such that the data is consistent and interpretable across multiple kernel versions, is not a trivial challenge.

Meta developed open source [6]Strobelight, which orchestrates a variety of profiling applications that utilize eBPF, to collect observability data: logs of system events, metrics that measure performance, traces of network connections. Its goal was to make its infrastructure more efficient, which reduces expenses and has operational advantages.


"eBPF allows the safe injection of custom code into the kernel, which enables very low overhead collection of different types of data and unlocks so many possibilities in the observability space that it’s hard to imagine how Strobelight would work without it," [8]said Meta software engineer Jordan Rome in January.

[9]Someone is slipping an eBPF backdoor into Juniper routers, activated by a magic packet

[10]CrowdStrike's Blue Screen blunder: Could eBPF have saved the day?

[11]Linux kernel's eBPF feature put to unexpected new uses

[12]Anatomy of suspected top-tier NSA BPF backdoor

Strobelight presently consists of 42 different profiling applications, [13]a number of arguable significance. These profilers measure memory, function call counts, events in various programming languages, AI GPU usage, service request latency, and so on.

As noted in the eBPF Foundation's recent [14]case study of Meta's stupendous server savings, 15,000 servers' worth of annual capacity were saved with a single one-character code change.

It was an ampersand (&). But it will be evaluated by Meta bean counters as a dollar sign.

According to Rome, "A seasoned performance engineer was looking through Strobelight data and discovered that by filtering on a particular std::vector function call (using the symbolized file and line number) he could identify computationally expensive array copies that happen unintentionally with the 'auto' keyword in C++."


After finding one of these costly array copies in the path of one of Meta's major ad services, the engineer determined that the vector copy wasn't intentional. So he added an "&" after the auto keyword [16]to turn the copy into a reference, which avoids unnecessary data duplication by pointing to the data rather than reproducing it.

"It was a one-character commit, which, after it was shipped to production, equated to an estimated 15,000 servers in capacity savings per year," said Rome.

One can only imagine the savings to be had from applying the delete character. ®





[2] https://ebpf.foundation/

[3] https://microsoft.github.io/ebpf-for-windows/


[6] https://github.com/facebookincubator/strobelight


[8] https://engineering.fb.com/2025/01/21/production-engineering/strobelight-a-profiling-service-built-on-open-source-technology/

[9] https://www.theregister.com/2025/01/25/mysterious_backdoor_juniper_routers/

[10] https://www.theregister.com/2024/09/26/grafana_labs_interview/

[11] https://www.theregister.com/2022/09/14/linux_ebpf/

[12] https://www.theregister.com/2022/02/23/chinese_nsa_linux/

[13] https://news.mit.edu/2019/answer-life-universe-and-everything-sum-three-cubes-mathematics-0910

[14] https://ebpf.foundation/case-study-metas-strobelight-leverages-ebpf-to-reduce-cpu-cycles-and-server-demands-by-up-to-20/


[16] https://blog.petrzemek.net/2017/12/08/when-auto-seemingly-deduces-a-reference-in-cpp/




Cost Savings

An_Old_Dog

... and then Meta passed some of the savings on to its customers and to the employee who actually did the work.

Meta Employee A: "Why is there a doughnut on my desk?"

Meta Employee B: "It's a symbol of employee appreciation from the Big Boss. Also, there's a Post-It note on your monitor."

Employee A (reading the note): "I wanted to take a moment to personally thank you for the work you did which will save this company millions of dollars each year. Keep up the great work! That's an order.

Also: to streamline our operations, your position has been cut as of the end of this month.

--Z."

Re: Cost Savings

Andy 68

Oh come on, they're not that bad.

He'd at least get a waffle party, surely?

Re: Cost Savings

Anonymous Coward

Devil's Advocate:

Do the golden geese keep laying eggs? Or are these events usually the result of a one-trick pony? Do developers that identify massive saves like this typically go on to deliver massive saves in the future, or do they generally "get lucky" that one time? Do other massive saves come from other employees profiling different code? Does this developer get moved to a different project to massive-save again? Was this just a quirk of the time/position?

How much of this was profiling (and who suggested/justified/ordered the profiling?) as opposed to software development prowess?

Meta, Microsoft, Amazon, Google et al. probably have the statistics on these sorts of things, but I've never seen it discussed generally.

All of this ignores the simple fact: who missed the optimization potential at the beginning, and how much pressure were they under to "get it done, not get it done well"?

Remarkable stuff!

HuBo

Yeah, Roger Booth had that as a C++ Gotcha on [1]Medium last year, with "'auto' and the &" exemplified by (the 1-char difference):

auto first_big = get_first(big_vec); // does possibly needless copy

auto& first_big = get_first(big_vec); // doesn't do needless copy

He suggested that C++17's [2]copy elision might help too.

Good thing though that Meta's " seasoned performance engineer " protagonist was profiling to the eBPF Strobelight, rather than, say, checking out chicks/dudes while doing [3]The Safety Dance , pogo, or suchlikes!

P.S. cool link under: " a number of arguable significance "

[1] https://medium.com/@rogerbooth/c-gotcha-unnecessary-copies-due-to-the-misuse-of-auto-ed24e65b5efd

[2] https://en.cppreference.com/w/cpp/language/copy_elision

[3] https://en.wikipedia.org/wiki/The_Safety_Dance

Ampersand

Anonymous Coward

So, that'll be making a function take a ref instead of passing a copy of the object.

Yup, know that one. The MFC code that came with the Microsoft VC6 compiler seemed to be written by someone totally unaware of passing by ref - let alone a const ref. Programs that did a lot of work with for example time series data, passing around CTime instances, could be sped up tenfold or a hundredfold just by editing the MFC sources, putting in lots and lots of ampersands and consts and recompiling it. That drove the point home.

In later years, away from MFC, could still always guarantee to save time & memory by doing the same to code written for us, to save copying some complicated type built out of Boost/STL where they did not know the size of the structures and cost of copying. Worse, never profiled to find out. And you don't need to resort to eBPF to profile most programs.

And, no, this does not prove that C++ is bad. Many languages provide both call-by-reference (ref or const ref) and call-by-value (make a copy you can modify without affecting the caller), and the same mistakes are possible in all of them.

"Life would be much simpler and things would get done much faster if it
weren't for other people"
-- Blore