Red Hat Developing "eu-stacktrace" For Profiling Without Frame Pointers
- Reference: 0001470086
- News link: https://www.phoronix.com/news/Red-Hat-eu-stacktrace
While last year we saw [1]Fedora stop omitting the frame pointer to help with debugging/profiling of Fedora packages, and [2]Ubuntu 24.04 LTS also enabled frame pointers for better debugging/profiling, among other distributions, there are known performance implications of no longer omitting the frame pointer. But now, aiming for the best of both worlds, it turns out Red Hat has been developing eu-stacktrace as a new means of profiling without relying on frame pointers.
Serhei Makarov with Red Hat announced his work today on system-wide profiles of binaries without frame pointers. The experimental eu-stacktrace tool relies on the elfutils toolkit's unwinding libraries so that a sampling profiler can unwind stack sample data from binaries built without frame pointers.
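For readers unfamiliar with why frame pointers matter to a profiler, here is a minimal toy sketch in Python of frame-pointer unwinding. It is not code from eu-stacktrace or Sysprof. The memory layout is a simplified x86-64 style model, and every address, along with the function name unwind_with_frame_pointers, is made up for illustration.

```python
# Toy model of frame-pointer unwinding (simplified x86-64 style layout).
# All addresses and values below are hypothetical, chosen only for illustration.

def unwind_with_frame_pointers(memory, pc, fp, max_depth=64):
    """Return the code addresses on the call stack, innermost frame first."""
    trace = [pc]
    while fp != 0 and len(trace) < max_depth:
        saved_fp = memory[fp]          # caller's frame pointer stored at [fp]
        return_addr = memory[fp + 8]   # return address stored at [fp + 8]
        trace.append(return_addr)
        fp = saved_fp                  # step out to the caller's frame
    return trace

# A fake stack with three frames; a saved frame pointer of 0 ends the chain.
memory = {
    0x7000: 0x7040, 0x7008: 0x401200,
    0x7040: 0x7080, 0x7048: 0x401100,
    0x7080: 0x0,    0x7088: 0x401000,
}

trace = unwind_with_frame_pointers(memory, pc=0x401300, fp=0x7000)
print([hex(addr) for addr in trace])
```

With frame pointers, each step is two memory reads. When the frame pointer is omitted this chain does not exist, so an unwinder has to consult per-binary call frame information such as eh_frame data to locate each caller, which is the heavier job that eu-stacktrace hands off to elfutils.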
Makarov wrote today on the Red Hat blog:
"The prototype version of eu-stacktrace consists of a command line tool implemented in a branch of the elfutils source repository and a patchset for the Sysprof profiler.
...
To give an initial idea of the CPU overhead of eu-stacktrace unwinding compared to Sysprof’s default mode of operation, I used Sysprof with and without eu-stacktrace to profile a system that was running the stress-ng "matrix" benchmark, invoked with stress-ng --matrix 0 -t 30s. On a system that was otherwise lightly loaded, using Sysprof with the default frame pointer profiling resulted in 0.09% of the samples coming from the sysprof-cli profiler process, while profiling with eu-stacktrace resulted in 1.18% of the samples coming from sysprof-cli and eu-stacktrace.
The overhead of the elfutils unwinder scales with the number of distinct processes for which eh_frame data needs to be processed, rather than with the number of samples. After launching several desktop applications and re-running the benchmark, the profiling overhead rose to 1.39% of the total samples.
According to Fedora project discussions around the time frame pointers were being re-enabled in major distributions, slowdown due to frame pointers is reported to fall within the range of 0…2%. More extreme slowdowns have been observed for particular programs such as the Python interpreter, but are not ubiquitous.
It is important to note that, unlike with overhead due to profiling, slowdown due to frame pointers occurs regardless of whether a particular system is being profiled or will ever need to be profiled. Thus, approximately 1% overhead with eu-stacktrace only during profiling is a reasonable tradeoff over 0…2% overhead for frame pointer inclusion on every system, all of the time. The overhead could be further reduced by making eu-stacktrace accessible via a library API rather than a fifo, at the cost of requiring more complex modifications to the profiling tools that use it."
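The tradeoff described in the quote can be put into rough numbers. The sketch below is only back-of-the-envelope arithmetic using the figures quoted above (0.09% and 1.18% of samples, and the 0 to 2% frame pointer slowdown range). The fraction of time a system spends being profiled is an assumption picked for illustration, and treating "share of samples" as equivalent to "percent slowdown" is a simplification, so the function name and the result should not be read as a measurement.

```python
# Back-of-the-envelope comparison of the two approaches.
# Figures come from the quoted blog post; FRACTION_PROFILED is an assumption.

def expected_overhead_pct(always_on_pct, profiling_pct, fraction_profiled):
    """Average overhead: a cost paid all the time plus a cost paid only while profiling."""
    return always_on_pct + profiling_pct * fraction_profiled

FP_SLOWDOWN_RANGE = (0.0, 2.0)  # always-on slowdown from frame pointers, in percent
FP_PROFILER_PCT = 0.09          # profiler share of samples, frame pointer mode
EU_PROFILER_PCT = 1.18          # profiler share of samples, eu-stacktrace mode
FRACTION_PROFILED = 0.05        # hypothetical: system is profiled 5% of the time

fp_low = expected_overhead_pct(FP_SLOWDOWN_RANGE[0], FP_PROFILER_PCT, FRACTION_PROFILED)
fp_high = expected_overhead_pct(FP_SLOWDOWN_RANGE[1], FP_PROFILER_PCT, FRACTION_PROFILED)
eu = expected_overhead_pct(0.0, EU_PROFILER_PCT, FRACTION_PROFILED)

# Frame pointer slowdown at which both approaches cost the same on average.
break_even = (EU_PROFILER_PCT - FP_PROFILER_PCT) * FRACTION_PROFILED

print(f"frame pointers: {fp_low:.4f}% to {fp_high:.4f}% average overhead")
print(f"eu-stacktrace:  {eu:.4f}% average overhead")
print(f"break-even frame pointer slowdown: {break_even:.4f}%")
```

Under these assumptions, eu-stacktrace comes out ahead whenever the always-on frame pointer slowdown exceeds the break-even value, and the less often a system is profiled, the lower that break-even point falls.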
The eu-stacktrace work sounds very interesting, as it could in the future allow Linux distributions to go back to omitting frame pointers as a compiler performance optimization. Right now, though, eu-stacktrace remains experimental and still needs to integrate with more profiling tools beyond Sysprof, along with other improvements.
Those wanting to learn more can do so via [3]the Red Hat Developer blog. The eu-stacktrace code is currently hosted via [4]the eu-stacktrace branch of elfutils.
[1] https://www.phoronix.com/news/F38-fno-omit-frame-pointer
[2] https://www.phoronix.com/news/Ubuntu-Frame-Pointers-Default
[3] https://developers.redhat.com/articles/2024/06/11/get-system-wide-profiles-binaries-without-frame-pointers#implementation
[4] https://sourceware.org/cgit/elfutils/tree/README.eu-stacktrace?h=users/serhei/eu-stacktrace