News: 0001537902


Linux 6.15 Perf Tooling Introduces New Support For Latency Profiling

([Linux Kernel] 2 Hours Ago perf record --latency)


The perf tools changes were merged today for the Linux 6.15 kernel. The most notable addition this cycle is latency profiling, which leverages kernel scheduler information. The resulting data should help Linux software engineers working to optimize system latency and performance.

The [1]perf tools pull request for Linux 6.15 describes the new "--latency" option for the "perf record" command:

"Introduce latency profiling using scheduler information. The latency profiling is to show impacts on wall-time rather than cpu-time. By tracking context switches, it can weight samples and find which part of the code contributed more to the execution latency.

The value (period) of the sample is weighted by dividing it by the number of parallel execution at the moment. The parallelism is tracked in perf report with sched-switch records. This will reduce the portion that are run in parallel and in turn increase the portion of serial executions.

For now, it's limited to profile processes, IOW system-wide profiling is not supported. You can add --latency option to enable this."
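To make the weighting concrete, here is a small worked example in Python. It is only an illustrative sketch, not perf's code: the `overheads` function, the sample tuples, and the "compile"/"link" workload are all hypothetical. Each sample's period is counted in full for the CPU profile and divided by the parallelism at that moment for the latency profile.

```python
from collections import defaultdict

def overheads(samples):
    """Return (cpu_share, latency_share) per symbol.

    samples: iterable of (symbol, period, parallelism) tuples, where
    parallelism is the number of threads running when the sample was taken.
    All names here are hypothetical; this is not perf's data model.
    """
    cpu = defaultdict(float)
    latency = defaultdict(float)
    for symbol, period, parallelism in samples:
        cpu[symbol] += period                    # CPU-time view: full period
        latency[symbol] += period / parallelism  # wall-time view: weighted period
    cpu_total = sum(cpu.values())
    latency_total = sum(latency.values())
    return (
        {s: v / cpu_total for s, v in cpu.items()},
        {s: v / latency_total for s, v in latency.items()},
    )

# Hypothetical build: 8 threads compile in parallel for one time unit each,
# then a single thread links for one time unit.
samples = [("compile", 1.0, 8)] * 8 + [("link", 1.0, 1)]
cpu_share, latency_share = overheads(samples)
```

In this toy workload the link step accounts for one ninth of the CPU time but half of the weighted (wall-clock) time, which is the shift from parallel to serial portions that the pull request describes.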

Dmitry Vyukov of Google worked on this latency reporting for perf as well as a new parallelism key. In the earlier [2]patch series he elaborated on the purpose of latency profiling:

"There are two notions of time: wall-clock time and CPU time. For a single-threaded program, or a program running on a single-core machine, these notions are the same. However, for a multi-threaded/multi-process program running on a multi-core machine, these notions are significantly different. Each second of wall-clock time we have number-of-cores seconds of CPU time.

Currently perf only allows to profile CPU time. Perf (and all other existing profilers to the best of my knowledge) does not allow to profile wall-clock time.

Optimizing CPU overhead is useful to improve 'throughput', while optimizing wall-clock overhead is useful to improve 'latency'. These profiles are complementary and are not interchangeable. Examples of where latency profile is needed:

- optimizing build latency

- optimizing server request latency

- optimizing ML training/inference latency

- optimizing running time of any command line program

CPU profile is useless for these use cases at best (if a user understands the difference), or misleading at worst (if a user tries to use a wrong profile for a job).

...

Brief outline of the implementation:

- add context switch collection during record

- calculate number of threads running on CPUs (parallelism level) during report

- divide each sample weight by the parallelism level

This effectively models that we were taking 1 sample per unit of wall-clock time.

We still default to the CPU profile, so it's up to users to learn about the second profiling mode and use it when appropriate."
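The three steps in the outline can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual perf implementation: `weight_samples`, the `(timestamp, delta)` switch records, and the sample tuples are hypothetical stand-ins for the context-switch and sample data perf collects.

```python
def weight_samples(switches, samples):
    """Weight each sample's period by the parallelism level at its timestamp.

    switches: list of (timestamp, delta); delta is +1 when a thread of the
              profiled process is switched in and -1 when it is switched out.
    samples:  list of (timestamp, symbol, period).
    Returns a list of (symbol, weighted_period) in timestamp order.
    """
    switches = sorted(switches)
    weighted = []
    running = 0  # threads currently on a CPU (the parallelism level)
    i = 0
    for ts, symbol, period in sorted(samples):
        # Replay every context switch that happened up to this sample.
        while i < len(switches) and switches[i][0] <= ts:
            running += switches[i][1]
            i += 1
        # The sampled thread itself is running, so parallelism is at least 1.
        parallelism = max(running, 1)
        weighted.append((symbol, period / parallelism))
    return weighted

# Hypothetical trace: four threads start at t=0, three are switched out at t=10.
switches = [(0, +1)] * 4 + [(10, -1)] * 3
samples = [(5, "parse", 4.0), (15, "merge", 4.0)]
result = weight_samples(switches, samples)
```

Here the "parse" sample is taken while four threads run, so its period of 4.0 is weighted down to 1.0, while the "merge" sample is taken during serial execution and keeps its full period of 4.0.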

The code is merged for Linux 6.15. [3]This new documentation goes into more detail on CPU and latency overhead reporting in perf. Can't wait to see what improvements Google and others will uncover using perf record --latency.



[1] https://lore.kernel.org/lkml/20250328063228.3824573-1-namhyung@kernel.org/

[2] https://lore.kernel.org/lkml/b6703af9059c22f0ce280993063a0d395d381b33.1738592865.git.dvyukov@google.com/T/

[3] https://github.com/torvalds/linux/blob/802f0d58d52e8e34e08718479475ccdff0caffa0/tools/perf/Documentation/cpu-and-latency-overheads.txt


