AMD Linux Engineers Introduce New "schedstat" Tool
([AMD] 5 Hours Ago
perf schedstat)
- Reference: 0001462918
- News link: https://www.phoronix.com/news/AMD-Linux-perf-schedstat-Tool
- Source link:
AMD Linux engineers have introduced a new perf tool called "schedstat" that aims to be less resource-intensive and more convenient than the existing "perf sched" tool for profiling kernel scheduler behavior.
The schedstat tool reports the elapsed time in jiffies along with other CPU scheduling statistics. It is intended to be lighter-weight than perf sched and is aimed at developers and others profiling Linux kernel scheduler changes.
AMD engineer Ravi Bangoria said of the schedstat tool:
"Existing `perf sched` is quite exhaustive and provides lot of insights into scheduler behavior but it quickly becomes impractical to use for long running or scheduler intensive workload. For ex, `perf sched record` has ~7.77% overhead on hackbench (with 25 groups each running 700K loops on a 2-socket 128 Cores 256 Threads 3rd Generation EPYC Server), and it generates huge 56G perf.data for which perf takes ~137 mins to prepare and write it to disk.
Unlike `perf sched record`, which hooks onto set of scheduler tracepoints and generates samples on a tracepoint hit, `perf sched schedstat record` takes snapshot of the /proc/schedstat file before and after the workload, i.e. there is zero interference on workload run. Also, it takes very minimal time to parse /proc/schedstat, convert it into perf samples and save those samples into perf.data file. Result perf.data file is much smaller. So, overall `perf sched schedstat record` is much more light-weight compare to `perf sched record`.
We, internally at AMD, have been using this (a variant of this, known as "sched-scoreboard") and found it to be very useful to analyse impact of any scheduler code changes.
Please note that, this is not a replacement of perf sched record/report. The intended users of the new tool are scheduler developers, not regular users."
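To illustrate the before/after snapshot idea described in the quote, here is a minimal Python sketch. It is not the code from the patch series: the helper names (`parse_schedstat`, `diff_snapshots`, `record_workload`) are hypothetical, and it assumes only that /proc/schedstat contains a `timestamp` line in jiffies and one `cpuN` line of integer counters per CPU. Domain lines are ignored, and the counters are left unlabeled because their meaning depends on the schedstat version. The file is typically only present, and only populated, when the kernel's scheduler statistics are enabled.

```python
import subprocess
from pathlib import Path

SCHEDSTAT = Path("/proc/schedstat")


def parse_schedstat(text):
    """Return (timestamp_in_jiffies, {cpu_name: [counters...]}) from schedstat text."""
    timestamp = None
    cpus = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "timestamp":
            timestamp = int(fields[1])
        elif fields[0].startswith("cpu"):
            cpus[fields[0]] = [int(f) for f in fields[1:]]
        # "version" and "domainN" lines are skipped in this sketch.
    return timestamp, cpus


def diff_snapshots(before, after):
    """Return (elapsed_jiffies, {cpu_name: [counter deltas...]})."""
    ts_before, cpus_before = before
    ts_after, cpus_after = after
    deltas = {
        cpu: [new - old for old, new in zip(cpus_before[cpu], cpus_after[cpu])]
        for cpu in cpus_after
        if cpu in cpus_before
    }
    return ts_after - ts_before, deltas


def record_workload(cmd):
    """Snapshot schedstat, run the workload untouched, snapshot again, and diff."""
    before = parse_schedstat(SCHEDSTAT.read_text())
    subprocess.run(cmd, check=False)
    after = parse_schedstat(SCHEDSTAT.read_text())
    return diff_snapshots(before, after)


# Example use on a system that exposes /proc/schedstat:
#   elapsed, deltas = record_workload(["sleep", "1"])
#   print("elapsed jiffies:", elapsed)
#   for cpu, counters in deltas.items():
#       print(cpu, counters)
```

Because the file is read only twice, once before and once after the command runs, nothing in this approach executes while the workload itself is running, which is the property the quote contrasts with tracepoint-based recording.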
Schedstat sounds like a good complement to the existing "perf sched" functionality and a nice evolution of AMD's earlier sched-scoreboard tool.
AMD's schedstat tool was presented today on the Linux kernel mailing list in [1]this patch series, which seeks comments from other developers on their interest in the tool and on upstreaming the code into the perf tools within the Linux kernel source tree.
[1] https://lore.kernel.org/lkml/20240508060427.417-1-ravi.bangoria@amd.com/