New Proposal To Raise The Linux Kernel's Default Timer Frequency To 1000Hz
- Reference: 0001525488
- News link: https://www.phoronix.com/news/Linux-2025-Proposal-1000Hz
A patch sent out on Sunday by Google engineer Qais Yousef proposes raising the Linux kernel's default timer frequency from 250Hz to 1000Hz.
Yousef argues that the current default can hurt scheduler decisions through imprecise time slices, delayed load balancing, delayed stats updates, and related complications, and that the kernel would be better off with a 1000Hz default:
"On Android and Desktop systems etc 120Hz is a common screen configuration. This gives tasks 8ms deadline to do their work. 4ms is half this time which makes the burden on making very correct decision at wake up stressed more than necessary. And it makes utilizing the system effectively to maintain best perf/watt harder. As an example tries to fix our definition of DVFS headroom to be a function of TICK as it defines our worst case scenario of updating stats. The larger TICK means we have to be overly aggressive in going into higher frequencies if we want to ensure perf is not impacted. But if the task didn't consume all of its slice, we lost an opportunity to use a lower frequency and save power. Lower TICK value allows us to be smarter about our resource allocation to balance perf and power.
Generally workloads working with ever smaller deadlines is not unique to UI pipeline. Everything is expected to finish work sooner and be more responsive.
I believe HZ_250 was the default as a trade-off for battery power devices that might not be happy with frequent TICKS potentially draining the battery unnecessarily. But to my understanding the current state of NOHZ should be good enough to alleviate these concerns. And recent addition of RCU_LAZY further helps with keeping TICK quite in idle scenarios.
As pointed out to me by Saravana though, the longer TICK did indirectly help with timer coalescing which means it could hide issues with drivers/tasks asking for frequent timers preventing entry to deeper idle states (4ms is a high value to allow entry to deeper idle state for many systems). But one can argue this is a problem with these drivers/tasks. And if the coalescing behavior is desired we can make it intentional rather than accidental.
The faster TICK might still result in higher power, but not due to TICK activities. The system is more responsive (as intended) and it is expected the residencies in higher freqs would be higher as they were accidentally being stuck at lower freqs. The series in [1] attempts to improve scheduler handling of responsiveness and give users/apps a way to better provide their needs, including opting out of getting adequate response (rampup_multiplier being 0 in the mentioned series)."
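To put the numbers from the quote side by side, here is a minimal sketch of the arithmetic: the tick period is 1000/HZ milliseconds, and a 120Hz display leaves roughly 8.3ms per frame (the quote rounds this to 8ms). The function names are hypothetical and chosen only for illustration; nothing here comes from the patch itself.

```python
def tick_period_ms(hz: int) -> float:
    """Length of one timer tick in milliseconds for a given HZ value."""
    return 1000.0 / hz


def frame_budget_ms(refresh_hz: int) -> float:
    """Time available per frame in milliseconds at a given display refresh rate."""
    return 1000.0 / refresh_hz


def ticks_per_frame(hz: int, refresh_hz: int) -> float:
    """How many timer ticks fit inside one frame budget."""
    # (1000 / refresh_hz) / (1000 / hz) simplifies to hz / refresh_hz.
    return hz / refresh_hz


if __name__ == "__main__":
    refresh = 120  # display refresh rate mentioned in the quote
    print(f"frame budget at {refresh}Hz: {frame_budget_ms(refresh):.2f} ms")
    for hz in (250, 1000):
        print(
            f"HZ={hz}: tick period {tick_period_ms(hz):.1f} ms, "
            f"{ticks_per_frame(hz, refresh):.2f} ticks per frame"
        )
```

With HZ=250 only about two ticks fall inside a single frame, so one late stats update or wake-up decision consumes about half of the budget. With HZ=1000 the same frame spans roughly eight ticks, which is the finer granularity the patch is arguing for.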
The Linux kernel timer frequency has long been a source of debate and differing opinions. It does seem logical, though, that by now the kernel would default to 1000Hz rather than 250Hz.
The patch changing the default is now up for review/discussion [1].
[1] https://lore.kernel.org/lkml/20250210001915.123424-1-qyousef@layalina.io/