Linux 6.15 Shipped With A Nasty Power Regression For Some Systems
([Linux Kernel] 4 Hours Ago, Non-SMT)
- Reference: 0001550472
- News link: https://www.phoronix.com/news/Linux-6.15-nosmt-Power-Regress
The Linux 6.15 kernel, released as stable last week, shipped with a nasty CPU power regression on some systems. The issue is now fixed in Linux 6.16 Git and the fix should arrive shortly in a Linux 6.15 point release.
In addition to some [1]earlier performance regressions in Linux 6.15 that were [2]ultimately fixed in time for the stable release, a late change in the Linux 6.15 cycle significantly increased CPU power consumption on affected systems. The regression is present in Linux v6.15 but should be fixed in time for the next Linux 6.15 point release.
Intel engineer and power management subsystem maintainer Rafael Wysocki is the one who tackled this regression. He explained in [3]the revert that landed in Linux 6.16 Git at the end of the week:
"Revert commit 96040f7273e2 ("x86/smp: Eliminate mwait_play_dead_cpuid_hint()") because it introduced a significant power regression on systems that start with "nosmt" in the kernel command line.
Namely, on such systems, SMT siblings permanently go offline early, when cpuidle has not been initialized yet, so after the above commit, hlt_play_dead() is called for them. Later on, when the processor attempts to enter a deep package C-state, including PC10 which is requisite for reaching minimum power in suspend-to-idle, it is not able to do that because of the SMT siblings staying in C1 (which they have been put into by HLT).
As a result, the idle power (including power in suspend-to-idle) rises quite dramatically on those systems with all of the possible consequences, which (needless to say) may not be expected by their users.
This issue is hard to debug and potentially dangerous, so it needs to be addressed as soon as possible in a way that will work for 6.15.y, hence the revert.
Of course, after this revert, the issue that commit 96040f7273e2 attempted to address will be back and it will need to be fixed again later."
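As a rough illustration of the conditions described in the revert message, the following Python sketch checks whether a system was booted with "nosmt" on a 6.15-series kernel. The function names are hypothetical and the check is only a heuristic: it does not know which 6.15.y point release carries the fix, and it cannot tell whether the hardware actually relies on deep package C-states.

```python
import platform
import re
from pathlib import Path


def has_nosmt(cmdline: str) -> bool:
    """Return True if the kernel command line contains nosmt (bare or nosmt=...)."""
    return any(tok == "nosmt" or tok.startswith("nosmt=") for tok in cmdline.split())


def is_6_15_series(release: str) -> bool:
    """Return True if a kernel release string such as '6.15.0-generic' is 6.15.x."""
    m = re.match(r"(\d+)\.(\d+)", release)
    return bool(m) and (int(m.group(1)), int(m.group(2))) == (6, 15)


def possibly_affected(cmdline: str, release: str) -> bool:
    """Heuristic only: nosmt boot on a 6.15-series kernel (fixed point release unknown here)."""
    return has_nosmt(cmdline) and is_6_15_series(release)


if __name__ == "__main__":
    try:
        # /proc/cmdline holds the command line the running kernel was booted with.
        cmdline = Path("/proc/cmdline").read_text()
    except OSError:
        cmdline = ""
    release = platform.release()
    print(f"kernel release: {release}")
    print(f"nosmt on command line: {has_nosmt(cmdline)}")
    print(f"possibly affected: {possibly_affected(cmdline, release)}")
```

A positive result only means the boot configuration matches the scenario in the revert message; comparing idle or suspend-to-idle power against an earlier kernel would be needed to confirm the regression on a given machine.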
The regression was introduced by [4]x86/smp: Eliminate mwait_play_dead_cpuid_hint(), which was merged during the Linux 6.15 development cycle. That original patch was intended to address a C-state issue affecting at least Intel Xeon 6 "Sierra Forest" processors.
In the [5]power management pull request landing the revert, Rafael characterized it as "a nasty power regression on some systems." That pull request also added new Rust abstractions for CPUFreq, OPP, CLK, and Cpumasks, allowing more Linux power management code to be written in the Rust programming language moving forward.
[1] https://www.phoronix.com/review/linux-615-nginx-regression
[2] https://www.phoronix.com/review/linux-615-regression-fix
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=70523f335734b0b42f97647556d331edf684c7dc
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=96040f7273e2bc0be1871ad9ed4da7b504da9410
[5] https://lore.kernel.org/linux-pm/CAJZ5v0g5C_Zk5-PxsO+W-ef=1oDgbb-PCMYq8UmE9uPi9bASvg@mail.gmail.com/
Guiorgy