News: 0001502138

Sched_ext Scheduler Idle Selection Being Extended For LLC & NUMA Awareness

([Linux Kernel] 3 Hours Ago sched_ext NUMA Awareness)


While the [1]sched_ext extensible scheduler code was merged for Linux 6.12, work on sched_ext itself is not over. New patches this weekend continue the work on NUMA awareness for its default idle selection policy, while similar work on CPU last level cache (LLC) awareness is slated for the upcoming Linux 6.13 cycle.

Queued last week within [2]sched_ext.git's "for-6.13" branch is [3]a patch that adds LLC awareness to the default idle selection policy by leveraging the Linux kernel's scheduler topology information.

"This allows schedulers using the built-in policy to make more informed decisions when selecting an idle CPU in systems with multiple LLCs, such as NUMA systems or chiplet-based architectures, and it helps keep tasks within the same LLC domain, thereby improving cache locality.

For efficiency, LLC awareness is applied only to tasks that can run on all the CPUs in the system for now. If a task's affinity is modified from user space, it's the responsibility of user space to choose the appropriate optimized scheduling domain."
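As a rough illustration of the restriction described in that quote, the check can be pictured as a gate on the task's affinity mask. This is only a sketch in Python rather than kernel C; the function name `llc_search_mask` and its parameters are hypothetical, and it assumes "can run on all the CPUs" means the allowed set covers every online CPU.

```python
def llc_search_mask(allowed, online, llc_of, prev_cpu):
    """Return the CPUs sharing prev_cpu's LLC, or None when LLC awareness is skipped.

    allowed  -- set of CPU ids the task may run on (its affinity)
    online   -- set of all CPU ids in the system
    llc_of   -- dict mapping CPU id -> LLC domain id (hypothetical topology map)
    prev_cpu -- CPU id the task last ran on
    """
    # Assumption: a task with a restricted affinity gets no LLC-aware
    # treatment; user space is expected to pick the domain itself.
    if not online <= allowed:
        return None
    # Otherwise prefer CPUs in the same LLC domain as the previous CPU.
    return {cpu for cpu in online if llc_of[cpu] == llc_of[prev_cpu]}
```

A task allowed everywhere gets its previous LLC domain as the preferred search set, while a task pinned to a subset of CPUs gets `None` and falls back to a plain search.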

That LLC awareness for sched_ext will in turn arrive with Linux 6.13. Andrea Righi of NVIDIA wrote that support.

Andrea Righi has also been working on adding NUMA awareness to the default idle selection code. That code is still undergoing review, but the latest version was [4]posted Sunday to the Linux kernel mailing list. It extends the built-in idle CPU selection policy to prioritize CPUs within the same NUMA node. Righi explains in that patch:

"With this change applied, the built-in CPU idle selection policy follows this logic:

- always prioritize CPUs from fully idle SMT cores,

- select the same CPU if possible,

- select a CPU within the same LLC domain,

- select a CPU within the same NUMA node.

Both NUMA and LLC awareness features are enabled only when the system has multiple NUMA nodes or multiple LLC domains.

In the future, we may want to improve the NUMA node selection to account the node distance from prev_cpu. Currently, the logic only tries to keep tasks running on the same NUMA node. If all CPUs within a node are busy, the next NUMA node is chosen randomly."
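One way to read the priority order quoted above is as two passes over the idle CPUs: first only CPUs on fully idle SMT cores, then any idle CPU, each time trying the previous CPU, then its LLC domain, then its NUMA node, then anything else. The Python sketch below models that reading; it is not the kernel implementation. The `Cpu` type, `TOPOLOGY` layout and `pick_idle_cpu` function are all hypothetical, and it returns the lowest-numbered match so results are reproducible, whereas the patch describes the fallback node as chosen randomly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    id: int
    core: int  # SMT core the CPU belongs to
    llc: int   # last level cache domain
    node: int  # NUMA node


# Hypothetical machine: 16 CPUs, 2 SMT threads per core,
# 2 cores per LLC domain, 2 LLC domains per NUMA node.
TOPOLOGY = [Cpu(i, i // 2, i // 4, i // 8) for i in range(16)]


def pick_idle_cpu(cpus, idle, prev_id):
    """Return the id of an idle CPU for a task that last ran on prev_id, or None."""
    prev = next(c for c in cpus if c.id == prev_id)

    # A core is "fully idle" when every SMT sibling on it is idle.
    def core_fully_idle(cpu):
        return all(s.id in idle for s in cpus if s.core == cpu.core)

    idle_cpus = [c for c in cpus if c.id in idle]
    full_core_cpus = [c for c in idle_cpus if core_fully_idle(c)]

    # Locality preferences, most specific first.
    preferences = (
        lambda c: c.id == prev.id,      # same CPU
        lambda c: c.llc == prev.llc,    # same LLC domain
        lambda c: c.node == prev.node,  # same NUMA node
        lambda c: True,                 # anywhere (simplified fallback)
    )

    # Pass 1: CPUs on fully idle cores. Pass 2: any idle CPU.
    for pool in (full_core_cpus, idle_cpus):
        for matches in preferences:
            hits = sorted(c.id for c in pool if matches(c))
            if hits:
                return hits[0]
    return None
```

Under this reading, a fully idle core in the same LLC domain wins over the previous CPU when the previous CPU's SMT sibling is busy: with CPUs 0, 2 and 3 idle and `prev_id=0`, the sketch picks CPU 2.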

We'll see if that NUMA awareness is ready in time for the upcoming Linux 6.13 merge window to join the LLC awareness support. In any event, there continues to be a lot of interesting development and adoption around sched_ext now that it's mainlined.



[1] https://www.phoronix.com/search/sched_ext

[2] https://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git/log/?h=for-6.13

[3] https://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git/commit/?h=for-6.13&id=dfa4ed29b18c5f26cd311b0da7f049dbb2a2b33b

[4] https://lore.kernel.org/lkml/20241027174953.49655-1-arighi@nvidia.com/


