Linux Addressing Out-Of-Memory Killer Inaccuracy On Large Core Count Systems

A patch addressing out-of-memory (OOM) killer inaccuracy on systems with large core counts is on its way to the Linux kernel and looks like it could be ready for the 6.20~7.0 kernel.

The patch, by Linux developer Mathieu Desnoyers, made it into Andrew Morton's "mm-everything" queue this week.

In early 2025 it was [1]reported that there were inaccuracies in the OOM killer when dealing with today's high core count systems, at least in the 250+ core/thread count range:

"Recently, several internal services had an RSS usage regression as part of a kernel upgrade. Previously, they were on a pre-6.2 kernel and were able to read RSS statistics in a backup watchdog process to monitor and decide if they'd overrun their memory budget. Now, however, a representative service with five threads, expected to use about a hundred MB of memory, on a 250-cpu machine had memory usage tens of megabytes different from the expected amount -- this constituted a significant percentage of inaccuracy, causing the watchdog to act.

...

This is a really tremendous inaccuracy for any few-threaded program on a large machine and impedes monitoring significantly. These stat counters are also used to make OOM killing decisions, so this additional inaccuracy could make a big difference in OOM situations -- either resulting in the wrong process being killed, or in less memory being returned from an OOM-kill than expected.

Finally, while the change to percpu_counter does significantly improve the accuracy over the previous per-thread error for many-threaded services, it does also have performance implications - up to 12% slower for short-lived processes and 9% increased system time in make test workloads."
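
The watchdog described in the report reads a service's RSS and compares it against a memory budget. A minimal sketch of that kind of check follows. It assumes the watchdog reads the VmRSS field of /proc/<pid>/status (the report does not say which interface was used), and the function names, budget and tolerance values are hypothetical:

```python
import re
from typing import Optional


def parse_vmrss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    # The field looks like "VmRSS:\t  102400 kB"; kernel threads have no such line.
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid: int) -> Optional[int]:
    """Read the current RSS of a process in KiB (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vmrss_kib(status_file.read())


def over_budget(rss_kib: int, budget_mib: int, tolerance_mib: int = 0) -> bool:
    """True if the reported RSS exceeds the budget plus an allowed tolerance.

    tolerance_mib is a hypothetical knob for absorbing counter inaccuracy.
    """
    return rss_kib > (budget_mib + tolerance_mib) * 1024


if __name__ == "__main__":
    import os

    rss = read_rss_kib(os.getpid())
    if rss is not None:
        print("RSS (KiB):", rss, "over budget:", over_budget(rss, budget_mib=120))
```

With a budget of around a hundred MB, a reported value that is off by tens of megabytes can flip the result of over_budget() even though the service's real memory use has not changed, which is the failure mode the report describes.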

[2]This patch is working its way toward the mainline kernel, hopefully in time for the upcoming Linux 6.20~7.0 cycle, and should address those inaccuracies.



[1] https://lore.kernel.org/lkml/20250331223516.7810-2-sweettea-kernel@dorminy.me/

[2] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything&id=f85306789224f6862ba8bfc5e046a318a9fd58f7


