News: 0001560247

Nouveau NAK Lands A Big Improvement For NVIDIA Kepler GPUs: As Much As 2.5x Faster

([Nouveau] 5 Hours Ago Nouveau NAK)


Kepler instruction scheduling was merged today into the open-source NVIDIA "NAK" compiler code for Mesa 25.2. This real instruction scheduling support for GeForce GTX 600/700 "Kepler" graphics processors can provide significant performance benefits in select workloads.

Lorenzo Rossi opened the merge request last week for adding this instruction scheduling support to the open-source NAK shader compiler code used by the Nouveau and NVK drivers. Today that code was merged for Mesa 25.2 ahead of the code branching / feature freeze next week.

Rossi commented in the merge request [1]:

"This commit adds real instruction scheduling for SM32 (KeplerB) and adds a minor optimization to the generated instructions.

Dual-issue and functional-unit resource tracking are still to be implemented."

The real kicker for end-users though is the NAK patch adding the real instruction dependencies for Kepler. Rossi noted there:

"This commit ports instruction latency information found in codegen emitter. Previously every instruction was delayed by 16 cycles even if it was not necessary.

PixMark Piano is highly affected by instruction latencies and gets a 2.5x boost, other benchmarks still get better performance. The other two missing pieces to get feature parity with codegen are functional unit resource tracking and instruction dual-issue.

Performance measures on a GT770 (with 0f pstate)

Pixmark piano: 519 -> 1452 pts (has rendering issues in both!)

Furmark: 3247 -> 5786 pts

The talos principle (high settings): 30-33 -> 55-60 FPS"

For relevant workloads at least, this is a very significant performance improvement for Nouveau/NAK on a more-than-decade-old NVIDIA Kepler GPU.
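
To illustrate why replacing a blanket 16-cycle delay with per-instruction latencies can matter so much, here is a toy model in Python. It is only a sketch and not NAK's actual code (NAK lives in Mesa and is written in Rust): the instruction names, the latency values, and the in-order one-issue-per-cycle machine are all assumptions made up for the example, and it ignores dual-issue and functional-unit resource tracking, which the merge request says are still missing.

```python
from dataclasses import dataclass

# The conservative per-instruction delay mentioned in the quoted commit message.
FIXED_DELAY = 16


@dataclass
class Instr:
    name: str
    dst: str        # destination register
    srcs: tuple     # source registers this instruction reads
    latency: int    # HYPOTHETICAL cycles until dst can be read (not real Kepler data)


def total_cycles_fixed(program):
    """Old behaviour in this toy model: every instruction costs FIXED_DELAY cycles."""
    return len(program) * FIXED_DELAY


def total_cycles_latency_aware(program):
    """Issue in order, at most one instruction per cycle, stalling only
    until the source registers of the next instruction are ready."""
    ready_at = {}   # register -> cycle at which its value becomes available
    cycle = 0       # earliest cycle the next instruction may issue
    for ins in program:
        issue = max([cycle] + [ready_at.get(reg, 0) for reg in ins.srcs])
        ready_at[ins.dst] = issue + ins.latency
        cycle = issue + 1
    # The program is done once the last result is available.
    return max(ready_at.values(), default=0)


# A made-up four-instruction shader fragment with invented latencies.
program = [
    Instr("mov", "r0", (), 2),
    Instr("mov", "r1", (), 2),
    Instr("fadd", "r2", ("r0", "r1"), 6),
    Instr("fmul", "r3", ("r2", "r2"), 6),
]

if __name__ == "__main__":
    print("fixed 16-cycle delay:", total_cycles_fixed(program))
    print("latency-aware:       ", total_cycles_latency_aware(program))
```

In this model the fixed-delay schedule pays for 16 cycles after every instruction regardless of dependencies, while the latency-aware schedule waits only as long as each result actually takes. Real gains depend on the shader and on the true hardware latencies, which is consistent with the benchmarks above improving by different amounts.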



[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35821


