News: 0001585101


Valve Developer Contributes Major Improvement To RADV Vulkan For Llama.cpp AI

([AI] 3 Hours Ago RADV Boost For Llama.cpp)


The contributions of Valve's Linux graphics driver team aren't limited to enhancing the rasterization and ray-tracing performance of the open-source Linux GPU drivers for gaming. Beyond that talented group's other interesting contributions over the years in areas like [1]enhancing old GPU hardware support, merged this week for the Radeon Vulkan "RADV" driver is a massive improvement benefiting Llama.cpp AI performance.

Rhys Perry of Valve's Linux graphics team, who specializes in the RADV Radeon Vulkan driver and ACO compiler, has contributed a significant improvement that further enhances the Vulkan back-end performance of Llama.cpp AI inferencing on AMD Radeon hardware.

Opened last week was the merge request [2]radv: use CU mode when LDS is used. That title alone isn't enough to excite end-user interest, and the merge request message was simply:

"This improves performance of llama.cpp."

Okay... but there was no additional context on the performance improvement for Llama.cpp.

Fortunately, Adriano Martins provided some additional insight that ends up making this merge extremely interesting. Adriano commented:

"this makes radv now fly past amdvlk and rocm with llama.cpp, albeit for prompt processing only"

Not only is RADV prompt processing for Llama.cpp now faster than with the official (former) AMDVLK Vulkan driver, but it is also faster than with ROCm.

The results [3] are pretty great: pp512 prompt processing with Llama 7B Q4 goes from around 3586 tokens/s to around 4046 tokens/s with these three patches, or around a 13% improvement, at least for Llama 7B.
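As a quick sanity check of that "around 13%" figure, the relative gain can be worked out from the two approximate throughput numbers quoted above. This is only a minimal sketch: the helper name `percent_improvement` and the variable names are made up for illustration, and the inputs are the rounded tokens/s values from this article rather than raw benchmark output.

```python
def percent_improvement(before: float, after: float) -> float:
    """Relative throughput gain of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Approximate pp512 results for Llama 7B Q4 as quoted in the article.
baseline_tps = 3586.0  # tokens/s before the RADV patches
patched_tps = 4046.0   # tokens/s with the RADV patches

gain = percent_improvement(baseline_tps, patched_tps)
print(f"{gain:.1f}% faster prompt processing")  # prints "12.8% faster prompt processing"
```

That works out to just under 13%, in line with the rounded figure above.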

[4]Llama.cpp with Vulkan was already performing well on Radeon GPUs and now should be even better.

This merge happened just in time to make it into [5]next month's Mesa 25.3 stable release.

Valve's open-source developers/contractors do amazing work for the Linux software ecosystem beyond just gaming, as we've shown many times over the past few years.

New [6]Llama.cpp benchmarks on Phoronix soon.



[1] https://www.phoronix.com/news/Valve-Fixes-2025-Hawaii-GPUs

[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37791

[3] https://www.phoronix.com/image-viewer.php?id=2025&image=radv_faster_llama_lrg

[4] https://www.phoronix.com/review/llama-cpp-windows-linux

[5] https://www.phoronix.com/news/Mesa-26.0-Starts-Development

[6] https://openbenchmarking.org/test/pts/llama-cpp


