News: 0001518593

Alibaba Engineers Work To Address Suspend/Resume Bugs With The AMD Graphics Driver

([Radeon] 5 Hours Ago Suspend and Resume)


Alibaba engineers have recently been working through some AMD Linux kernel graphics driver bugs uncovered during suspend-and-resume testing with AMD graphics cards.

Alibaba engineers uncovered a number of resource-tracking bugs in the AMDGPU kernel driver's C code, including double buffer frees, use-after-free, and unbalanced IRQ reference counts. Thanks to the driver being open-source, they set about working through the bugs themselves.

[1] This patch series from Alibaba's Jiang Liu aims to enhance the device state machine so that the AMDGPU driver handles suspend/resume cycles without running into the bugs that were uncovered.

"Recently we were testing suspend/resume functionality with AMD GPUs, we have encountered several resource tracking related bugs, such as double buffer free, use after free and unbalanced irq reference count.

We have tried to solve these issues case by case, but found that may not be the right way. Especially about the unbalanced irq reference count, there will be new issues appear once we fixed the current known issues. After analyzing related source code, we found that there may be some fundamental implementation flaws behind these resource tracking issues.

...

So we try to fix those issues by two enhancements/refinements to current device management state machines.

...

Then we try to refine each subsystem, such as nbio, asic etc, to follow the new design. Currently we have only taken the nbio and asic as examples to show the proposed changes. Once we have confirmed that's the right way to go, we will handle the lefting subsystems.

This is in early stage and requesting for comments, any comments and suggestions are welcomed!"
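
To illustrate the kind of problem described in the quote, here is a minimal sketch of how explicit state tracking can keep an interrupt reference count balanced across suspend/resume. It is a small Python model, not the AMDGPU driver's C code and not the design from the patch series (whose details are elided above). The names `State`, `IrqSource`, and `Block` are hypothetical and exist only for this example.

```python
from enum import Enum, auto


class State(Enum):
    DOWN = auto()       # hardware not initialized
    UP = auto()         # hardware initialized, interrupt reference held
    SUSPENDED = auto()  # hardware quiesced, interrupt reference released


class IrqSource:
    """Hypothetical interrupt source guarded by a reference count."""

    def __init__(self):
        self.refcount = 0

    def get(self):
        self.refcount += 1

    def put(self):
        # An unbalanced put is the class of bug the quote mentions.
        if self.refcount == 0:
            raise RuntimeError("unbalanced irq put")
        self.refcount -= 1


class Block:
    """Hypothetical hardware block whose transitions are gated by its state."""

    def __init__(self, irq):
        self.irq = irq
        self.state = State.DOWN

    def hw_init(self):
        if self.state is State.DOWN:
            self.irq.get()
            self.state = State.UP

    def suspend(self):
        # Only a running block holds a reference that can be dropped.
        if self.state is State.UP:
            self.irq.put()
            self.state = State.SUSPENDED

    def resume(self):
        if self.state is State.SUSPENDED:
            self.irq.get()
            self.state = State.UP

    def hw_fini(self):
        # A suspended block has already released its reference.
        if self.state is State.UP:
            self.irq.put()
        self.state = State.DOWN


irq = IrqSource()
block = Block(irq)
block.hw_init()
block.suspend()
block.suspend()  # repeated call is ignored instead of dropping a second reference
block.resume()
block.hw_fini()
block.hw_fini()  # repeated teardown is also a no-op
print(irq.refcount)  # prints 0
```

Because every get and put is tied to a state transition, a repeated or out-of-order call cannot drop a reference that was never taken. That is the general idea behind tracking device state explicitly rather than fixing each imbalance case by case.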

Their open-source contribution is now under review. Hopefully the patches will be worked through so that this enhanced device state machine can be completed, giving the AMD Linux graphics driver better suspend/resume handling.



[1] https://lists.freedesktop.org/archives/amd-gfx/2025-January/118673.html


