News: 0001638424


Linux DRM Ioctl Developed By AMD Being Disabled Following Ongoing Security Issue

([Linux Kernel] 3 Hours Ago DRM Change Handle ioctl)


It's unfortunately another busy week for the [1]Linux 7.1 kernel, with activity not slowing down as it normally would this late in the cycle ahead of the upcoming 7.1 stable release. This week's DRM pull request of kernel graphics/accelerator driver changes is again heavy on fixes and also disables an ioctl interface, merged last year, over ongoing security concerns.

David Airlie of Red Hat wrote in Friday's [2]pull request of DRM fixes for Linux 7.1:

"Weekly drm fixes, not contributing to things settling down unfortunately, lots of driver fixes for various bounds checks, leaks and UAF type things, i915/xe probably the most sane, amdgpu has a mix of fixes all over, then ethosu has lots of small fixes.

The problem of fixing thing in private has really hit us with the change handle ioctl, and "Sima was right" and we should have disabled the ioctl, since it was only introduced a couple of kernels ago and failed to upstream it's tests in time. The patch here fixes the problems Sima identified, but disables the ioctl as well, with a list of known problems in it and a request for proper tests to be written and upstreamed. It's a niche user ioctl designed for CRIU with AMD ROCm, so I think it's fine to just disable it.

Maybe this week will settle down,

Dave."

Besides the surplus of fixes continuing to flow in for Linux 7.1, the other matter is the ioctl drama. It concerns drm_gem_change_handle_ioctl(), a DRM PRIME interface for re-assigning GEM handles. AMD engineers devised the interface last year as part of their Checkpoint and Restore in User-Space ( [3]CRIU ) initiative, since CRIU needs to be able to create or import a buffer object with a specific GEM handle.

AMD did this work to allow freezing a running app/container with ROCm compute workloads and saving its state to be restored later, such as for live migration or snapshotting.
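
To illustrate why a restore needs to re-assign handles, here is a minimal toy model in Python. It is not kernel or CRIU code: the HandleTable class, its import_buffer() and change_handle() methods, and the restore() helper are all hypothetical names invented for this sketch. It only assumes that a plain import hands out the next free handle, so a restored buffer has to be moved to the handle number the application saw before the checkpoint.

```python
class HandleTable:
    """Toy per-process handle table; not the kernel's data structure."""

    def __init__(self):
        self._objects = {}
        self._next = 1

    def import_buffer(self, buf):
        """Install buf under the next free handle, as a plain import would."""
        while self._next in self._objects:
            self._next += 1
        handle = self._next
        self._objects[handle] = buf
        return handle

    def change_handle(self, old, new):
        """Move a buffer from handle `old` to handle `new`."""
        if old not in self._objects:
            raise KeyError(f"no such handle: {old}")
        if new in self._objects:
            raise ValueError(f"handle already in use: {new}")
        self._objects[new] = self._objects.pop(old)

    def lookup(self, handle):
        return self._objects.get(handle)


def restore(saved):
    """Rebuild a table so each buffer sits at its checkpointed handle."""
    table = HandleTable()
    for wanted, buf in saved.items():
        got = table.import_buffer(buf)
        if got != wanted:
            table.change_handle(got, wanted)
    return table


# Handles 5 and 9 are what the application held before the checkpoint.
saved = {5: "weights", 9: "scratch"}
table = restore(saved)
print(table.lookup(5), table.lookup(9), table.lookup(1))
```

The sketch ignores the case where a freshly imported handle collides with a handle that a later buffer still needs, and it has no concurrency at all, which is exactly where the real interface ran into trouble.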

Earlier this year a security vulnerability, [4]CVE-2026-23149, was identified that allows user-space to trigger kernel warnings through this interface. A fix was attempted in January but didn't go as planned.

Following further discussions, held off the mailing list due to the security nature of the issue, Simona Vetter has now attempted to fix the interface a fourth time while ultimately disabling it for the time being. Vetter explained in [5]the patch queued this week for Linux 7.1:

"On-list because the cat is out of the bag and we're clearly not good enough to figure this out in private. The story thus far:

5e28b7b9 ("drm: Set old handle to NULL before prime swap in change_handle") tried to fix a race condition between the gem_close and gem_change_handle ioctls, but got a few things wrong:

- There's a confusion with the local variable handle, which is actually the new handle, and so the two-stage trick was actually applied to the wrong idr slot. 7164d785 ("drm/gem: fix race between change_handle and handle_delete") tried to fix that by adding yet another code block, but forgot to add the error handling. Which meant we now have two paths, both kinda wrong.

- dc366607 ("drm: Replace old pointer to new idr") tried to apply another fix, but inconsistently, again because of the handle confusion

- this would be the right fix (kinda, somewhat, it's a mess) if we'd do the two-stage approach for the new handle. Except that wasn't the intent of the original fix.

We also didn't have an igt merged for the original ioctl, which is a big no-go. This was attempted to address off-list in the original bugfix, and amd QA people claimed the bug was fixed now. Very clearly that's not the case. Here's my attempt to sort this out:

- Rename the local variable to new_handle, the old aliasing with args->handle is just too dangerously confusing.

- Merge the gem obj lookup with the two-stage idr_replace so that we avoid getting ourselves confused there.

- This means we don't have a surplus temporary reference anymore, only an inherited from the idr. A concurrent gem_close on the new_handle could steal that. Fix that with the same two-stage approach create_tail uses. This is a bit overkill as documented in the comment, but I also don't trust my ability to understand this all correctly, so go with the established pattern we have from other ioctls instead for maximum paranoia.

- Adjust error paths. I've tried to make the error and success paths common, because they are identical except for which handle is removed and on which we call idr_replace to (re)install the object again. But that made things messier to read, so I've left it at the more verbose version, which unfortunately hides the symmetry in the entire code flow a bit.

- While at it, also replace the 7 space indent with 1 tab.

And finally, because I flat out don't trust my abilities here at all anymore:

- Disable the ioctl until we have the igt situation and everything else sorted out on-list and with full consensus.

v2:

Sashiko noticed that I didn't handle the error path for idr_replace correctly, it must be checked with IS_ERR_OR_NULL like in gem_handle_delete. So yeah, definitely should just the existing paths 1:1 because this is endless amounts of tricky."
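
The two-stage approach and the error check described in the commit message can be sketched with a toy model in Python. This is not the kernel code: HandleSlots, install() and close() are hypothetical names for this illustration, the dictionary only loosely stands in for the idr, and a KeyError or a None return stands in for the two failure cases that IS_ERR_OR_NULL covers after idr_replace. The idea is that a handle is first reserved with an empty slot and the object is published only once setup is complete, so a close arriving in between has nothing to steal.

```python
import threading


class HandleSlots:
    """Toy stand-in for an idr: maps integer handles to objects or None."""

    def __init__(self):
        self._lock = threading.Lock()
        self._slots = {}

    def alloc(self, handle, obj=None):
        """Claim `handle`; fails if it is already claimed (even with None)."""
        with self._lock:
            if handle in self._slots:
                raise ValueError(f"handle {handle} already in use")
            self._slots[handle] = obj

    def replace(self, handle, obj):
        """Swap the slot contents and return the previous value."""
        with self._lock:
            if handle not in self._slots:
                raise KeyError(f"handle {handle} not allocated")
            old = self._slots[handle]
            self._slots[handle] = obj
            return old

    def remove(self, handle):
        with self._lock:
            self._slots.pop(handle, None)


def install(slots, handle, obj, setup):
    """Two-stage install: reserve with None, set up, then publish."""
    slots.alloc(handle, None)      # stage 1: handle is taken but not visible
    try:
        setup(obj)                 # work that must finish before publishing
    except Exception:
        slots.remove(handle)       # error path: give the reservation back
        raise
    slots.replace(handle, obj)     # stage 2: make the object visible


def close(slots, handle):
    """Delete path: only proceed if a real object was in the slot."""
    try:
        old = slots.replace(handle, None)
    except KeyError:
        return False               # unknown handle
    if old is None:
        return False               # reserved but unpublished: nothing to steal
    slots.remove(handle)
    return True


slots = HandleSlots()
seen = []
# A close that arrives between the two stages finds nothing to release.
install(slots, 7, "buffer", lambda obj: seen.append(close(slots, 7)))
print(seen, close(slots, 7), close(slots, 7))
```

The demo runs the racing close from inside the setup callback to keep it deterministic; a real race would come from another thread, and the model says nothing about the reference counting that the actual patch also has to get right.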

So the DRM GEM change handle ioctl is now disabled for security reasons until proper tests are added and the handling of different values is verified.

At least this shouldn't have much impact beyond those interested in checkpoint/restore for AMD ROCm compute workloads.

This ioctl disabling is also marked for back-porting to stable Linux kernel versions. Hopefully the cleaned-up and tested code will allow this interface to be re-enabled soon.



[1] https://www.phoronix.com/search/Linux+7.1

[2] https://lore.kernel.org/dri-devel/CAPM=9tx-LsD-MVyNdO4uQ-of16HNt2JsbJYQqRCOMUzzH5bVrg@mail.gmail.com/T/#u

[3] https://www.phoronix.com/search/CRIU

[4] https://nvd.nist.gov/vuln/detail/CVE-2026-23149

[5] https://gitlab.freedesktop.org/drm/kernel/-/commit/1a4f03d22fb655e5f192244fb2c87d8066fcfca2
