AMD Engineer Leverages AI To Help Make A Pure-Python AMD GPU User-Space Driver
([AMD] 6 Hours Ago
Pure Python AMD GPU Driver)
- Reference: 0001617328
- News link: https://www.phoronix.com/news/AI-Pure-Python-AMD-GPU-Driver
AMD's VP of AI Software, Anush Elangovan, has used Claude Code to help craft a pure-Python AMD GPU user-space driver. The driver is being written to help exercise other ROCm code and to aid debugging by bypassing the ROCm/HIP user-space stack.
Anush was inspired by Tinygrad's user-space AMD GPU driver implementation and, with Claude AI, has created a user-space driver for stress-testing SDMA and debugging compute/communication overlap. Anush posted on [1]X: "I didn't open the editor once. [AI] Agents are the great equalizer in software. And Speed is the moat."
With further work, the user-space driver has also [2]gained support for compute-bound kernels.
This user-space driver is currently being developed via [3]this GitHub branch. The [4]initial commit describes the first features in place:
"Add pure-Python AMD GPU userspace driver
A standalone Python driver that talks directly to /dev/kfd and /dev/dri/renderD* via ctypes ioctls, bypassing the ROCm/HIP userspace stack. Supports KFD backend with pluggable architecture for future bare-metal PCI (AM) backend.
Features:
- KFD ioctl bindings (queue, memory, events)
- GPU family registry (RDNA2/3/4, CDNA2/3)
- SDMA copy engine with linear copy and fence packets
- PM4 compute packet builder (dispatch, release_mem, etc.)
- Timeline semaphore for GPU-CPU synchronization
- Topology parser for /sys/devices/virtual/kfd/kfd
- ELF code object parser for kernel loading
- 130 tests passing (unit + integration on MI300X/gfx942)
Co-Authored-By: Claude (claude-opus-4-6)"
Over the past two days this pure-Python AMD user-space driver has been extended to include multi-GPU support, compute-bound kernels, and other functionality. It will be interesting to see where work on this Python-based AMD GPU user-space driver leads.
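Multi-GPU support starts with finding the GPUs, which is presumably where the topology parser named in the commit message comes in. Below is a rough sketch of how the sysfs tree under /sys/devices/virtual/kfd/kfd might be read. It rests on several assumptions rather than on the actual driver code: that each node lives in `topology/nodes/<N>/`, that its `properties` file holds one `key value` pair per line, that CPU-only nodes report `simd_count 0`, and that `gfx_target_version` encodes major*10000 + minor*100 + stepping. All function names are hypothetical.

```python
from pathlib import Path

# Assumed location of per-node directories below the path in the commit message.
KFD_TOPOLOGY_NODES = Path("/sys/devices/virtual/kfd/kfd/topology/nodes")


def parse_properties(text: str) -> dict:
    """Parse 'key value' lines into a dict of ints, skipping anything else."""
    props = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            props[parts[0]] = int(parts[1])
    return props


def gfx_name(gfx_target_version: int) -> str:
    """Turn e.g. 90402 into 'gfx942' (minor and stepping printed as hex)."""
    major = gfx_target_version // 10000
    minor = (gfx_target_version // 100) % 100
    stepping = gfx_target_version % 100
    return f"gfx{major}{minor:x}{stepping:x}"


def list_gpu_nodes(root: Path = KFD_TOPOLOGY_NODES) -> list:
    """Return one dict per topology node that looks like a GPU."""
    gpus = []
    if not root.is_dir():
        return gpus
    node_dirs = [p for p in root.iterdir() if p.is_dir() and p.name.isdigit()]
    for node_dir in sorted(node_dirs, key=lambda p: int(p.name)):
        try:
            props = parse_properties((node_dir / "properties").read_text())
        except OSError:
            continue  # node not readable by this user, skip it
        if props.get("simd_count", 0) == 0:
            continue  # assumed to be a CPU-only node
        gpus.append({
            "node": int(node_dir.name),
            "gfx": gfx_name(props.get("gfx_target_version", 0)),
            "simd_count": props["simd_count"],
        })
    return gpus
```

On a system without the amdgpu driver loaded, `list_gpu_nodes()` simply returns an empty list, so the sketch is safe to run anywhere.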
[1] https://x.com/AnushElangovan/status/2029038108273197432
[2] https://x.com/AnushElangovan/status/2029080440980815916
[3] https://github.com/ROCm/TheRock/commits/users/powderluv/userspace-driver/
[4] https://github.com/ROCm/TheRock/commit/ba0c4625751ffa2f7a1d3679421380efa4e13478