New AMD Linux Driver Patches Posted For Batch Userptr Allocation Support
([Radeon] 6 Hours Ago, Batch Userptr Allocation)
- Reference: 0001603756
- News link: https://www.phoronix.com/news/AMDKFD-Batch-Userptr-Allocation
- Source link:
A new feature being worked on recently for the AMDKFD kernel compute driver is batch user pointer ("userptr") allocation support. With this new user-space API, it becomes possible to allocate multiple non-contiguous CPU virtual address ranges that map to a single contiguous GPU virtual address range.
Building off a prior patch series, the second iteration of this batch userptr allocation support was posted today. The v2 patches no longer require any changes to the Shared Virtual Memory (SVM) subsystem, improve the Heterogeneous Memory Management (HMM) integration, and take a better overall design approach.
The main focus of this new "AMDKFD_IOC_ALLOC_MEMORY_OF_GPU_BATCH" interface is to help efficiently manage scattered memory buffers by presenting them as a unified GPU address space, such as for workloads with fragmented host memory. This batch userptr allocation approach was found to be much more performant than user-space approaches, which would induce greater system call overhead.
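To make the address arithmetic concrete, here is a minimal Python sketch of the idea described above: several scattered CPU virtual address ranges are placed back to back in one contiguous GPU virtual address window. This is only a user-space model of the layout, not the kernel interface. The names CpuRange, plan_batch_mapping and gpu_to_cpu, the 4 KiB page size, and the page-alignment checks are all assumptions made for illustration and are not taken from the patches.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size for this illustration


@dataclass(frozen=True)
class CpuRange:
    """One userptr range. Hypothetical type, not the kernel's structure."""
    start: int  # CPU virtual address (assumed page aligned)
    size: int   # length in bytes (assumed multiple of PAGE_SIZE)


def plan_batch_mapping(gpu_va_base, ranges):
    """Place each CPU range back to back in one contiguous GPU VA window.

    Returns (placements, total_size), where placements is a list of
    (gpu_va_start, CpuRange) pairs in the order the ranges were given.
    """
    if gpu_va_base % PAGE_SIZE:
        raise ValueError("GPU VA base must be page aligned")
    placements = []
    offset = 0
    for r in ranges:
        if r.size <= 0 or r.start % PAGE_SIZE or r.size % PAGE_SIZE:
            raise ValueError(f"{r} must be a non-empty, page-aligned range")
        placements.append((gpu_va_base + offset, r))
        offset += r.size
    return placements, offset


def gpu_to_cpu(placements, gpu_va):
    """Translate a GPU VA inside the window to the CPU VA it aliases."""
    for gpu_start, r in placements:
        if gpu_start <= gpu_va < gpu_start + r.size:
            return r.start + (gpu_va - gpu_start)
    raise ValueError("GPU address is outside the mapped window")


# Three scattered host buffers (made-up addresses) become one GPU window.
scattered = [
    CpuRange(0x7F0000000000, 2 * PAGE_SIZE),
    CpuRange(0x7F0000100000, 1 * PAGE_SIZE),
    CpuRange(0x7F0000800000, 4 * PAGE_SIZE),
]
window, window_size = plan_batch_mapping(0x100000000, scattered)
for gpu_start, r in window:
    print(f"GPU {gpu_start:#x} <- CPU {r.start:#x} ({r.size} bytes)")
```

In this model the GPU side sees a single window of window_size bytes, while each page still resolves to its original host location. The point of the batch interface, as described above, is that such a layout can be requested in one call rather than assembled from many separate requests.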
On the user-space side, there is already [1]this patch to ROCm's libhsakmt that makes use of this batch userptr API.
Those interested in this latest AMDKFD kernel driver work can find the v2 patches on the [2]mailing list.
[1] https://github.com/ROCm/rocm-systems/commit/ac21716e5d6f68ec524e50eeef10d1d6ad7eae86
[2] https://lore.kernel.org/dri-devel/20260104072122.3045656-1-honglei1.huang@amd.com/#t