AMD Upstreams Efficient Malloc Support On GPUs For LLVM libc
[LLVM] 3 Hours Ago
- Reference: 0001550932
- News link: https://www.phoronix.com/news/LLVM-libc-GPU-malloc-Upstream
- Source link:
AMD compiler engineer Joseph Huber is the one who [1]ported DOOM to run on GPUs atop ROCm + LLVM libc as part of [2]taking standard C/C++ code to run on GPUs, and more recently he has been pursuing [3]Flang/Fortran support atop GPUs. The latest step in this ongoing quest is efficient malloc support for memory allocation on GPUs via the LLVM libc library.
Joseph Huber has upstreamed efficient GPU malloc support into LLVM libc. He explained in [4]the commit that landed a few days ago:
"This is the big patch that implements an efficient device-side `malloc` on the GPU. This is the first pass and many improvements will be made later.
The scheme revolves around using a global reference counted pointer to hand out access to a dynamically created and destroyed slab interface. The slab is simply a large bitfield with one bit for each slab. All allocations are the same size in a slab, so different sized allocations are done through different slabs.
Allocation is thus searching for or creating a slab for the desired slab, reserving space, and then searching for a free bit. Freeing is clearing the bit and then releasing the space.
This interface allows memory to dynamically grow and shrink. Future patches will have different modes to allow fast first-time-use as well as a non-RPC version."
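To make the scheme in the commit message more concrete, here is a minimal single-threaded, host-side C++ sketch of a slab allocator with a bitfield per slab. It is not the LLVM libc code: every name (`Slab`, `SlabAllocator`, `SLAB_BYTES`), the 4096-byte slab size, and the power-of-two size classes are assumptions made for illustration. It also reads "one bit for each slab" as one bit per fixed-size chunk inside a slab, and it stands in a plain `used` counter for the global reference counted pointer. The real patch targets thousands of concurrent GPU threads, so it relies on atomics and RPC, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// All names and sizes here are hypothetical, chosen for illustration only.
constexpr std::size_t SLAB_BYTES = 4096;

// One slab serves a single chunk size; bit i of the bitfield marks chunk i as used.
struct Slab {
  std::size_t chunk_size;
  std::size_t num_chunks;
  std::size_t used = 0; // stand-in for the reference count in the real scheme
  std::vector<std::uint64_t> bits;
  std::unique_ptr<unsigned char[]> storage;

  explicit Slab(std::size_t size)
      : chunk_size(size), num_chunks(SLAB_BYTES / size),
        bits((num_chunks + 63) / 64, 0),
        storage(new unsigned char[SLAB_BYTES]) {}

  // Search for a free bit, set it, and return the matching chunk (nullptr if full).
  void *allocate() {
    for (std::size_t i = 0; i < num_chunks; ++i) {
      std::uint64_t mask = std::uint64_t{1} << (i % 64);
      if (!(bits[i / 64] & mask)) {
        bits[i / 64] |= mask;
        ++used;
        return storage.get() + i * chunk_size;
      }
    }
    return nullptr;
  }

  // True if p points into this slab's storage.
  bool owns(const void *p) const {
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    auto base = reinterpret_cast<std::uintptr_t>(storage.get());
    return addr >= base && addr < base + SLAB_BYTES;
  }

  // Freeing is clearing the bit and then releasing the space.
  void release(void *p) {
    auto offset = static_cast<std::size_t>(static_cast<unsigned char *>(p) - storage.get());
    std::size_t i = offset / chunk_size;
    std::uint64_t mask = std::uint64_t{1} << (i % 64);
    assert((bits[i / 64] & mask) && "double free");
    bits[i / 64] &= ~mask;
    --used;
  }
};

class SlabAllocator {
  // Slabs grouped by chunk size, so different sizes use different slabs.
  std::map<std::size_t, std::vector<std::unique_ptr<Slab>>> slabs_;

  // Round a request up to a power-of-two size class (assumed policy).
  static std::size_t round_up(std::size_t n) {
    std::size_t size = 16;
    while (size < n) size *= 2;
    return size;
  }

public:
  // Search for a slab of the right size with room, or create one.
  void *allocate(std::size_t n) {
    if (n == 0 || n > SLAB_BYTES) return nullptr;
    std::size_t size = round_up(n);
    auto &list = slabs_[size];
    for (auto &slab : list)
      if (void *p = slab->allocate()) return p;
    list.push_back(std::make_unique<Slab>(size));
    return list.back()->allocate();
  }

  // Clear the chunk's bit; destroy the slab once it is empty so memory shrinks.
  void deallocate(void *p) {
    for (auto &entry : slabs_) {
      auto &list = entry.second;
      for (std::size_t i = 0; i < list.size(); ++i) {
        if (!list[i]->owns(p)) continue;
        list[i]->release(p);
        if (list[i]->used == 0) list.erase(list.begin() + i);
        return;
      }
    }
  }

  // Number of live slabs across all size classes.
  std::size_t slab_count() const {
    std::size_t n = 0;
    for (const auto &entry : slabs_) n += entry.second.size();
    return n;
  }
};
```

In this sketch, two 24-byte requests land in the same 32-byte slab, a 100-byte request creates a second slab, and freeing the last chunk of a slab destroys it. That mirrors the "dynamically grow and shrink" behavior the commit message describes, though the linear bit search here is far simpler than what a GPU implementation would need.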
Nice seeing all the upstreaming work that AMD is carrying out and continued progress toward allowing more unmodified code to run on GPUs.
These latest commits will be part of the LLVM 21 release around September.
[1] https://www.phoronix.com/news/DOOM-ROCm-LLVM-Port
[2] https://www.phoronix.com/news/AMD-Standard-C-Code-GPUs
[3] https://www.phoronix.com/news/AMD-Flang-RT-Build-On-GPU
[4] https://github.com/llvm/llvm-project/commit/b4bc8c6f83e3