AMD's AOMP 19.0-2 Compiler Brings Zero-Copy For CPU-GPU Unified Shared Memory
([AMD] 6 Hours Ago
AOMP 19.0-2)
- Reference: 0001473664
- News link: https://www.phoronix.com/news/AMD-AOMP-19.0-2-Compiler
- Source link:
AMD compiler engineers have released AOMP 19.0-2 as the newest version of their downstream LLVM/Clang compiler that carries all of their latest work around OpenMP/AOCC GPU device offloading to Radeon and Instinct hardware. This updated AOMP compiler adds run-time support for zero-copy with CPU-GPU unified shared memory, along with various other new features for this GPU/accelerator-focused compiler.
The big new feature of AOMP 19.0-2 is "significant" run-time work to support zero-copy for CPU-GPU unified shared memory. Implicit zero-copy works best with OpenMP parallel programming on the AMD Instinct MI300A APUs. It can also be used on the MI200/MI300X and other discrete AMD GPUs by running the application(s) in an XNACK-enabled environment with the "HSA_XNACK=1 OMPX_APU_MAPS=1" environment variables set.
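To make the "implicit" part concrete, here is a minimal sketch of an ordinary OpenMP target region in C. Nothing in it is specific to AOMP 19.0-2: the function names saxpy_offload and run_saxpy_demo are hypothetical, and the map clauses are standard OpenMP. The assumption is that the same unmodified source gets its maps treated as zero-copy when the runtime conditions described above are met, while elsewhere the runtime copies the buffers as usual.

```c
#include <stdio.h>
#include <stdlib.h>

/* Computes y[i] = a * x[i] + y[i] inside an OpenMP target region.
 * The map clauses only describe which data the device needs. Whether the
 * runtime copies the buffers or uses them in place (zero-copy) is decided
 * at run time, not by this source code. Without an offload-capable
 * compiler or device, the loop simply runs on the host. */
void saxpy_offload(size_t n, float a, const float *x, float *y)
{
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical self-check helper (not part of AOMP): fills two host
 * buffers, runs the kernel, and returns 0 if every result is correct,
 * 1 if any element is wrong, or -1 if allocation failed. */
int run_saxpy_demo(size_t n)
{
    float *x = malloc(n * sizeof *x);
    float *y = malloc(n * sizeof *y);
    if (x == NULL || y == NULL) {
        free(x);
        free(y);
        return -1;
    }

    for (size_t i = 0; i < n; ++i) {
        x[i] = 1.0f;
        y[i] = 2.0f;
    }

    saxpy_offload(n, 3.0f, x, y);

    /* 3 * 1 + 2 is exactly representable in float, so compare directly. */
    int bad = 0;
    for (size_t i = 0; i < n; ++i) {
        if (y[i] != 5.0f) {
            bad = 1;
        }
    }

    free(x);
    free(y);
    return bad;
}
```

Calling run_saxpy_demo(1024) from a main() and launching the resulting binary with HSA_XNACK=1 OMPX_APU_MAPS=1 set in the environment would be one way to try the discrete-GPU path described above. The exact compiler flags depend on the target GPU and are not covered by the release announcement.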
AOMP 19.0-2 also re-bases against the latest LLVM 19 Git codebase, builds from the ROCm 6.1.2 sources, and has "significant" improvements to its gpurun utility. The gpurun helper CLI program now supports multiple accelerators/GPUs, heterogeneous devices, and other features. AOMP 19.0-2 is also now capable of handling FP16 and BFloat16 reductions.
AOMP 19.0-2 is available as source code and as pre-built binaries for Ubuntu, RHEL, and SUSE Linux Enterprise. More details are on [1]GitHub.
[1] https://github.com/ROCm/aomp/releases/tag/rel_19.0-2