NVIDIA's New Linux Patches For GPU Direct RDMA For Device-Private Pages
([Linux Kernel] 6 Hours Ago
P2P DMA For GPU-Centric Apps)
- Reference: 0001509367
- News link: https://www.phoronix.com/news/NVIDIA-Linux-P2P-DMA-RDMA-Priv
- Source link:
NVIDIA engineer Yonatan Maman posted a set of "request for comments" patches this Sunday to implement GPU Direct RDMA "P2P DMA" for device private pages. This is the latest in the effort by multiple vendors to allow more efficient data sharing between GPUs/accelerators and other devices like network adapters.
Getting data to and from GPUs/accelerators and NICs or other devices more efficiently is an ongoing effort being pursued by the large compute vendors, hyperscalers, and others. The RFC patches today are for GPU Direct RDMA handling for device private pages. In this context it's for sharing directly between NVIDIA GPUs and NICs. For the purposes of this testing for upstream, NVIDIA has adapted the open-source Nouveau kernel graphics driver and the Mellanox MLX5 network driver. Support for other drivers could come with time as well, and surely NVIDIA's official (out-of-tree) driver is among their support plans, but for upstream acceptance of the code, Nouveau serves as the upstream open-source user.
Yonatan Maman explained in the [1] RFC patch series:
"This patch series aims to enable Peer-to-Peer (P2P) DMA access in GPU-centric applications that utilize RDMA and private device pages. This enhancement reduces data transfer overhead by allowing the GPU to directly expose device private page data to devices such as NICs, eliminating the need to traverse system RAM, which is the native method for exposing device private page data."
This should be a nice efficiency and latency win. Besides the Nouveau and MLX5 driver changes, some memory management changes are also needed, but all in, this RFC patch series touches fewer than 200 lines of code.
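To make the difference between the two data paths more concrete, here is a toy Python model. It is only a sketch and not kernel code: the `Page` class, the `expose_via_system_ram` and `expose_via_p2p` functions, and the 4096-byte page size are all hypothetical names and values made up for illustration, and none of them come from the patch series. The model just counts how many bytes would have to be staged in system RAM before a NIC could reach the data.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # bytes; an assumed page size, for illustration only


@dataclass
class Page:
    """A toy page that lives either in GPU memory ("vram") or system RAM ("sysram")."""
    index: int
    location: str = "vram"


def expose_via_system_ram(pages):
    """Baseline path: stage each device-private page in system RAM first.

    Returns the number of bytes copied before the NIC can DMA the data.
    """
    copied = 0
    for page in pages:
        if page.location == "vram":
            page.location = "sysram"  # models the data leaving GPU memory
            copied += PAGE_SIZE
    return copied


def expose_via_p2p(pages):
    """P2P path: pages stay in GPU memory and the NIC accesses them directly.

    Returns the number of bytes copied, which is zero in this model because
    nothing is staged in system RAM and page.location is left untouched.
    """
    return 0
```

For a 1 MiB buffer (256 pages of 4096 bytes), `expose_via_system_ram` reports 1048576 bytes copied in this model while `expose_via_p2p` reports 0. A real system has other costs that this sketch ignores, such as PCIe topology and mapping overhead.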
[1] https://lore.kernel.org/dri-devel/20241201103659.420677-1-ymaman@nvidia.com/