NVIDIA Mellanox Linux Driver Spearheads Multi-Path PCI As "A Sign Of Things To Come"
- Reference: 0001493630
- News link: https://www.phoronix.com/news/Linux-6.12-RDMA-Multi-Path
While open-source enthusiasts like to criticize NVIDIA for not maintaining upstream, in-tree kernel graphics driver support (though [1]things have been changing there), in other areas of its vast hardware portfolio the company is a much better upstream Linux kernel citizen and is often at the forefront of new driver innovations. One of the leading examples is the NVIDIA Mellanox networking driver support. With Linux 6.12 they've landed a new feature that has been described as "a sign of things to come, I think we will see more of this in the next 10 years."
Merged to the Linux 6.12 kernel Git source tree today were the RDMA subsystem updates. Besides Broadcom and HiSilicon continuing to work on upstreaming new hardware support, a notable change from [2]the Git merge is the NVIDIA Mellanox "mlx5" driver adding multi-path PCI support.
This multi-path PCI support is for multi-path direct memory access (DMA) with the adapters supported by the MLX5 driver. The feature is explained by the involved NVIDIA engineers as:
"This patch series aims to enable multi-path DMA support, allowing an mlx5 RDMA device to issue DMA commands through multiple paths. This feature is critical for improving performance and reaching line rate in certain environments where issuing PCI transactions over one path may be significantly faster than over another. These differences can arise from various PCI generations in the system or the specific system topology.
To achieve this functionality, we introduced a data direct DMA device that can serve the RDMA device by issuing DMA transactions on its behalf."
So now with Linux 6.12, this feature is in place for the NVIDIA Mellanox MLX5 driver.
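To make the line-rate argument in the quote above concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the patch series and says nothing about how the mlx5 driver actually selects a path. The path names, the 400 Gb/s adapter speed, and the helper functions are all hypothetical. The per-lane transfer rates (8, 16, and 32 GT/s for PCIe 3.0, 4.0, and 5.0) and the 128b/130b encoding are the published figures. Protocol overhead is ignored, so real throughput would be lower.

```python
# Back-of-the-envelope PCIe link bandwidth, per direction.
# Illustrative only: not from the mlx5 patches; all names are hypothetical.

# generation -> (GT/s per lane, line-encoding efficiency)
PCIE_GENERATIONS = {
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}


def link_gbps(generation, lanes):
    """Raw link bandwidth in Gb/s after line encoding, ignoring protocol overhead."""
    rate, efficiency = PCIE_GENERATIONS[generation]
    return rate * efficiency * lanes


def fastest_path(paths):
    """Return the name of the path with the highest raw link bandwidth."""
    return max(paths, key=lambda name: link_gbps(*paths[name]))


# Two hypothetical routes to memory: (PCIe generation, lane count).
paths = {"path_a": (3, 16), "path_b": (5, 16)}
nic_line_rate_gbps = 400.0  # assumed adapter speed, for illustration only

for name, (gen, lanes) in paths.items():
    bw = link_gbps(gen, lanes)
    verdict = "can" if bw >= nic_line_rate_gbps else "cannot"
    print(f"{name}: PCIe {gen}.0 x{lanes} ~ {bw:.0f} Gb/s, "
          f"{verdict} carry {nic_line_rate_gbps:.0f} Gb/s")

print("fastest:", fastest_path(paths))
```

Under these assumptions the Gen3 x16 path tops out near 126 Gb/s while the Gen5 x16 path offers roughly 504 Gb/s, so only the latter could sustain a 400 Gb/s adapter. That is the kind of gap between PCI generations the engineers' description refers to.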
In the [3]RDMA pull request it was described by NVIDIA's Jason Gunthorpe as:
"The new multipath PCI feature is a sign of things to come, I think we will see more of this in the next 10 years."
That makes sense given the increasing complexity and demands of HPC/AI servers, PCIe speeds continuing to rise, and a swirl of other factors, all while trying to run as efficiently and as close to line rate as possible.
[1] https://www.phoronix.com/news/Ben-Skeggs-Joins-NVIDIA
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=54d7e8190ecfe72ff0dab96545e782f7298cb69a
[3] https://lore.kernel.org/lkml/20240923171614.GA53576@nvidia.com/