Open-Source & Rust-Written Burn MATMUL Kernels Can Compete With NVIDIA's CUDA/cuBLAS

([Programming] 5 Hours Ago Burn)

Reference: 0001562268
News link: https://www.phoronix.com/news/Burn-MATMUL-Kernels-CUDA

Tracel AI, the developer of the open-source, Rust-based Burn deep learning framework, has shared that its open-source matrix multiplication kernels can compete with and even outperform NVIDIA's cuBLAS on CUDA. Burn also isn't limited to NVIDIA GPUs: it can work on most hardware/drivers, including through a Vulkan back-end.

On Friday the Burn developers published a lengthy blog post covering their MATMUL kernel performance relative to NVIDIA's cuBLAS/CUTLASS on CUDA, showing some really splendid results for this cross-platform, open-source Rust DL framework.

For those wanting to get straight to the exciting part:

"On CUDA, our Simple algorithm is remarkably fast and stable, nearly always outperforming the cuBLAS/CUTLASS reference. However, the MultiRow variant truly stands out in the end; it is also the top performer across the board on Vulkan."

Some really enticing data. Those wanting to learn more about the Burn MATMUL kernel performance can see [1] the Burn.dev blog post.

I hadn't looked at Burn until a Phoronix reader pointed it out, but I'll be checking out their open-source software for use in some possible future benchmarks, namely [2] burn-bench.

[1] https://burn.dev/blog/sota-multiplatform-matmul/

[2] https://github.com/tracel-ai/burn-bench

