News: 0001562268

  ARM Give a man a fire and he's warm for a day, but set fire to him and he's warm for the rest of his life (Terry Pratchett, Jingo)

Open-Source & Rust-Written Burn MATMUL Kernels Can Compete With NVIDIA's CUDA/cuBLAS

([Programming] 5 Hours Ago Burn)


The developers of Burn, the open-source, Rust-based deep learning framework from Tracel AI, shared that their open-source matrix multiplication kernels can compete with and even outperform NVIDIA's CUDA cuBLAS. Plus, Burn isn't limited to NVIDIA GPUs: it can work on most hardware/drivers, including through a Vulkan back-end.

On Friday the Burn developers published a lengthy blog post going over their exciting MATMUL kernel performance relative to NVIDIA CUDA cuBLAS/CUTLASS and showing some really splendid results for this cross-platform, Rust open-source DL framework.

For those wanting to get straight to the exciting part:

"On CUDA, our Simple algorithm is remarkably fast and stable, nearly always outperforming the cuBLAS/CUTLASS reference. However, the MultiRow variant truly stands out in the end; it is also the top performer across the board on Vulkan."

Some really enticing data. Those wanting to learn more about the Burn MATMUL kernel performance can see [1]the Burn.dev blog post.
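
For readers who want a concrete picture of the operation being benchmarked, here is a minimal plain-Rust sketch of a reference matrix multiplication. It is not Burn's Simple or MultiRow kernel and not cuBLAS; the function names matmul_naive and matmul_flops are hypothetical and used only for illustration. The second helper counts floating-point operations, which is one common way to turn a measured runtime into a throughput figure; whether the Burn post reports its results that way is not stated here.

```rust
/// Naive reference matrix multiplication: C = A * B.
/// `a` is m x k and `b` is k x n, both stored row-major.
/// Returns the m x n result, also row-major.
/// Hypothetical helper for illustration; not part of Burn or cuBLAS.
fn matmul_naive(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k, "a must hold m*k elements");
    assert_eq!(b.len(), k * n, "b must hold k*n elements");

    let mut c = vec![0.0f32; m * n];
    for i in 0..m {
        for p in 0..k {
            // Hoist A[i][p] so the inner loop walks B and C contiguously.
            let a_ip = a[i * k + p];
            for j in 0..n {
                c[i * n + j] += a_ip * b[p * n + j];
            }
        }
    }
    c
}

/// Floating-point operations in an (m x k) by (k x n) product:
/// one multiply and one add per term, so 2 * m * k * n in total.
fn matmul_flops(m: usize, k: usize, n: usize) -> u64 {
    2 * (m as u64) * (k as u64) * (n as u64)
}
```

Calling matmul_naive(&[1.0, 2.0, 3.0, 4.0], &[5.0, 6.0, 7.0, 8.0], 2, 2, 2) multiplies two 2x2 matrices, and dividing matmul_flops(m, k, n) by a kernel's runtime in seconds gives FLOP/s. Optimized GPU kernels compute the same result but tile the work across threads and memory levels, which is where the performance differences come from.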

I hadn't looked at Burn until a Phoronix reader pointed it out, but I'll be checking out their open-source software for use in some possible future benchmarks, namely [2]burn-bench.



[1] https://burn.dev/blog/sota-multiplatform-matmul/

[2] https://github.com/tracel-ai/burn-bench



oleid

It would be nice if the Food and Drug Administration stopped issuing warnings
about toxic substances and just gave me the names of one or two things still
safe to eat.
-- Robert Fuoss