
OpenBLAS 0.3.28 Brings More Optimizations, Meteor Lake & Emerald Rapids Support



OpenBLAS 0.3.28 was released today. This open-source optimized BLAS library caters to a wide range of processors across various architectures, and the new release brings yet more optimizations and new CPU-optimized code paths.

OpenBLAS 0.3.28 reworks the "HUGETLB" implementation inherited from GotoBLAS, improves multi-threaded GEMM performance for certain matrices, improves BLAS3 performance on large multi-core systems via enhanced parallelism, speeds up initial memory allocation, and includes a range of other optimizations and fixes.
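
For readers unfamiliar with the term, GEMM is the general matrix-matrix multiply at the heart of BLAS3: C = alpha * A * B + beta * C. The pure-Python sketch below only illustrates what the operation computes. The function name gemm and its argument order are illustrative assumptions, not the OpenBLAS API, and the code says nothing about how OpenBLAS implements or parallelizes its kernels.

```python
def gemm(alpha, a, b, beta, c):
    """Reference GEMM on lists of lists: returns alpha * (a @ b) + beta * c.

    Hypothetical helper for illustration only; not an OpenBLAS function.
    """
    m, k, n = len(a), len(b), len(b[0])
    # a must be m x k, b must be k x n, c must be m x n.
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if any(len(row) != n for row in b):
        raise ValueError("b has rows of unequal length")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape m x n")
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    c = [[1, 1], [1, 1]]
    print(gemm(2, a, b, 1, c))
```

An optimized library computes the same result but blocks the loops for cache reuse and spreads the work across threads, which is the area the multi-threading improvements in this release target.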

OpenBLAS 0.3.28 also brings official support for Intel Xeon Emerald Rapids and Intel Core Ultra (Meteor Lake) processors. Also new are auto-detection of Zhaoxin KX-7000 CPUs, a fix for auto-detection of old Intel Prescott CPUs, improved compiler options for CMake and LLVM builds on AVX-512 capable targets, and other x86_64 optimizations.

On the ARM64 side, there is improved GEMM performance on the Arm Neoverse V1, new optimized kernels for the A64FX, and other changes. This BLAS library update also includes a number of LoongArch, RISC-V, and POWER optimizations.

Downloads and more details on OpenBLAS 0.3.28, the latest release of this leading open-source BLAS implementation, are available via [1]GitHub.



[1] https://github.com/OpenMathLib/OpenBLAS/releases/tag/v0.3.28



phoronix
