AVX-512 Performance With 256-bit vs. 512-bit Data Path For AMD EPYC 9005 CPUs
([Processors])
- Reference: 0001497726
- News link: https://www.phoronix.com/review/amd-epyc-9755-avx512
Now past the launch day for the [1]AMD EPYC 9005 series server processors and having delivered [2]initial AMD EPYC Zen 5 benchmarks for the EPYC 9575F / EPYC 9755 / EPYC 9965 SKUs, it's onto one of my favorite areas of testing: the more focused benchmarks looking at specific changes and features of new processors. Today under the benchmarking microscope is the new AVX-512 512-bit data path of 5th Gen AMD EPYC, compared to using a 256-bit data path or disabling AVX-512 entirely.
[3]
The 5th Gen AMD EPYC "Turin" processors all now enjoy a 512-bit data path for faster Advanced Vector Extensions 512 (AVX-512) usage. Zen 4, with AMD's original AVX-512 implementation, relied on a 256-bit "double pumped" approach that worked well and proved to be very efficient. With Zen 5, depending upon the CPU, it can be either a 256-bit data path or a native 512-bit data path, as we've seen with the Ryzen 9000 series desktop CPUs and now with the EPYC 9005 server processors. The BIOS of the AMD EPYC "Turin" servers makes it possible not only to toggle AVX-512 on/off but also to select either a 256-bit or a 512-bit data path.
[4]
The ability from the BIOS/firmware to downgrade the FP512 data path to an FP256 data path, plus the ability to disable AVX-512 outright, makes for some interesting benchmarking comparisons.
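When flipping the AVX-512 on/off toggle between runs, it can be handy to confirm from Linux what the kernel actually sees. Below is a minimal Python sketch that lists the avx512* feature flags reported in /proc/cpuinfo. The helper name avx512_flags is hypothetical and not something from the testing here, and the sketch only checks whether the flags are exposed; it makes no attempt to detect whether the 256-bit or 512-bit data path is selected.

```python
# Hypothetical helper for illustration; not part of the benchmark setup described here.
def avx512_flags(cpuinfo_text):
    """Return the sorted avx512* feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # On x86 Linux the feature list lives on the line whose key is "flags".
        if sep and key.strip() == "flags":
            return sorted(f for f in value.split() if f.startswith("avx512"))
    return []


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            flags = avx512_flags(fh.read())
    except OSError:
        # Not on Linux, or /proc is unavailable.
        flags = []
    print("AVX-512 flags:", " ".join(flags) if flags else "none reported")
```

An empty result after a reboot would suggest AVX-512 is not being exposed to the operating system for that run.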
[5]
With a single AMD EPYC 9755 "Zen 5" processor installed, I ran benchmarks of various AVX-512 capable workloads on Ubuntu 24.04 LTS in the default configuration of AVX-512 with a 512-bit data path, then repeated the benchmarks with AVX-512 in a 256-bit data path configuration, and finally with AVX-512 disabled for reference purposes.
While carrying out these AVX-512 comparison benchmarks for AMD EPYC Zen 5, I was also monitoring the CPU package power consumption, for both total CPU power and performance-per-Watt. To look for any impact from the AVX-512 usage, the peak CPU frequency was monitored along with the CPU package temperature.
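For readers wanting to reproduce a performance-per-Watt figure on their own hardware, here is a rough Python sketch of the arithmetic. It assumes package energy is sampled as a cumulative microjoule counter, such as the energy_uj and max_energy_range_uj files under the Linux powercap sysfs interface, and that the counter wraps at most once between two reads. The function names are hypothetical and this is not necessarily how the numbers in this article were collected.

```python
# Hypothetical helpers for illustration; the counter semantics are assumed to
# follow the Linux powercap interface (cumulative microjoules with wraparound).

def energy_delta_uj(start_uj, end_uj, max_range_uj):
    """Microjoules consumed between two counter reads, allowing one wraparound."""
    if end_uj >= start_uj:
        return end_uj - start_uj
    # Counter wrapped: energy up to the maximum plus energy since the reset.
    return (max_range_uj - start_uj) + end_uj


def average_watts(start_uj, end_uj, max_range_uj, elapsed_s):
    """Average package power in Watts over the sampling interval."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    joules = energy_delta_uj(start_uj, end_uj, max_range_uj) / 1e6
    return joules / elapsed_s


def perf_per_watt(score, watts):
    """Benchmark score divided by average power; assumes higher score is better."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return score / watts
```

For benchmarks where lower is better (such as run time), the score would need to be inverted before dividing by power.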
[1] https://www.phoronix.com/review/amd-epyc-9005
[2] https://www.phoronix.com/review/amd-epyc-9965-9755-benchmarks
[3] https://www.phoronix.com/image-viewer.php?id=amd-epyc-9755-avx512&image=epyc_turin_avx512_1_lrg
[4] https://www.phoronix.com/image-viewer.php?id=amd-epyc-9755-avx512&image=epyc_turin_avx512_3_lrg
[5] https://www.phoronix.com/image-viewer.php?id=amd-epyc-9755-avx512&image=epyc_turin_avx512_2_lrg