New FFmpeg AVX-512 Optimizations Hit Up To 36x The Performance Of Plain C Code
([Multimedia] 5 Hours Ago
scene_sad + AVX-512)
- Reference: 0001561904
- News link: https://www.phoronix.com/news/FFmpeg-July-2025-AVX-512
- Source link:
Some commits merged today to FFmpeg Git add more hand-tuned assembly code for [1]AVX-512 on capable Intel and AMD processors.
Open-source multimedia developer Niklas Haas today upstreamed some additional AVX2 and AVX-512 tuning to FFmpeg, on top of the multimedia library's already vast array of hand-tuned code for leveraging Advanced Vector Extensions.
For FFmpeg's avfilter scene_sad code, there is now [2]an AVX-512 implementation that comes in at 36.31x the speed of the plain C code, according to benchmarks run by Niklas Haas. There was already an AVX2 path that achieved 25x the performance of the common C code, but the AVX-512 path now exceeds 36x.
[3]Another commit added high bit depth AVX2 and AVX-512 versions of the scene_sad avfilter code. There is around an 11x improvement over the common C code with AVX2, or around 22x when using AVX-512.
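For context on what is being accelerated: scene_sad computes a sum of absolute differences (SAD) between two frames, a measure used for scene change detection. Below is a minimal sketch of what a plain C version of such a loop can look like for 8-bit samples. The function name `sad_8bit` and its signature are hypothetical and chosen for illustration; this is not FFmpeg's actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Scalar sum of absolute differences between two 8-bit planes.
 * Hypothetical name and signature, for illustration only.
 * Strides are in bytes and may be larger than the visible width.
 */
static uint64_t sad_8bit(const uint8_t *a, ptrdiff_t stride_a,
                         const uint8_t *b, ptrdiff_t stride_b,
                         size_t width, size_t height)
{
    uint64_t sum = 0;

    for (size_t y = 0; y < height; y++) {
        /* One byte pair per iteration: this is the work SIMD code batches. */
        for (size_t x = 0; x < width; x++)
            sum += (uint64_t)abs((int)a[x] - (int)b[x]);
        a += stride_a;
        b += stride_b;
    }
    return sum;
}
```

A loop like this handles one pair of samples per iteration, whereas a 256-bit AVX2 register holds 32 bytes and a 512-bit AVX-512 register holds 64, which is part of why hand-written vector code can pull so far ahead of the scalar path. A high bit depth variant would read 16-bit samples instead of bytes.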
AVX-512 continues to pay off, particularly with the latest AMD Zen 4 / Zen 5 and recent Intel Xeon processors.
[1] https://www.phoronix.com/search/AVX-512
[2] https://github.com/FFmpeg/FFmpeg/commit/91f2d146d418d536e14b0d0c2d32f81cb95f6b7f
[3] https://github.com/FFmpeg/FFmpeg/commit/e44a1aaeecc14fc396e0c715969ddd3cc939933d