

Significant CRC32C Throughput Optimization On The Way To The Linux Kernel

([Linux Kernel] 2 Hours Ago Faster CRC32C Performance)


Google engineer Eric Biggers has worked on some very nice performance optimizations for the crypto code within the Linux kernel, such as [1]faster AES-GCM for Intel and AMD CPUs, [2]much faster AES-XTS disk/file encryption with modern CPUs, and [3]many other optimizations over the years. His latest work enhances CRC32C crypto performance for x86/x86_64 processors.

Biggers has patches pending to eliminate the jump table and excessive unrolling found within the CRC32C assembly code used on modern Intel/AMD processors. He explains in [4]this patch within his crypto-pending branch:

"crc32c-pcl-intel-asm_64.S has a loop with 1 to 127 iterations fully unrolled and uses a jump table to jump into the correct location. This optimization is misguided, as it bloats the binary code size and introduces an indirect call. x86_64 CPUs can predict loops well, so it is fine to just use a loop instead. Loop bookkeeping instructions can compete with the crc instructions for the ALUs, but this is easily mitigated by unrolling the loop by a smaller amount, such as 4 times.

Therefore, re-roll the loop and make related tweaks to the code.

This reduces the binary code size of crc_pclmul() from 4546 bytes to 418 bytes, a 91% reduction. In general it also makes the code faster, with some large improvements seen when retpoline is enabled."
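
The re-rolling idea in the commit message can be illustrated with a portable sketch. The kernel code is x86_64 assembly built around the hardware crc instructions; the C below is not that code. It is a bitwise software CRC32C with hypothetical function names (crc32c_byte, crc32c_rolled, crc32c_unrolled4), assumed here only to show the shape of a loop unrolled by 4 with a short tail loop, as opposed to a fully unrolled body entered through a jump table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC32C (Castagnoli) polynomial in reflected form. */
#define CRC32C_POLY 0x82F63B78u

/* Fold one byte into the running CRC, one bit at a time.
 * Illustration only: no lookup table and no hardware crc instruction. */
uint32_t crc32c_byte(uint32_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ (CRC32C_POLY & (0u - (crc & 1u)));
    return crc;
}

/* Plain loop: the loop bookkeeping runs once per byte. */
uint32_t crc32c_rolled(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (len--)
        crc = crc32c_byte(crc, *p++);
    return ~crc;
}

/* Same computation unrolled by 4: the bookkeeping runs once per four
 * bytes, and a short tail loop handles the 0 to 3 leftover bytes.
 * No jump table is needed to enter the loop. */
uint32_t crc32c_unrolled4(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (len >= 4) {
        crc = crc32c_byte(crc, p[0]);
        crc = crc32c_byte(crc, p[1]);
        crc = crc32c_byte(crc, p[2]);
        crc = crc32c_byte(crc, p[3]);
        p += 4;
        len -= 4;
    }
    while (len--)
        crc = crc32c_byte(crc, *p++);
    return ~crc;
}

/* Sanity check against the standard CRC-32C check value for "123456789". */
void crc32c_self_check(void)
{
    static const uint8_t msg[] = "123456789";
    assert(crc32c_rolled(msg, 9) == 0xE3069283u);
    assert(crc32c_unrolled4(msg, 9) == crc32c_rolled(msg, 9));
}
```

Both variants compute the same checksum. The only difference is how often the loop counter and pointer are updated, which is the trade-off the commit message describes between loop bookkeeping and code size.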

With Retpoline enabled, the default state on Intel and AMD CPUs, throughput improves by as much as 66% on Intel Emerald Rapids and by as much as 29% on AMD Zen 2.

Hopefully this new code will be buttoned up in time for the upcoming Linux v6.13 kernel cycle, boosting CRC32C kernel crypto performance on modern Intel and AMD processors.



[1] https://www.phoronix.com/news/AES-GCM-Intel-AMD-Linux-6.11

[2] https://www.phoronix.com/news/Linux-6.10-Crypto

[3] https://www.phoronix.com/search/Eric+Biggers

[4] https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git/commit/?h=crypto-pending&id=84004e2996a00fdf527d9269fe33c0b254427f1f


