News: 0001500534


Significant CRC32C Throughput Optimization On The Way To The Linux Kernel

([Linux Kernel] 2 Hours Ago Faster CRC32C Performance)


Google engineer Eric Biggers has worked on some very nice performance optimizations for the crypto code within the Linux kernel, such as [1]faster AES-GCM for Intel and AMD CPUs, [2]much faster AES-XTS disk/file encryption with modern CPUs, and [3]many other optimizations over the years. His latest work enhances CRC32C crypto performance on x86/x86_64 processors.

Biggers has patches pending to eliminate the jump table and excessive unrolling found within the CRC32C assembly code used on modern Intel/AMD processors. He explains in [4]this patch within his crypto-pending branch:

"crc32c-pcl-intel-asm_64.S has a loop with 1 to 127 iterations fully unrolled and uses a jump table to jump into the correct location. This optimization is misguided, as it bloats the binary code size and introduces an indirect call. x86_64 CPUs can predict loops well, so it is fine to just use a loop instead. Loop bookkeeping instructions can compete with the crc instructions for the ALUs, but this is easily mitigated by unrolling the loop by a smaller amount, such as 4 times.

Therefore, re-roll the loop and make related tweaks to the code.

This reduces the binary code size of crc_pclmul() from 4546 bytes to 418 bytes, a 91% reduction. In general it also makes the code faster, with some large improvements seen when retpoline is enabled."
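To make the idea of "re-rolling" concrete, here is a minimal sketch in portable C of a CRC32C loop unrolled by 4 with a short remainder loop. It is not the kernel's code: the real implementation is x86_64 assembly built on the hardware crc32 and PCLMULQDQ instructions, whereas this sketch uses a slow bitwise software CRC purely for illustration. The function names `crc32c_byte`, `crc32c_simple` and `crc32c_unrolled4` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold one byte into a CRC32C state, bit by bit.
 * 0x82F63B78 is the reflected Castagnoli polynomial. */
static inline uint32_t crc32c_byte(uint32_t crc, uint8_t b)
{
    crc ^= b;
    for (int k = 0; k < 8; k++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

/* Reference version: one byte per loop iteration. */
uint32_t crc32c_simple(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (len--)
        crc = crc32c_byte(crc, *p++);
    return ~crc;
}

/* Modestly unrolled version: four steps per iteration, so the loop
 * counter and branch are paid once per four CRC steps. No jump table
 * is needed; a plain remainder loop handles the 0..3 leftover bytes. */
uint32_t crc32c_unrolled4(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (; len >= 4; len -= 4, p += 4) {
        crc = crc32c_byte(crc, p[0]);
        crc = crc32c_byte(crc, p[1]);
        crc = crc32c_byte(crc, p[2]);
        crc = crc32c_byte(crc, p[3]);
    }
    while (len--)
        crc = crc32c_byte(crc, *p++);
    return ~crc;
}
```

Both functions return the same checksum for any input; for the standard test string "123456789" that is the well-known CRC32C check value 0xE3069283. The point of the sketch is only the loop shape: a small fixed unroll factor keeps the code compact while amortizing the loop bookkeeping, which is the trade-off the commit message describes.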

With Retpoline enabled, which is the default state for Intel and AMD CPUs, throughput improves by as much as 66% on Intel Emerald Rapids and by as much as 29% on AMD Zen 2.

Hopefully this new code will be buttoned up in time for the upcoming Linux v6.13 kernel cycle, boosting CRC32C kernel crypto performance on modern Intel and AMD processors.



[1] https://www.phoronix.com/news/AES-GCM-Intel-AMD-Linux-6.11

[2] https://www.phoronix.com/news/Linux-6.10-Crypto

[3] https://www.phoronix.com/search/Eric+Biggers

[4] https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git/commit/?h=crypto-pending&id=84004e2996a00fdf527d9269fe33c0b254427f1f


