News: 0001620257


ARM NEON Accelerated CRC64 Optimization Shows Nearly 6x Improvement

([Linux Kernel] 69 Minutes Ago ARM NEON CRC64)


A patch posted today to the Linux kernel mailing list adds an ARM64-optimized CRC64-NVMe implementation that is nearly 6x faster than the generic code in the author's benchmark.

Open-source developer Demian Shulhan wrote this NEON-optimized CRC64 implementation, which joins the existing architecture-specific CRC64 implementations such as those for x86_64 and RISC-V. The speed-up is intended to relieve a CRC64 bottleneck in NVMe and other storage subsystems.

Shulhan explained the work in the patch description; the nearly 6x gain was measured on an Arm Cortex-A72 SoC. He wrote:

"Implement an optimized CRC64 (NVMe) algorithm for ARM64 using NEON Polynomial Multiply Long (PMULL) instructions. The generic shift-and-XOR software implementation is slow, which creates a bottleneck in NVMe and other storage subsystems.

The acceleration is implemented using C intrinsics (arm_neon.h) rather than raw assembly for better readability and maintainability.

Key highlights of this implementation:

- Uses 4KB chunking inside scoped_ksimd() to avoid preemption latency spikes on large buffers.

- Pre-calculates and loads fold constants via vld1q_u64() to minimize register spilling.

- Benchmarks show the break-even point against the generic implementation is around 128 bytes. The PMULL path is enabled only for len >= 128.

- Safely falls back to the generic implementation on Big-Endian systems.

Performance results (kunit crc_benchmark on Cortex-A72):

- Generic (len=4096): ~268 MB/s

- PMULL (len=4096): ~1556 MB/s (nearly 6x improvement)"
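To make the quoted description a little more concrete, here is a minimal C sketch of the "generic shift-and-XOR" approach together with the length-based dispatch and 4KB chunking the patch describes. It is not code from the patch: the function names, the PMULL_MIN_LEN and CHUNK_SIZE macros, and the stand-in "fast" routine are all hypothetical, and the real accelerated path would use NEON PMULL folding rather than calling back into the bitwise loop. The polynomial constant is the bit-reflected form of the CRC-64/NVMe polynomial 0xAD93D23594C93659.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the CRC-64/NVMe polynomial 0xAD93D23594C93659. */
#define CRC64_NVME_POLY_REFLECTED 0x9A6C9329AC4BC9B5ULL

/* Values taken from the patch description; the macro names are made up here. */
#define PMULL_MIN_LEN 128   /* break-even point against the generic code */
#define CHUNK_SIZE    4096  /* 4KB chunking to bound time spent per SIMD section */

/* Generic shift-and-XOR update: one bit per inner-loop iteration.
 * Operates on the raw (already inverted) CRC state. */
static uint64_t crc64_nvme_bitwise(uint64_t state, const uint8_t *p, size_t len)
{
    while (len--) {
        state ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            state = (state >> 1) ^ ((state & 1) ? CRC64_NVME_POLY_REFLECTED : 0);
    }
    return state;
}

/* Hypothetical stand-in for the accelerated routine. A real version would
 * fold 128-bit blocks with PMULL; this one only reuses the bitwise loop so
 * the sketch stays portable and produces identical results. */
static uint64_t crc64_nvme_fast_stub(uint64_t state, const uint8_t *p, size_t len)
{
    return crc64_nvme_bitwise(state, p, len);
}

/* Dispatcher: short buffers take the generic path, longer ones are fed to
 * the fast path in chunks of at most CHUNK_SIZE bytes. In the kernel, each
 * chunk would be processed inside its own scoped_ksimd() section. */
uint64_t crc64_nvme_sketch(uint64_t crc, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint64_t state = ~crc;  /* pre-invert, so crc = 0 starts from all ones */

    if (len < PMULL_MIN_LEN)
        return ~crc64_nvme_bitwise(state, p, len);

    while (len > 0) {
        size_t n = len < CHUNK_SIZE ? len : CHUNK_SIZE;

        state = crc64_nvme_fast_stub(state, p, n);
        p += n;
        len -= n;
    }
    return ~state;  /* post-invert */
}
```

Because the state is inverted on entry and exit, a buffer can be checksummed in pieces by passing the previous return value back in as crc, which is also what makes the per-chunk loop straightforward.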

It's surprising that it took until now for the Linux kernel to see an ARM64/NEON-optimized CRC64 implementation, given that it amounts to just a little more than one hundred lines of code.

The patch is now out for review on the Linux kernel mailing list [1].



[1] https://lore.kernel.org/all/20260317065425.2684093-1-demyansh@gmail.com/


