Intel QATlib 24.09 Improves Performance For Multi-Threaded Apps & Multi-Socket Servers
- Reference: 0001490790
- News link: https://www.phoronix.com/news/Intel-QATlib-24.09
Intel today released a new version of QATlib, the QuickAssist Technology library for hardware-accelerated offloading of security, authentication, and compression workloads. Recent Intel Xeon CPUs with built-in QAT accelerators stand to benefit a lot from the new QATlib 24.09 release.
Today's QATlib 24.09 release improves performance scaling for multi-threaded applications when using the new "--enable-icp-thread-specific-usdm" option. With this option, USDM allocates and handles memory on a per-thread basis and makes use of thread-local storage. The documentation goes on to explain:
"USDM allocates and handles memory specific to threads. (For multi-thread apps, allocated memory information will be maintained separately for each thread; employs thread local storage feature i.e. TLS. It avoids locking that was needed when a global data structure being used in non thread-specific implementation). NOTE: Any memory allocated by a thread must be freed by the same thread. If it passes the memory to other threads for use, it's responsible for any synchronisation between those threads. The thread which did the allocation must live until after all threads using the memory are finished with it, as any thread memory not yet freed may be cleaned up on termination of the thread."
The other exciting aspect of Intel QATlib 24.09 is setting the core affinity mapping based on NUMA node to improve performance for multi-socket platforms. QATlib can now enjoy near-linear scaling across multi-socket Intel Xeon servers.
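For readers unfamiliar with NUMA-based affinity, here is a rough, Linux-only Python sketch of the general idea. It is an assumption-laden illustration, not how QATlib implements its mapping: it reads the kernel's per-node CPU list from sysfs and pins the current process to those cores with `os.sched_setaffinity`. The helper names `parse_cpulist`, `cpus_for_node`, and `pin_to_node` are hypothetical.

```python
import os


def parse_cpulist(text):
    """Parse a kernel cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input, e.g. a node without CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpus_for_node(node):
    # Linux exposes the CPUs of each NUMA node under sysfs.
    path = f"/sys/devices/system/node/node{node}/cpulist"
    with open(path) as f:
        return parse_cpulist(f.read())


def pin_to_node(node):
    """Restrict the current process to the CPUs of one NUMA node (Linux only)."""
    # Keep only CPUs this process is already allowed to run on.
    cpus = cpus_for_node(node) & os.sched_getaffinity(0)
    if not cpus:
        raise RuntimeError(f"no usable CPUs on NUMA node {node}")
    os.sched_setaffinity(0, cpus)
    return cpus
```

On a two-socket server, calling `pin_to_node(0)` before doing work would keep the process on the cores of the first node, which is the kind of locality a NUMA-aware affinity mapping is meant to preserve.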
Plus there are various bug fixes in this QATlib library update. Downloads and more details via Intel on GitHub [1].
[1] https://github.com/intel/qatlib/releases/tag/24.09.0