LLVM's DTLTO Now More Efficiently Adding Files To The Link For Much Better Performance
([LLVM] 6 Hours Ago
Distributed ThinLTO)
- Reference: 0001623312
- News link: https://www.phoronix.com/news/LLVM-DTLTO-Faster-Files-Link
- Source link:
Last year [1]LLVM began landing their Distributed ThinLTO "DTLTO" support as an enhancement to their ThinLTO approach for link-time optimizations. An improvement merged this week to LLVM addresses a performance bottleneck discovered when adding files to the link.
Ben Dunbobbin's patch improves performance when adding files to the link with DTLTO. In the most dramatic case, on Windows with an AMD Ryzen 16-core processor, the mean time spent adding DTLTO files to the link dropped from ~2799 ms to ~158 ms. On Linux, the patch reduces that time from ~255 ms to ~42 ms.
Ben explained the change in [2]the patch:
"The in-process ThinLTO backend typically generates object files in memory and adds them directly to the link, except when the ThinLTO cache is in use. DTLTO is unusual in that it adds files to the link from disk in all cases.
When the ThinLTO cache is not in use, ThinLTO adds files via an AddStreamFn callback provided by the linker, which ultimately appends to a SmallVector in LLD. When the cache is in use, the linker supplies an AddBufferFn callback that adds files more efficiently (by moving MemoryBuffer ownership).
This patch allows clients to optionally provide an AddBufferFn to the DTLTO ThinLTO backend. When available, the backend uses this to add files to the link more efficiently.
For a Clang link (Debug build with sanitizers and instrumentation) using an optimized toolchain (PGO non-LTO, llvmorg-22.1.0), measuring the mean Add DTLTO files to the link time trace scope duration:
- On Windows (Windows 11 Pro Build 26200, AMD Family 25 @ ~4.5 GHz, 16 cores/32 threads, 64 GB RAM), this patch reduces the mean from 2799.148 ms to 157.972 ms.
- On Linux (Ubuntu 24.04.3 LTS Kernel 6.14, Ryzen 9 5950X, 16 cores/32 threads, boost up to 5.09 GHz, 64 GB RAM), this patch reduces the mean from 255.291 ms to 41.630 ms."
These are sizable relative improvements, roughly 17.7x on Windows and 6.1x on Linux, from reworking just a few dozen lines of code. The change lands in LLVM/Clang 23.
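To make the difference between the two callback styles described in the patch notes more concrete, here is a minimal C++ sketch. It is not LLVM's code and does not use LLVM's API: `FileBuffer`, `Link`, `addViaStream` and `addViaBuffer` are hypothetical stand-ins invented for illustration, and the 4096-byte chunk size is an arbitrary assumption. It only shows the general idea of appending bytes into a growing vector versus handing over ownership of a buffer that already holds the file.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an owned in-memory file (NOT llvm::MemoryBuffer).
struct FileBuffer {
    std::string name;
    std::vector<char> bytes;
};

// Hypothetical stand-in for the linker's collection of input files.
struct Link {
    std::vector<std::vector<char>> copiedFiles;          // filled by the stream-style path
    std::vector<std::unique_ptr<FileBuffer>> ownedFiles; // filled by the buffer-style path
    std::size_t bytesCopied = 0;                         // bytes copied by the stream-style path
};

// Stream-style add: the file contents are pushed through in chunks and
// appended to a growing vector, so every byte is copied once more.
void addViaStream(Link &link, const FileBuffer &file, std::size_t chunkSize = 4096) {
    std::vector<char> out;
    for (std::size_t off = 0; off < file.bytes.size(); off += chunkSize) {
        const std::size_t n = std::min(chunkSize, file.bytes.size() - off);
        const char *src = file.bytes.data() + off;
        out.insert(out.end(), src, src + n);
        link.bytesCopied += n;
    }
    link.copiedFiles.push_back(std::move(out));
}

// Buffer-style add: ownership of the existing buffer moves to the link,
// so no file bytes are copied at all.
void addViaBuffer(Link &link, std::unique_ptr<FileBuffer> file) {
    link.ownedFiles.push_back(std::move(file));
}
```

In this toy model, calling `addViaStream` on a 10,000-byte file copies all 10,000 bytes, while `addViaBuffer` only transfers a pointer. The real cost difference in LLD depends on details not modeled here, such as file I/O and allocator behavior.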
[1] https://www.phoronix.com/news/LLVM-DTLTO-Distributed-Thin
[2] https://github.com/llvm/llvm-project/commit/80b304d14beec80381bd11eb19099bb54d213092