Mesa 24.3 Sees "Substantial Improvement" To AMD Clear/Copy-Buffer Compute Shader
- Reference: 0001484707
- News link: https://www.phoronix.com/news/AMD-Optimize-Clear-CB-Shader
Well-known AMD Mesa developer Marek Olšák continues to relentlessly optimize the RadeonSI Gallium3D driver and related code so that the AMD graphics stack can reach peak performance.
Recently Marek has been working to optimize the clear/copy_buffer compute shader and move it into AMD common code, adding support for unaligned copies as part of that work.
In the merge request opened a few weeks ago, Marek describes the work as a "substantial improvement", and it was merged overnight for Mesa 24.3. Marek notes in the merge request [1]:
"This is a substantial improvement of the clear/copy_buffer compute shader in radeonsi, which is also moved to src/amd/common.
This adds support for unaligned buffer clears and copies while maintaining the same performance as aligned clears and copies. The optimal alignment for buffer offsets is 256, not 4.
More chip-specific tuning will follow, but this is already optimal for Navi31."
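To make the alignment figures in the quote concrete, here is a minimal C sketch of how a byte range sits relative to an alignment boundary. It is not the Mesa shader and makes no claim about how the merge request handles unaligned ranges; `range_split` and `split_range` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: describes how the byte range
 * [offset, offset + size) sits relative to an alignment boundary. */
struct range_split {
    uint64_t head; /* bytes before the first aligned boundary */
    uint64_t body; /* bytes covered by whole aligned blocks */
    uint64_t tail; /* leftover bytes after the last whole block */
};

static struct range_split split_range(uint64_t offset, uint64_t size,
                                      uint64_t align)
{
    struct range_split s;
    uint64_t misalign, rest;

    /* The alignment must be a non-zero power of two, e.g. 4 or 256. */
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Distance from the offset up to the next aligned boundary. */
    misalign = offset & (align - 1);
    s.head = misalign ? align - misalign : 0;
    if (s.head > size)
        s.head = size; /* the range ends before reaching a boundary */

    /* Whole aligned blocks, then whatever is left over. */
    rest = size - s.head;
    s.body = rest & ~(align - 1);
    s.tail = rest - s.body;
    return s;
}
```

For a 1000-byte range starting at offset 3, `split_range(3, 1000, 4)` reports a 1-byte head, a 996-byte body and a 3-byte tail, while `split_range(3, 1000, 256)` reports a 253-byte head, a 512-byte body and a 235-byte tail. With the larger alignment, far more of the same range falls outside whole aligned blocks.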
Great to see more of Marek's optimizations land in Mesa Git. It will be interesting to see what further tuning Marek achieves in time for the Mesa 24.3 stable release due out in Q4.
[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208