Kernel Shader Variants Merged For Mesa's Rusticl OpenCL Driver
[Mesa] 14 Minutes Ago
- Reference: 0001490913
- News link: https://www.phoronix.com/news/Rusticl-Shader-Variants
- Source link:
The latest work on the Rust-written OpenCL driver "[1]Rusticl" by Red Hat engineer Karol Herbst adds support for kernel shader variants and introduces an optimized kernel variant.
Karol described the Rusticl kernel shader variants effort in [2]the merge request as follows:
"Something I always wanted to do. This allows us to compile shader variants to get some perf optimizations going for cases like offsets being 0 and we won't have to emit pointless ALU operations."
In the commit introducing an optimized kernel variant he went on to add:
"By default we have to take into account that the application could set offsets, or that one kernel launch won't fit into a single hw dispatch.
In order to mitigate the overhead it causes at kernel runtime, and because those things are in most cases irrelevant, we compile an optimized kernel making a few assumptions.
We also make use of the workgroup_size_hint as an additional optimization.
This should speed up relatively small kernels significantly as it can cut the instruction count in half for those."
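To make the idea concrete, here is a minimal Rust sketch of how a driver could pick between a general and an optimized kernel variant at launch time. It is an illustration only and not Rusticl's actual code: KernelVariant, Launch, select_variant, global_id_default and global_id_optimized are all hypothetical names, and the ID arithmetic is an assumed simplification of what a compiled kernel would do.

```rust
// Hypothetical illustration; none of these names come from Rusticl itself.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KernelVariant {
    /// Handles global offsets and launches split across several hw dispatches.
    Default,
    /// Assumes zero offsets and a launch that fits into one hw dispatch.
    Optimized,
}

pub struct Launch {
    /// Global work offset set by the application, per dimension.
    pub global_offset: [usize; 3],
    /// Number of workgroups requested, per dimension.
    pub grid: [usize; 3],
}

/// Pick the optimized variant only when both of its assumptions hold.
pub fn select_variant(launch: &Launch, max_grid: [usize; 3]) -> KernelVariant {
    let zero_offsets = launch.global_offset.iter().all(|&o| o == 0);
    let single_dispatch = launch
        .grid
        .iter()
        .zip(max_grid.iter())
        .all(|(&g, &m)| g <= m);
    if zero_offsets && single_dispatch {
        KernelVariant::Optimized
    } else {
        KernelVariant::Default
    }
}

/// General case: add the base of the current hw dispatch and the user offset.
pub fn global_id_default(
    group_id: usize,
    group_base: usize,
    local_size: usize,
    local_id: usize,
    offset: usize,
) -> usize {
    (group_base + group_id) * local_size + local_id + offset
}

/// Optimized case: both additions disappear because they are known to be zero.
pub fn global_id_optimized(group_id: usize, local_size: usize, local_id: usize) -> usize {
    group_id * local_size + local_id
}
```

When the offset and dispatch base are both zero, the two ID functions return the same value, which is why dropping the extra additions is safe under those assumptions. For a very small kernel, such bookkeeping can make up a large share of the total instructions.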
No benchmark numbers were provided as part of the merge, but this shader variants work looks likely to help performance going forward for this open-source, cross-vendor OpenCL driver.
I'll try to find the time to work on some fresh Rusticl benchmarks soon. This code was merged for next quarter's Mesa 24.3 release.
[1] https://www.phoronix.com/search/Rusticl
[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30152