OpenCL 3.1.1 Released To Address A Possible Performance Regression
[Standards] 6 Hours Ago
- Reference: 0001635248
- News link: https://www.phoronix.com/news/OpenCL-3.1.1-Released
- Source link:
Earlier this month Khronos released [1]the OpenCL 3.1 specification, which focused on enhancing AI and HPC workloads. Out today is OpenCL 3.1.1, a point release aimed at addressing a possible performance regression in OpenCL 3.1.
OpenCL 3.1.1 reverts the short-lived OpenCL 3.1 behavior in which clGetEventInfo returning CL_COMPLETE acted as a host synchronization point. Those wanting a host synchronization point should instead call a function that waits on the OpenCL event, such as clWaitForEvents. The [2]pull request argued for restoring the OpenCL 3.0 semantics to avoid a performance regression from the cost of host synchronization:
"This PR changes the behavior of clGetEventInfo(CL_EVENT_COMMAND_EXECUTION_STATUS) returning CL_COMPLETE back to the behavior in OpenCL 3.0. This avoids a potential performance regression when the stronger host synchronization point is not needed, for example to determine if the event is CL_COMPLETE to query event profiling data."
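For illustration, here is a minimal C sketch of the two patterns the pull request distinguishes: a non-blocking status check that only gates a profiling query, and an explicit wait for when the host really does need a synchronization point. The helper names try_read_elapsed_ns and wait_for_host_sync are hypothetical, and unless the code is built with -DUSE_OPENCL the OpenCL types and entry points are small stand-ins that are assumptions of this example (a fake event that reports CL_RUNNING for a few polls), not the real library. With the real API, profiling queries also assume the command queue was created with CL_QUEUE_PROFILING_ENABLE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef USE_OPENCL
#include <CL/cl.h> /* real API: build with -DUSE_OPENCL and link -lOpenCL */
#else
/* ASSUMPTION: hypothetical stand-ins, NOT the real OpenCL library. They mimic
 * the three entry points used below so the sketch builds without an SDK.
 * A fake event reports CL_RUNNING for `polls_left` status queries. */
typedef int cl_int;
typedef unsigned int cl_uint;
typedef unsigned long long cl_ulong;
struct fake_event { int polls_left; cl_ulong start_ns, end_ns; };
typedef struct fake_event *cl_event;
enum { CL_SUCCESS = 0, CL_COMPLETE = 0, CL_RUNNING = 1 };
enum { CL_EVENT_COMMAND_EXECUTION_STATUS = 0x11D3,
       CL_PROFILING_COMMAND_START = 0x1282,
       CL_PROFILING_COMMAND_END = 0x1283 };

static cl_int clGetEventInfo(cl_event ev, cl_uint name, size_t size,
                             void *value, size_t *size_ret) {
    cl_int status = CL_COMPLETE;
    (void)name; (void)size; (void)size_ret;
    if (ev->polls_left > 0) { ev->polls_left--; status = CL_RUNNING; }
    memcpy(value, &status, sizeof status);
    return CL_SUCCESS;
}

static cl_int clGetEventProfilingInfo(cl_event ev, cl_uint name, size_t size,
                                      void *value, size_t *size_ret) {
    cl_ulong t = (name == CL_PROFILING_COMMAND_START) ? ev->start_ns
                                                      : ev->end_ns;
    (void)size; (void)size_ret;
    memcpy(value, &t, sizeof t);
    return CL_SUCCESS;
}

static cl_int clWaitForEvents(cl_uint n, const cl_event *list) {
    for (cl_uint i = 0; i < n; i++)
        list[i]->polls_left = 0; /* pretend we blocked until completion */
    return CL_SUCCESS;
}
#endif

/* Non-blocking check. Returns 1 and sets *elapsed_ns if the command has
 * finished, 0 if it is still in flight, and -1 on an API error or an
 * abnormally terminated command (negative execution status). */
static int try_read_elapsed_ns(cl_event ev, cl_ulong *elapsed_ns) {
    cl_int status = CL_RUNNING;
    cl_ulong start = 0, end = 0;
    assert(elapsed_ns != NULL);
    if (clGetEventInfo(ev, CL_EVENT_COMMAND_EXECUTION_STATUS,
                       sizeof status, &status, NULL) != CL_SUCCESS ||
        status < 0)
        return -1;
    if (status != CL_COMPLETE)
        return 0;
    if (clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_START,
                                sizeof start, &start, NULL) != CL_SUCCESS ||
        clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_END,
                                sizeof end, &end, NULL) != CL_SUCCESS)
        return -1;
    *elapsed_ns = end - start;
    return 1;
}

/* Blocking wait: the call to reach for when the host needs a
 * synchronization point rather than just a status answer. */
static cl_int wait_for_host_sync(cl_event ev) {
    return clWaitForEvents(1, &ev);
}
```

Under the restored semantics, the first helper is the cheap path for the profiling use case the pull request mentions. Code that relies on a host synchronization point should go through the second helper.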
OpenCL 3.1.1 also reserves some enum blocks for forthcoming Intel and Qualcomm extensions and includes a few other minor fixes, but the main change motivating this quick point release is the reverted clGetEventInfo behavior.
The OpenCL 3.1.1 spec can be found on [3]GitHub.
[1] https://www.phoronix.com/news/OpenCL-3.1-Released
[2] https://github.com/KhronosGroup/OpenCL-Docs/pull/1558
[3] https://github.com/KhronosGroup/OpenCL-Docs/releases/tag/v3.1.1