Intel Releases OpenVINO 2026 With Improved NPU Handling, Expanded LLM Support
([Intel] 5 Hours Ago
OpenVINO 2026.0)
- Reference: 0001615509
- News link: https://www.phoronix.com/news/Intel-OpenVINO-2026.0-Released
Intel's open-source OpenVINO AI toolkit is out with its first major release of 2026. Today's OpenVINO 2026.0 release brings expanded large language model (LLM) support, improved Intel NPU support for Core Ultra systems, and a variety of other enhancements benefiting Intel's CPU / NPU / GPU product range for AI.
OpenVINO 2026.0 adds support for CPU and GPU execution of the GPT-OSS-20B, MiniCPM-V-4_5-8B, and MiniCPM-o-2.6 models. It's a bit surprising it took until now to formally support OpenAI's GPT-OSS-20B, but in any event it's now supported with OpenVINO 2026.0.
For NPUs, there is also new support for several smaller models: MiniCPM-o-2.6, Qwen2.5-1B-Instruct, Qwen3-Embedding-0.6B, and Qwen-2.5-coder-0.5B.
OpenVINO GenAI meanwhile added support for word-level timestamps for more accurate transcriptions and subtitling, bringing it closer to parity with the OpenAI Whisper and FasterWhisper implementations. OpenVINO 2026.0 also supports int4 data-aware weight compression for 3D MatMuls, letting MoE LLMs run with lower memory/bandwidth requirements and improved accuracy. There is also new VLM pipeline support to enhance Agentic AI framework integration with OpenVINO GenAI, the OpenVINO GenAI code now supports speculative decoding on NPUs for better performance, and there are various other improvements.
The OpenVINO 2026.0 release also enhances the Intel Core Ultra NPU support by providing compiler integration with the NPU plug-in to support ahead-of-time and on-device compilation without depending upon OEM driver updates. Intel's aim here is to provide "a single, ready-to-ship package that reduces integration friction and accelerates time-to-value."
Downloads and more details on the OpenVINO 2026.0 release are available via [1]GitHub. I'll have new [2]OpenVINO benchmarks and [3]OpenVINO GenAI benchmarks out soon.
[1] https://github.com/openvinotoolkit/openvino/releases/tag/2026.0.0
[2] https://openbenchmarking.org/test/pts/openvino#results
[3] https://openbenchmarking.org/test/pts/openvino-genai#results