
Intel's OpenVINO 2025.0 Brings Support For Deepseek Models, Better AI Performance

([Intel] 3 Hours Ago OpenVINO 2025.0)


Intel's software engineers working on the OpenVINO AI toolkit today released OpenVINO 2025.0, which brings support for the much-talked-about DeepSeek models along with other large language models (LLMs), performance improvements for some of the existing model support, and other changes.

New model support with Intel's OpenVINO 2025.0 open-source AI toolkit includes Qwen 2.5, DeepSeek-R1-Distill-Llama-8B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-1.5B, FLUX.1 Schnell, and FLUX.1 Dev.

OpenVINO 2025.0 also delivers better Whisper model performance on CPUs, integrated GPUs, and discrete GPUs with OpenVINO's GenAI API. Plus there is initial torch.compile support for Intel NPUs, allowing PyTorch code to target the NPU.
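As a rough illustration of what using the GenAI API looks like in practice, here is a minimal Python sketch with one of the newly supported DeepSeek distill models. The model directory name is hypothetical (it would be the model exported to OpenVINO IR format), and the helper falls back to a placeholder string when openvino-genai or the model is unavailable, since this is illustrative rather than official sample code:

```python
def generate_reply(prompt: str,
                   model_dir: str = "DeepSeek-R1-Distill-Qwen-1.5B-ov",
                   device: str = "CPU") -> str:
    """Run text generation via OpenVINO's GenAI API.

    Hedged sketch: `model_dir` is a hypothetical path to a model already
    exported to OpenVINO IR format. The device string can be "CPU", "GPU",
    or "NPU". Falls back gracefully when openvino-genai or the exported
    model directory is not available on this machine.
    """
    try:
        import openvino_genai as ov_genai

        # LLMPipeline loads the exported model and runs generation
        # on the requested device.
        pipe = ov_genai.LLMPipeline(model_dir, device)
        return pipe.generate(prompt, max_new_tokens=64)
    except Exception:
        # Illustration-only fallback: openvino-genai not installed,
        # or the model directory does not exist.
        return "openvino-genai or the exported model is unavailable"
```

Models are typically exported to the expected IR layout ahead of time (for example with the optimum-intel export tooling) before being handed to the pipeline.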

OpenVINO 2025.0 also brings improved second-token latency for LLMs, INT8 KV cache compression now enabled on CPUs, support for Core Ultra 200H "Arrow Lake H" processors, OpenVINO backend support with the Triton Inference Server, and native Windows Server deployments for the OpenVINO Model Server.
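The initial torch.compile support for Intel NPUs mentioned above can be sketched roughly as follows. This is an assumption-laden illustration, not official sample code: it presumes torch and OpenVINO's openvino.torch backend extension are installed, and returns the model unchanged when they are not:

```python
def compile_for_openvino(model, device: str = "NPU"):
    """Wrap a model with torch.compile using the OpenVINO backend.

    Hedged sketch: importing openvino.torch registers the "openvino"
    backend with torch.compile, and the `options` dict selects the
    target device ("CPU", "GPU", or "NPU"). When torch or the OpenVINO
    backend is unavailable, the model is returned unchanged so the
    sketch stays harmless for illustration purposes.
    """
    try:
        import torch
        import openvino.torch  # noqa: F401  # registers the "openvino" backend
        return torch.compile(model, backend="openvino",
                             options={"device": device})
    except Exception:
        # Fallback for environments without torch/openvino installed.
        return model
```

In actual use, `model` would be a torch.nn.Module; the compiled wrapper then routes inference through OpenVINO on the selected device.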

Downloads and more details on the just-released OpenVINO 2025.0 are available via [1]GitHub. I'll have new [2]OpenVINO benchmarks and [3]OpenVINO GenAI benchmarks soon on Phoronix.



[1] https://github.com/openvinotoolkit/openvino/releases/tag/2025.0.0

[2] https://openbenchmarking.org/test/pts/openvino

[3] https://openbenchmarking.org/test/pts/openvino-genai


