Intel Releases OpenVINO 2024.2 With Llama 3 Optimizations, More AVX2 & AVX-512 Optimizations

([Intel] 4 Hours Ago OpenVINO 2024.2)

Reference: 0001471458
News link: https://www.phoronix.com/news/OpenVINO-2024.2-Released

Intel today released OpenVINO 2024.2, the newest version of its open-source AI toolkit for optimizing and deploying deep learning (AI) inference models across a range of AI frameworks and hardware types.

With OpenVINO 2024.2, Intel has continued optimizing for Meta's Llama 3 large language model. The release brings more Llama 3 optimizations for execution across CPUs, integrated GPUs, and discrete GPUs to further enhance performance while also making more efficient use of memory.

OpenVINO 2024.2 also adds support for Phi-3-mini AI models, broader large language model support, support for Intel Atom Processor X Series, preview support for Intel Xeon 6 processors, and more AVX2/AVX-512 tuning. Intel is seeing a "significant improvement" in second token latency and memory footprint of FP16 weight LLMs with AVX2 on Intel Core CPUs and with AVX-512 on Intel Xeon processors when leveraging small batch sizes.

Downloads and more details on the OpenVINO 2024.2 release are available via [1]GitHub.

[1] https://github.com/openvinotoolkit/openvino/releases/tag/2024.2.0
