Intel's New LLM-Scaler Beta Update Brings Whisper Model & GLM-4.5-Air Support
([Intel] 6 Hours Ago, llm-scaler-vllm)
- Reference: 0001571125
- News link: https://www.phoronix.com/news/Intel-llm-scaler-vllm-Whisper
Earlier this month [1]Intel released LLM-Scaler 1.0 as part of their Project Battlematrix initiative. LLM-Scaler is a Docker container effort for delivering speedy AI inference performance with multi-GPU scaling, PCIe P2P support, and more.
Following that v1.0 announcement, Intel software engineers yesterday released "0.9.0-b3" as a new beta of the llm-scaler-vllm Docker build.
The updated LLM-Scaler vLLM beta adds Whisper model support, GLM-4.5-Air support, image input for GLM-4.1V-9B-Thinking, and the dots.ocr model. On top of supporting the additional models, yesterday's beta also optimizes vLLM memory usage and enables the Ray pipeline-parallelism back-end.
Downloads and more details on the new Intel LLM-Scaler vLLM release via [3]GitHub.
[1] https://www.phoronix.com/news/Intel-LLM-Scaler-1.0
[3] https://github.com/intel/llm-scaler/releases/tag/vllm-0.9.0-b3