

Microsoft Researchers Develop Hyper-Efficient AI Model That Can Run On CPUs

(Thursday April 17, 2025 @11:30PM (BeauHD) from the resource-constrained dept.)


Microsoft has introduced BitNet b1.58 2B4T, the largest-scale 1-bit AI model to date with 2 billion parameters and the [1]ability to run efficiently on CPUs. It's [2]openly available under an MIT license. TechCrunch reports:

> The Microsoft researchers say that BitNet b1.58 2B4T is the first bitnet with 2 billion parameters, "parameters" being largely synonymous with "weights." Trained on a dataset of 4 trillion tokens -- equivalent to about 33 million books, by one estimate -- BitNet b1.58 2B4T outperforms traditional models of similar sizes, the researchers claim.

>

> BitNet b1.58 2B4T doesn't sweep the floor with rival 2 billion-parameter models, to be clear, but it seemingly holds its own. According to the researchers' testing, the model surpasses Meta's Llama 3.2 1B, Google's Gemma 3 1B, and Alibaba's Qwen 2.5 1.5B on benchmarks including GSM8K (a collection of grade-school-level math problems) and PIQA (which tests physical commonsense reasoning skills). Perhaps more impressively, BitNet b1.58 2B4T is speedier than other models of its size -- in some cases, twice the speed -- while using a fraction of the memory.

>

> There is a catch, however. Achieving that performance requires using Microsoft's custom framework, bitnet.cpp, which only works with certain hardware at the moment. Absent from the list of supported chips are GPUs, which dominate the AI infrastructure landscape.



[1] https://techcrunch.com/2025/04/16/microsoft-researchers-say-theyve-developed-a-hyper-efficient-ai-model-that-can-run-on-cpus/

[2] https://huggingface.co/microsoft/bitnet-b1.58-2B-4T
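
For anyone who wants to poke at the weights themselves, the checkpoint linked at [2] is published as microsoft/bitnet-b1.58-2B-4T on Hugging Face. Below is a minimal sketch using the stock transformers API, on the assumption that the checkpoint loads that way (the model card may require a specific transformers build); per the article, the speed and memory claims only apply when running through Microsoft's bitnet.cpp framework, not this path.

```python
# Minimal sketch: load the BitNet checkpoint with the stock transformers API.
# Assumptions: the checkpoint loads with a standard transformers build (the
# model card may require a specific version), and this path does NOT deliver
# the claimed efficiency -- that needs Microsoft's bitnet.cpp framework.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"  # from the Hugging Face link [2]

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # runs on CPU by default

prompt = "Why can ternary weights be fast on a CPU?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```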



Trained on about 33 million books .. (Score:4, Insightful)

by Mirnotoriety ( 10462951 )

BitNet .. Trained on a dataset of 4 trillion tokens -- equivalent to about 33 million books ..

Have the human writers of the books been compensated?

Re: (Score:1)

by Tablizer ( 95088 )

Bitnet is Skynet's little bot sister.

Re: (Score:2)

by presidenteloco ( 659168 )

Don't worry. Most of the training data was actually just dumb web comments.

Re: (Score:2)

by thegarbz ( 1787294 )

Did you compensate me for reading this text I just wrote? Well, actually... I guess it really depends on whether you get the point of what I just asked. If you get the point and learned something from it, I suppose you now owe me compensation? If, on the other hand, you don't understand the difference between copying something and inferring something from content, then I guess you don't owe me anything either.

Re: Trained on about 33 million books .. (Score:2)

by Big Hairy Gorilla ( 9839972 )

The difference is the speed and scale at which the AI ingests and leverages such a large amount of content. You can't do that. 33 million books.

Re: (Score:2)

by Roger W Moore ( 538166 )

Well, unless they are using illegal, pirated copies I'd assume the author was compensated when the copy of the book that they used was purchased. In the same way that an author has been compensated when you borrow a book from the library to read yourself.

Re: (Score:2)

by martin-boundary ( 547041 )

When you grow up and leave your mom's basement, you'll realize that food and rent aren't free. Then you'll understand what an economy is. HTH.

Re: (Score:2)

by Mr. Dollar Ton ( 5495648 )

> Have the authors of the textbooks you read been compensated by the knowledge you applied later in life?

If you learned well, usually yes. You helped the economy grow, and that would have made their lives better as well.

Not powers of 2? (Score:1)

by Tablizer ( 95088 )

> TFA: Bitnets quantize weights into just three values: -1, 0, and 1. In theory, that makes them far more efficient [on limited hardware]

Why 3? If the resolution is whatever one chooses (at the expense of more nodes), then wouldn't powers of two be more efficient on a regular CPU? "3" wastes numeric real-estate it seems.

Does it need an odd number in order to have a zero, a "central" value? Even then, it seems higher-resolution nodes would have to sacrifice less to achieve that.
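
One partial answer to the "why 3" question: keeping a zero lets the kernel skip a weight entirely, and ±1 turns the remaining multiplies into adds and subtracts, so the hot loop needs no multiplier at all. As a purely illustrative sketch of how full-precision weights end up in {-1, 0, 1}, here is an absmean-style quantizer along the lines of what the BitNet papers describe; the released model's exact recipe may differ.

```python
# Illustrative absmean-style ternary quantization (not necessarily the exact
# recipe of the released model): scale weights by their mean absolute value,
# then round and clip to {-1, 0, 1}, keeping the scale for dequantization.
import numpy as np

def ternarize(w: np.ndarray, eps: float = 1e-6):
    gamma = np.abs(w).mean() + eps            # absmean scale
    q = np.clip(np.round(w / gamma), -1, 1)   # ternary weights in {-1, 0, 1}
    return q.astype(np.int8), float(gamma)    # w is approximated as q * gamma

w = np.random.randn(4, 4).astype(np.float32)
q, gamma = ternarize(w)
print(q)                              # only -1, 0, 1 appear
print(np.abs(w - q * gamma).mean())   # error of the crude approximation
```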

Re: (Score:2)

by molarmass192 ( 608071 )

I want to know how a single bit, which, by design, can ***ONLY*** be 0 or 1, manages to add a sign bit without actually adding a sign bit. Did bits grow a 3rd state when I wasn't paying attention?

Re: Not powers of 2? (Score:1)

by Tschaine ( 10502969 )

It works out to about 1.58 bits of information per value, hence the "1.58b" in the name.

It's a little too complicated to type up here, but a web search for "1.58 bit LLM" will turn up some interesting reading.

Re: Not powers of 2? (Score:1)

by Tschaine ( 10502969 )

Whoops, I mean the "b1.58" in the name.
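
To put numbers on the 1.58 in the name, and on the earlier question of how a "bit" can hold three states: each ternary weight carries log2(3) ≈ 1.585 bits of information, and since 3^5 = 243 ≤ 256, five ternary values fit in one byte. A small sketch follows; the base-3 packing here is an illustration, not necessarily the layout bitnet.cpp actually uses.

```python
# Five ternary values ("trits") pack into one byte because 3**5 = 243 <= 256.
# This base-3 packing is an illustration, not necessarily bitnet.cpp's layout.
import math

print(math.log2(3))  # ~1.585 bits of information per ternary weight

def pack_trits(trits):
    """Pack five values from {-1, 0, 1} into a single byte, base-3."""
    assert len(trits) == 5 and all(t in (-1, 0, 1) for t in trits)
    byte = 0
    for t in trits:
        byte = byte * 3 + (t + 1)   # map {-1, 0, 1} -> {0, 1, 2}
    return byte

def unpack_trits(byte):
    """Inverse of pack_trits."""
    out = []
    for _ in range(5):
        out.append(byte % 3 - 1)
        byte //= 3
    return out[::-1]

vals = [1, -1, 0, 0, 1]
b = pack_trits(vals)
print(b, unpack_trits(b) == vals)    # round-trips: True
```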

Prior use (Score:1)

by Shag ( 3737 )

Uh, back in the 80s, BITnet ran on IBM mainframes and VAXen.

Alpha software (Score:2)

by Neuroelectronic ( 643221 )

The paper on bitnet.cpp recommends running the model on a limited number of cores to see the advertised efficiency improvements, and it failed to run on the i7 despite having the same 64 GB of RAM as the tested Apple M2. However, they claim to want to expand the supported hardware to Apple and Android phones.

[1]https://arxiv.org/abs/2504.12285 [arxiv.org]

[1] https://arxiv.org/abs/2504.12285

Running on CPU is not that hard... (Score:2)

by Froze ( 398171 )

I have llama3:8b running CPU only on my laptop. Sure, it's a little slow, but very usable. Am I missing something here?

Re: (Score:2)

by thegarbz ( 1787294 )

The only thing you're missing is the start of your own second sentence and its relevance in a world where AI is involved in everything you do. Imagine having to say "sure, it's a little slow" for everything you do with your PC. You'll quickly get frustrated. Fine in a world where you fire up an LLM once for shits and giggles, not so much fun when you use it extensively and continuously.

Re: (Score:2)

by drinkypoo ( 153816 )

I have an extremely middle-of-the-road PC, per the Steam survey, with just a little more CPU than average; it's a 5900X with a 4060 Ti 16GB. I have 64GB RAM, which is about double the average, but not too expensive for DDR4 (which most users still have), so most people could upgrade to it if they wanted. My favorite model right now is gemma3; it only runs about twice as fast on my GPU as it does on my CPU in the very respectably useful 12b variant. The 27b version (which is too big to fit on my cheapass GPU)
