News: 0001460279


Llamafile 0.8.1 GPU LLM Offloading Works Now With More AMD GPUs

([Radeon] 3 Hours Ago Llamafile 0.8.1)


It was just a few days ago that [1]Llamafile 0.8 released with LLaMA 3 and Grok support along with faster F16 performance. Now this project out of Mozilla for self-contained, easily re-distributable large language model (LLM) deployments is out with a new release.

The most significant change in Friday's Llamafile 0.8.1 release is getting GPU support working for more AMD graphics processors / accelerators. Some of the AMD offload code within Llamafile assumed that "GFX" graphics IP version identifiers were purely numeric rather than alpha-numeric, so GPU offload was mistakenly broken for a number of AMD Instinct / Radeon parts. For hardware like the Instinct MI250 with the GFX90A IP, the "A" was not parsed correctly and was not passed to the HIP compiler. The build would then error out, breaking Llamafile GPU acceleration on AMD GPUs with non-numeric characters in their GFX identifier. That's now fixed in Llamafile 0.8.1, so AMD GPU acceleration works on more hardware for Llamafile-based large language model deployments.
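
To illustrate the kind of parsing mistake described above, here is a minimal Python sketch. It is not taken from the Llamafile source: the two regular expressions and the offload_arch_flag helper are hypothetical names invented for this example. The only real-world detail it relies on is that the HIP compiler takes its target as a flag of the form --offload-arch=gfx90a. A digits-only pattern silently drops the trailing "a", while a pattern that also accepts letters keeps the full identifier.

```python
import re

# Hypothetical patterns for illustration only; not Llamafile's actual code.
NUMERIC_ONLY = re.compile(r"gfx(\d+)")        # stops at the first non-digit
ALPHANUMERIC = re.compile(r"gfx([0-9a-z]+)")  # also accepts trailing letters


def offload_arch_flag(device_name: str, pattern: re.Pattern) -> str:
    """Build a --offload-arch flag from a device name such as 'gfx90a'."""
    match = pattern.match(device_name.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized GFX identifier: {device_name!r}")
    return f"--offload-arch=gfx{match.group(1)}"


# The digits-only pattern truncates the identifier of an Instinct MI250.
buggy = offload_arch_flag("gfx90a", NUMERIC_ONLY)
# Accepting letters as well preserves the full identifier.
fixed = offload_arch_flag("gfx90a", ALPHANUMERIC)

print(buggy)
print(fixed)
```

In this sketch a purely numeric identifier such as gfx1100 comes out the same under both patterns, which matches the article's description of the bug affecting only parts with letters in their GFX identifier.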

Additionally, Llamafile 0.8.1 now ships pre-built NVIDIA and AMD ROCm modules for both Windows and Linux users, further easing the deployment of Llamafile single-file LLMs that support both CPU and GPU execution.

Llamafile 0.8.1 also adds support for the Phi-3 Mini 4k model, fixes a bug causing GPU module crashes, gives Command-R Plus support proper 64-bit indexing, and includes other fixes.

Downloads and more details on the new Llamafile 0.8.1 release are available via [2]Mozilla-Ocho on GitHub.



[1] https://www.phoronix.com/news/Llamafile-0.8-LLaMA3

[2] https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.8.1


