News: 1713228246

Los Alamos Lab powers up Nvidia-laden Venado supercomputer

(2024/04/16)


Los Alamos National Laboratory (LANL) has flipped the switch on its Venado supercomputer – a machine capable of bringing ten exaFLOPS of performance to bear on AI workloads for the Department of Energy.

Announced at the ISC high performance computing conference in 2022, Venado is [1]among the first supercomputers to be built using Nvidia's Superchip architecture. But before you get too excited about the claimed performance, remember that the exaFLOP figure applies only to AI workloads.

As powerful as [2]Venado is, Nvidia hasn't dethroned AMD's 1.1 exaFLOP [3]Frontier system – in fact, it's not even close. Floating point performance has long been the benchmark for supercomputers as seen over the past 30 years of Top500 High Performance Linpack (HPL) runs. But, with the rise of systems tailored to lower precisions and AI workloads, the meaning of the metric has become somewhat muddied.

Instead of the double precision performance listed on the Top500 ranking, the peak floating point performance rating of many systems designed to run AI workloads is often given at half (FP16) – or even quarter (FP8) – precision.

Venado was rated using FP8.

That lofty ten exaFLOP figure was therefore achieved under conditions that trade accuracy for higher throughput and lower memory and bandwidth demands. That's perfect for running large language models (LLMs) and other machine learning tasks, but maybe not the best option if you're trying to simulate the criticality of a [7]plutonium warhead.
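To get a feel for that tradeoff, you can round a double precision value through a narrower format. The sketch below uses Python's struct module, which can pack IEEE half precision (FP16); FP8 formats carry even fewer bits, so the error shown here understates what quarter precision gives up. This is an illustration of precision loss in general, not a model of Venado's hardware:

```python
import struct

def round_trip_fp16(x: float) -> float:
    """Pack a Python float (FP64) into IEEE half precision and back,
    mimicking the rounding a low-precision AI format applies."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

pi = 3.141592653589793             # what FP64 stores
pi_fp16 = round_trip_fp16(pi)      # 3.140625 after FP16 rounding
error = abs(pi - pi_fp16)          # ~1e-3: fine for ML weights, not
print(pi_fp16, error)              # for nuclear criticality sims
```

Adjacent FP16 values near pi are about 0.002 apart, so every number in that range lands roughly 1e-3 off; FP64 keeps around 16 significant decimal digits instead.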

Although Venado can't hold a candle to Frontier in FP64 workloads, it's no slouch. Thanks to the presence of Nvidia's H100 GPUs providing the bulk of the system's power, the machine should be able to churn out about 171 petaFLOPS of peak double precision performance – enough to just barely beat out the number 10 ranked system on November's Top500 ranking. Though we'll note that actual HPL performance usually comes in a fair bit lower than peak.
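That 171 figure falls out of simple arithmetic: multiply the GPU count by a per-chip peak. The per-H100 rate below is our assumption, based on Nvidia's published FP64 tensor core figure for the SXM part – a back-of-envelope sketch, not LANL's own accounting:

```python
# Back-of-envelope peak FP64 estimate for Venado (assumed figures)
gh200_modules = 2_560          # GH200 Superchips in the system
fp64_tflops_per_h100 = 67      # assumed peak FP64 tensor rate per H100

peak_pflops = gh200_modules * fp64_tflops_per_h100 / 1_000
print(f"~{peak_pflops:.1f} petaFLOPS peak")  # ~171.5 petaFLOPS
```

The Grace-Grace modules would add a little more on top, but the GPUs dominate the total.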

"With its ability to incorporate artificial intelligence approaches, we are looking forward to seeing how the Venado system at Los Alamos can deliver new and meaningful results for areas of interest," David Turk, deputy secretary for the Department of Energy, wrote in a [9]statement .

So far LANL says the system, which was delivered last month, has already shown promise running material science and astrophysics simulations – suggesting the machine will do its fair share of HPC simulation alongside lower precision AI workloads.

Housed at LANL's Nicholas C Metropolis Center for Modeling and Simulation, Venado is a relatively compact system built in collaboration with Nvidia and HPE Cray, using the latter's EX platform and Slingshot 11 interconnects.

The all-liquid-cooled system comprises 3,480 Nvidia Superchips – 2,560 GH200 modules and 920 Grace-Grace CPU modules.

As we've [15]discussed in the past, the GH200 is essentially a system-on-module aimed at HPC and AI workloads. It pairs a 72-core Grace CPU, based on Arm's high-end Neoverse V2 cores, and 480GB of LPDDR5x memory with an H100 GPU carrying 96GB or 144GB of memory, the two linked by a 900GB/sec NVLink-C2C interconnect.

Nvidia's Grace CPU Superchips swap the GPU for a second Grace CPU, for a total of 144 cores linked by the same NVLink-C2C interconnect. Those cores are fed by up to 960GB of LPDDR5x memory capable of delivering upwards of 1TB/sec of bandwidth.

According to LANL these Grace CPU Superchips should boost performance for a wide range of HPC applications, especially those that aren't optimized or well suited to GPU accelerators.

While you might assume an Arm-based system means HPC wonks need to re-skill in a hurry – as our sibling site The Next Platform has previously [16]discussed – the supercomputing community has been working with Arm systems for a while now, dating back to Cavium's ThunderX and Fujitsu's A64FX platforms.

Venado won't even be the largest Grace-Hopper system we see this year. The UK government's Isambard-AI will be [17]powered by 5,448 Nvidia GH200s, while the GPU partition of EuroHPC's Jupiter system will [18]pack close to 24,000 Grace-Hopper Superchips. ®




[1] https://www.theregister.com/2022/05/30/los_alamos_national_laboratory_nvidia/

[2] https://www.nextplatform.com/2024/04/15/los-alamos-pushes-the-memory-wall-with-venado-supercomputer/

[3] https://www.theregister.com/2022/11/14/frontier_top500_aurora/

[7] https://www.theregister.com/2023/09/02/los_alamos_supercomputer/

[9] https://discover.lanl.gov/news/0415-venado/

[15] https://www.theregister.com/2023/08/09/nvidia_gracehopper_hbm3e/

[16] https://www.nextplatform.com/2023/11/28/hpc-pioneers-pave-the-way-for-a-flood-of-arm-supercomputers/

[17] https://www.theregister.com/2023/11/01/uk_isambard_supercomputer/

[18] https://www.nextplatform.com/2023/10/05/details-emerge-on-europes-first-exascale-supercomputer/



