
Nvidia’s AI suite may get a whole lot pricier, thanks to Jensen’s GPU math mistake

(2025/04/01)


Comment At its GPU Technology Conference last month, Nvidia broke with convention by shifting its definition of what counts as a GPU.

"One of the things I made a mistake on: Blackwell is really two GPUs in one Blackwell chip," CEO Jensen Huang explained on stage [1]at GTC . "We called that one chip a GPU and that was wrong. The reason for that is it screws up all the NVLink nomenclature."

However, Nvidia's shift to counting GPU dies, rather than SXM modules, as individual GPUs doesn't just simplify NVLink model numbers and naming conventions. It could also double the number of AI Enterprise licenses Nvidia can charge for.


Nvidia's AI Enterprise suite, which covers a host of AI frameworks including access to its inference microservices ([3]NIMs), would run you $4,500 a year or $1 an hour in the cloud, per GPU. This meant an Nvidia HGX B200 with eight modules (one Blackwell GPU per module) would cost $36,000 a year or $8 per hour in the cloud.


But with the new HGX [6]B300 NVL16, Nvidia is now counting each die as a GPU. And since the system also has eight modules, each with two dies, that brings the total to 16 GPUs. That means, assuming no changes to Nvidia's AI Enterprise subscription pricing, its latest HGX boxes will set you back twice as much.
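The licensing arithmetic can be sketched in a few lines. To be clear, the B300 figures assume Nvidia keeps its published per-GPU rate, which it has not confirmed; the function name is ours.

```python
# Sketch of the AI Enterprise licensing math described above.
# Rates are Nvidia's published per-GPU prices; the B300 numbers assume
# the per-GPU rate is unchanged, which Nvidia has not confirmed.

ANNUAL_PER_GPU = 4_500   # USD per GPU per year
HOURLY_PER_GPU = 1       # USD per GPU per hour in the cloud

def license_cost(modules: int, dies_per_module: int, count_dies: bool):
    """Return (annual, hourly) AI Enterprise cost for one HGX system."""
    gpus = modules * (dies_per_module if count_dies else 1)
    return gpus * ANNUAL_PER_GPU, gpus * HOURLY_PER_GPU

# HGX B200: eight modules, each dual-die package counted as one GPU
print(license_cost(8, 2, count_dies=False))  # (36000, 8)

# HGX B300 NVL16: same eight modules, but each die now counts as a GPU
print(license_cost(8, 2, count_dies=True))   # (72000, 16)
```

The same die-counting convention is how a 144-module Vera Rubin Ultra rack, with four dies per module, gets billed as 576 GPUs.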

The change in naming convention is a departure from last year's [7]Blackwell systems . In our Blackwell launch coverage, Nvidia took issue with us calling Blackwell a "chiplet" architecture – multiple separate dies or chiplets linked within one processor package – arguing that it's actually a "two-reticle limited die architecture that acts as a unified, single GPU."


It's not as though the latest B300 GPUs are that much more powerful than last year's B200, either. As a quick refresher, the HGX B300 offers about 1.5x the memory capacity, at 2.3TB versus 1.5TB on the B200, while 4-bit floating point (FP4) performance is up roughly 50 percent to just over 105 dense petaFLOPS per system. That jump only benefits workloads that can take advantage of FP4, however; at higher precisions, the B300 offers no floating point advantage over the older system.

Confusingly, this change only applies to Nvidia's air-cooled B300 boxes and not the more powerful GB300 NVL72 systems, which continue to count the packages as GPUs.

So what gives? Well, according to Nvidia's VP and GM of Hyperscale and HPC, Ian Buck, there is a technical reason for this.


The main difference is that the B300 package offered on the HGX chassis lacks the chip-to-chip interconnect found on previous-gen Blackwell accelerators. This means the two chips really are two distinct 144GB GPUs sharing a common package. Buck explained this allowed Nvidia to achieve better power and thermals. This does come with some disadvantages. Because there's no C2C interconnect between the two, if one die wants to access the memory on the other, it has to go off-package, over the NVLink switch, and then take a U-turn.

The GB300, on the other hand, retains the C2C interface, avoiding the off-package memory detour. Because the two dies can directly communicate and share memory, they're treated as a single, unified GPU - at least as far as Nvidia's software and licensing are concerned.


This technical exception won't last long, however, with the launch of Nvidia's [14]Vera Rubin superchips, which will embrace the B300-style naming convention and start counting individual dies as GPUs, hence the NVL144 designation.

This is also how Nvidia's Vera Rubin Ultra platform, coming in late 2027, can [15]claim 576 GPUs per rack. As we previously explored, it's really just 144 modules — what prior to Blackwell Ultra we would have considered a GPU — with four dies per module.

If we had to guess, we'd wager that in the year since Nvidia unveiled Blackwell, the GPU giant realized it was leaving subscription software revenue on the table. We have to guess because, when we asked Nvidia how the naming change would impact AI Enterprise licensing, it told us pricing details hadn't been finalized yet.

"Pricing details are still being finalized for B300 and no details to share on Rubin beyond what was shown in the GTC keynote at this time," a spokesperson, who clarified that this also included AI Enterprise pricing, told El Reg . ®




[1] https://www.theregister.com/special_features/nvidia_gtc/


[3] https://developer.nvidia.com/nim


[6] https://www.theregister.com/2025/03/18/nvidia_blackwell_ultra/

[7] https://www.theregister.com/2024/03/18/nvidia_turns_up_the_ai/



[14] https://www.theregister.com/2025/03/19/nvidia_charts_course_for_600kw/

[15] https://www.nextplatform.com/2025/03/19/nvidia-draws-gpu-system-roadmap-out-to-2028/




Milk the monopoly while it lasts

Anonymous Coward

Yet another example of a monopolist fleecing the market while the monopoly and the Free VC Money Bubble last.

Most countries, the US included, have a considerable backlog in grid infrastructure. Even if you can produce the power to feed your data center, there are no cables to get it there. And all the power you feed into a data center has to be removed again as heat, via power- and water-hungry cooling. And the data centers are not where the water to cool them is found.

And then there is the matter of getting data cables with the required bandwidth in place. That's relatively minor compared to the power needed, but it compounds when everyone wants to be located at the same spot, like Northern Virginia.

And there is NVidia, churning out even more GPUs that need yet more power. Because what else can they do? They will undoubtedly sell everything they can churn out. And some (a lot?) will end up in places they should not, but that pay well. But it looks like more data centers are being planned than can actually be powered. Or connected.

Someone will find a way to do the computations with fewer watts per bit operation, and that entity will break the monopoly.

Re: Milk the monopoly while it lasts

Anonymous Coward

>>"Yet another example of a monopolist fleecing the market"

NVidia is not a monopoly. Everyone is free to buy similar cards from other manufacturers.

In a market economy, supply and demand set prices. At the other end of the spectrum, the government sets prices and production regardless of demand. I've seen the latter at work; my vote goes for the former.

>>"while the monopoly and the Free VC Money Bubble last"

Venture Capitalists take their risks, and if they lose money in the venture - I'm not shedding any tears for them.

>>"Someone will find a way to do the computations with less watt per bit-operation-transfer and that entity will break the monopoly"

Yes, Nvidia and all their competition are constantly looking at optimisation and process node miniaturisation and performance/watt is getting better all the time.

Re: Everyone is free to buy similar cards from other manufacturers.

Anonymous Coward

But no one makes similar cards.

Look up the definition of Monopoly:

A monopoly is a market where one business acts as the only supplier of a good or service.

A monopoly is a market structure with a single seller or producer that dominates an industry or sector.

A monopoly is a market situation where a single seller controls the supply of a good or service, with little or no competition.

A monopoly implies an exclusive possession of a market by a supplier of a product for which there is no substitute.

A monopoly exists when a specific person or enterprise is the only supplier of a particular good.

etc.

No one else produces compute gear like NVidia.

Re: Everyone is free to buy similar cards from other manufacturers.

DS999

But no one makes similar cards

Yes they do. No one makes cards that PERFORM AS WELL, but selling the best available alternative doesn't make you a monopoly. It means your competition isn't as good at making products as you are.

By your definition, the seller of the best mousetrap has a "monopoly", regardless of how many other, not-as-good mousetraps you can buy.

Re: Everyone is free to buy similar cards from other manufacturers.

Anonymous Coward

CUDA

Re: Everyone is free to buy similar cards from other manufacturers.

Anonymous Coward

>>"But no one makes similar cards."

But they do.

Instinct MI325X is AMD's halo product, with 256GB of HBM3E memory and 1.3 petaFLOPS of FP16 speed. It's in the same league (also in price).

https://www.theregister.com/2024/10/10/amd_mi325x_ai_gpu/

>>"Look up the definition of Monopoly:"

Gee, thanks AC. Nvidia isn't a monopoly. AMD Instinct products are just as available as Nvidia GPUs. The No.1 system on the TOP500 list uses those AMD Instinct units for GPGPU operations, not Nvidia.

Nvidia rules the market because they were awake when people found out the raw power of their gaming cards could be used for more than just gaming. Nvidia started catering to the audience by creating GPGPU cards and CUDA. That the competition was asleep at the wheel or that people are willing to pay their prices IS NOT Nvidia's fault at all.

So Nvidia is...

IGotOut

fucking over the AI operators. Oh dear, what a shame.

Re: So Nvidia is...

Mentat74

Oh dear, how sad, never mind...

Re: So Nvidia is...

rgjnk

They fuck over everyone.

Especially anyone who needs to buy their exorbitantly priced top end hardware and then gets another dry poke when they have to add the software licenses to actually use it.

And it's not like the consumer end is getting cheaper or more flexible with time...

Re: So Nvidia is...

IGotOut

AMD have fired the warning shots with their latest consumer cards, and even the latest Intel cards are looking pretty good for the mid-tier.

The Correct Definition

An_Old_Dog

... is extremely technical and complicated.

The short version is, "It's the definition which makes nVIDIA the most money."
