News: 0183189912


CUDA Proves Nvidia Is a Software Company (wired.com)

(Monday May 11, 2026 @11:30PM (BeauHD) from the Apple-like-moat dept.)


Nvidia's real AI moat isn't "a piece of hardware," writes Wired's Sheon Han. It's CUDA: a mature, deeply optimized software ecosystem that [1]keeps machine-learning workloads tied to Nvidia GPUs. An anonymous reader quotes a report from Wired:

> What sounds like a chemical compound banned by the FDA may be the one true moat in AI. CUDA technically stands for Compute Unified Device Architecture, but much like laser or scuba, no one bothers to expand the acronym; we just say "KOO-duh." So what is this all-important treasure good for? If forced to give a one-word answer: parallelization. Here's a simple example. Let's say we task a machine with filling out a 9x9 multiplication table. Using a computer with a single core, all 81 operations are executed dutifully one by one. But a GPU with nine cores can assign tasks so that each core takes a different column -- one from 1x1 to 1x9, another from 2x1 to 2x9, and so on -- for a ninefold speed gain. Modern GPUs can be even cleverer. For example, if programmed to recognize commutativity -- 7x9 = 9x7 -- they can avoid duplicate work, reducing 81 operations to 45, nearly halving the workload. When a single training run costs a hundred million dollars, every optimization counts.


> Nvidia's GPUs were originally built to render graphics for video games. In the early 2000s, a Stanford PhD student named Ian Buck, who first got into GPUs as a gamer, realized their architecture could be repurposed for general high-performance computing. He created a programming language called Brook, was hired by Nvidia, and, with John Nickolls, led the development of CUDA. If AI ushers in the age of a permanent white-collar underclass and autonomous weapons, just know that it would all be because someone somewhere playing Doom thought a demon's scrotum should jiggle at 60 frames per second. CUDA is not a programming language in itself but a "platform." I use that weasel word because, not unlike how The New York Times is a newspaper that's also a gaming company, CUDA has, over the years, become a nested bundle of software libraries for AI. Each function shaves nanoseconds off single mathematical operations -- added up, they make GPUs, in industry parlance, go brrr.


> A modern graphics card is not just a circuit board crammed with chips and memory and fans. It's an elaborate confection of cache hierarchies and specialized units called "tensor cores" and "streaming multiprocessors." In that sense, what chip companies sell is like a professional kitchen, and more cores are akin to more grilling stations. But even a kitchen with 30 grilling stations won't run any faster without a capable head chef deftly assigning tasks -- as CUDA does for GPU cores. To extend the metaphor, hand-tuned CUDA libraries optimized for one matrix operation are the equivalent of kitchen tools designed for a single job and nothing more -- a cherry pitter, a shrimp deveiner -- which are indulgences for home cooks but not if you have 10,000 shrimp guts to yank out. Which brings us back to DeepSeek. Its engineers went below this already deep layer of abstraction to work directly in PTX, a kind of assembly language for Nvidia GPUs. Let's say the task is peeling garlic. An unoptimized GPU would go: "Peel the skin with your fingernails." CUDA can instruct: "Smash the clove with the flat of a knife." PTX lets you dictate every sub-instruction: "Lift the blade 2.35 inches above the cutting board, make it parallel to the clove's equator, and strike downward with your palm at a force of 36.2 newtons."

"You can begin to see why CUDA is so valuable to Nvidia -- and so hard for anyone else to touch," writes Han. "Tuning GPU performance is a gnarly problem. You can't just conscript some tender-footed undergrad on Market Street, hand them a Claude Max plan, and expect them to hack GPU kernels. Writing at this level is a grindsome enterprise -- unless you're a cracker-jack programmer at DeepSeek..."

Han goes on to argue that rivals like AMD and Intel offer competitive specs on paper, but their software stacks have struggled with bugs, compatibility issues, and weak adoption. As a result, Nvidia has built an Apple-like moat around AI computing, leaving the industry dependent on its expensive hardware.



[1] https://www.wired.com/story/cuda-proves-nvidia-is-a-software-company/
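
The 9x9 multiplication-table example in the quoted excerpt can be sketched in a few lines. This is only an illustration, not CUDA code and not anything from the Wired piece: the names fill_column, fill_table_parallel, and unique_products are made up here, and a Python thread pool stands in for the nine hypothetical GPU cores (Python threads will not actually deliver a ninefold speedup; the point is how the work is divided).

```python
from concurrent.futures import ThreadPoolExecutor

N = 9  # size of the multiplication table in the quoted example


def fill_column(a):
    """One worker's share of the table: the products a*1 .. a*N."""
    return [a * b for b in range(1, N + 1)]


def fill_table_parallel():
    """Hand each of the N columns to its own worker (a stand-in for a GPU core)."""
    with ThreadPoolExecutor(max_workers=N) as pool:
        # pool.map preserves input order, so table[a-1][b-1] == a*b.
        return list(pool.map(fill_column, range(1, N + 1)))


def unique_products():
    """Use commutativity (a*b == b*a): compute only the pairs with a <= b."""
    return {(a, b): a * b for a in range(1, N + 1) for b in range(a, N + 1)}


table = fill_table_parallel()
pairs = unique_products()

# Total cells in the full table, then the number of products actually needed.
print(sum(len(col) for col in table), len(pairs))  # prints "81 45"
```

The 45 comes from counting the diagonal plus one triangle of the table: 9 * 10 / 2.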



AI could solve this eventually. (Score:4, Funny)

by Luckyo ( 1726890 )

AI could solve this by bypassing this moat to enable translation to openCL.

Considering just how good AI is at this sort of work once properly trained, I would be surprised if this doesn't happen. Though Nvidia will certainly fight anyone trying to do this to slow it down.

Re: (Score:1)

by TheMiddleRoad ( 1153113 )

Funniest post of the year!

Well "just" vibe code you a new API, then eh? (Score:4, Funny)

by MIPSPro ( 10156657 )

If it's so super-awesome and mind blowing, then just use the current crop of AI to design the next crop and create an open source API or at least something better. What? That's challenging you say? Bah! Nothing is too challenging for AI! Anthropic told me so!

Re: Well "just" vibe code you a new API, then eh? (Score:2)

by alcmena ( 312085 )

I made the same comment about Google's SDK. If AI is so awesome, why not just write a single SDK in a single language, and AI build the others on each push? Then devs can use their preferred language and it already has full first party support. Seems so simple... and the fact that it isn't being done screams pretty loudly.

Re: (Score:1)

by Orly0101 ( 6546268 )

I know everyone makes the same joke regarding AI not being able to do it. And whoever works at the edge of any field knows that the sort of large projects that need to be implemented cannot simply be handed to a swarm of agents to code; some things are, and will stay, out of reach for a while, maybe forever.

But, could any generous soul explain some specifics about the sort of things that make it that hard? What are the challenges there? Is it the amount of code required to be translated? The secrecy of the

Re: (Score:2)

by Pseudonymous Powers ( 4097097 )

Back in the old days, when Nvidia was just a company that built graphics cards for carding graphics with, I always wanted them to admit to themselves that they were a hardware company, who sold hardware, which is where their money comes from, so they wouldn't feel the need to sue people trying to write drivers for their cards. So this reinforcement of their self-identification as a software company isn't particularly welcome to me. But c'est la something or other, as the French say.

Re: (Score:2)

by Frank Burly ( 4247955 )

I recall that even then their software was supposed to provide an edge over AMD's allegedly superior hardware (with shit drivers). Useful AI should eventually negate the CUDA advantage, but it will take more than AI to make 1.6 nm chips.

Re: (Score:2)

by bill_mcgonigle ( 4333 ) *

When CUDA started taking off we had ATI hardware, to support their open source pledge, and looked into ROCm.

Just getting the drivers to build on EL-anything was an extreme effort, and it wasn't my first rodeo.

Without betraying confidences, I was told second-hand that there were only ten people on the GPU driver team across all platforms and that they were doing their best and not sleeping enough as it was, with Compute way behind gaming bugs on the priority list.

I couldn't independently verify of course but

Re: (Score:1)

by TheMiddleRoad ( 1153113 )

In the late 90s, people already talked about the great advantage NVIDIA had in software. In the late 90s, NVIDIA already talked about making their own computers, not just GPUs.

Not paying attention? (Score:2)

by Pinky's Brain ( 1158667 )

NVIDIA bought Groq, which is never going to run CUDA well, and Anthropic ported to Google's TPU.

Unless the software stack is a complete disaster, OpenAI and Anthropic make do; architecture rules. NVIDIA leads in architecture: there is no competition for NVLINK and C2C in deployment, for instance, and Groq in its niche only competes with Cerebras.

Small players are more dependent on open source and more easily manipulated by lazy devs, but the biggest spenders don't give a shit about CUDA.

Re: (Score:1)

by TheMiddleRoad ( 1153113 )

So you're saying that OpenAI, Anthropic, Facebook, Google, and Grok are not running a lot on NVIDIA hardware? Tell me more about your fantasies.

Re: (Score:1)

by TheMiddleRoad ( 1153113 )

And yes, I know the difference between Grok and Groq.

Re: (Score:2)

by Pinky's Brain ( 1158667 )

"NVIDIA leads in architecture"

Re: (Score:1)

by TheMiddleRoad ( 1153113 )

"the biggest spenders don't give a shit about CUDA" The biggest spenders use CUDA on NVIDIA hardware.

Re: (Score:2)

by Pinky's Brain ( 1158667 )

Yes.

Not really a new idea - hardware-wise anyway. (Score:2)

by fahrbot-bot ( 874524 )

> But a GPU with nine cores ...

Or any number of now-obsolete, general-purpose vector-processor systems, like the Cray 2, or even parallel systems like the Myrias Parallel System - both of which I was an SA on *way* back. Parallel operations can speed up certain types of workflow.

[1]Vector supercomputers [wikipedia.org]

[2]Vector processor [wikipedia.org]

[1] https://en.wikipedia.org/wiki/Category:Vector_supercomputers

[2] https://en.wikipedia.org/wiki/Vector_processor

Cooperate or Die (Score:2)

by Tablizer ( 95088 )

> rivals like AMD and Intel offer competitive specs on paper, but their software stacks have struggled with bugs, compatibility issues, and weak adoption. As a result, Nvidia has built an Apple-like moat around AI computing, leaving the industry dependent on its expensive hardware.

Nvidia's competitors need to work together to improve open-source software tooling and to standardize hardware interfaces, or else go the way of Commodore and Tandy.

Subtle Roasting (Score:2)

by Himmy32 ( 650060 )

> added up, they make GPUs, in industry parlance, go brrr.

The subtle roast of implying that the industry is a bunch of Gen Alpha middle schoolers. Are the graphics also skibidi?

Seems like a smart CEO (Score:2)

by ndsurvivor ( 891239 )

I guess the CEO of NVidia played the long game on AI. They were nothing back in 2012 when they were just a cheap graphics acceleration chip company, and now they have surpassed Microsoft in market capitalization. They don't seem "evil" to me; it seems like a thoughtful company that worked hard, took a long view, and reaped the rewards. I simply hope that they don't get the billionaire bug and become evil.

Agreed (Score:2)

by JBMcB ( 73720 )

They spent a lot of time and money making sure CUDA worked right. For a while AMD's compute API wasn't backwards *or* forwards compatible. You had to do some rewriting and a recompile every time a new API was released.

Intel has gone through three completely different, and mostly incompatible, hardware stacks. Remember Phi? Altera? Now it's AVX for some compute tasks, and Xe for other tasks.

Do the math, cowards! (Score:1)

by TheMiddleRoad ( 1153113 )

Octomom made 8 babies in 7 months. (8*9)/(7/9) = A production rate of ~92 babies for 9 women over 9 months.

NVIDIA and ASUS Partnership (Score:2)

by IdanceNmyCar ( 7335658 )

I always hate how people often take success in isolation. A lot of the success of NVIDIA I think comes from its original strong partnership with ASUS which is a hardware manufacturing company. NVIDIA originally did the chip design and at that level it's kind of hard to ignore the software, especially on the driver front. This means they always had a "low-level" team understanding software issues. Then when it came to really building out a commodity GPU, they worked with ASUS.

For years, I have been a huge fa
