Apple Working To Cram Massive Gemini Model Into iPhone To Power New Siri (arstechnica.com)

(Saturday May 30, 2026 @04:00AM (BeauHD) from the privacy-vs-convenience dept.)

Apple is reportedly [1]working to shrink Google's Gemini models enough to power parts of a long-delayed AI-enhanced Siri on iPhones. But despite Apple's best efforts to run the AI locally, "the iPhone's Gemini makeover will lean heavily on Google and Nvidia in the cloud," reports Ars Technica. That could complicate Apple's privacy-first AI messaging, especially if more complex Siri requests are routed through Google infrastructure and Nvidia's encrypted cloud-computing platform. Ars Technica reports:

> After [2]inking the Google deal, Apple apparently got to work distilling Google's giant cloud-based Gemini models. Distillation is a process in which a small, less resource-intensive model learns to mimic a large, expensive one. With enough time, this can reliably transfer useful capabilities while pruning less important weights from the model. That may enable Siri to handle some tasks with private local compute, but a cloud component looks inevitable.

> Processing users' AI data in the cloud could be a problem for Apple. At WWDC, the company will probably promote its years of experience designing chips and how well that positions it for AI. However, The Information [3]claims that Apple has struggled to even get Google's massive undistilled Gemini models running on its custom Private Cloud Compute infrastructure, which is built on M-series Mac chips.

> When the smarter Siri rolls out, it will probably route more complex tasks to Google's cloud infrastructure instead of Apple's, but it won't be running on Google TPUs. Apple has reportedly signed a deal with Nvidia to use its Confidential Computing platform for this purpose. Confidential Computing keeps data encrypted on Nvidia GPUs while it's being processed in the cloud, which could help Apple claim it's still sensitive to user privacy concerns. It might even retain its own Private Cloud Compute branding for the system.

> The iPhone probably won't tell you which version of Gemini is handling individual Siri requests. Device makers designing hybrid systems that rely on local and cloud-based AI like to talk about making the experience feel "seamless." There might be clues, though.

[1] https://arstechnica.com/ai/2026/05/apple-reportedly-trying-to-distill-googles-multi-trillion-parameter-gemini-ai-to-run-on-iphone/

[2] https://apple.slashdot.org/story/26/01/12/166200/apple-partners-with-google-on-siri-upgrade-declares-gemini-most-capable-foundation

[3] https://www.theinformation.com/articles/apple-renew-push-ai-runs-devices-instead-cloud

In 2028? (Score:2)

by ChunderDownunder ( 709234 )

That's when RAM shortages are supposed to subside.

If you're currently selling netbooks with only 8 Gig, how much RAM will a Gemini iPhone realistically require?

Re: (Score:2)

by martin-boundary ( 547041 )

Depends. Current leading LLM technologies are super bloated and inefficient. When 2028 arrives, the resource requirements for a natural language user interface bot may be a lot less. The software-side requirements can certainly be finalized a few months before launch. On the other hand, hardware designs have a long lead time and must be locked in much sooner. Then again, data can be shipped to a beefy server and processed in the cloud, so the hardware doesn't really need a lot of resources.

Re: (Score:2)

by thegarbz ( 1787294 )

> how much RAM will a Gemini iPhone realistically require

Depends on what you want to do with it. If you want to generate realistic images of whatever you want locally, while answering every question in the universe, I suggest you get a phone with at least 96GB of RAM. If, on the other hand, you are running small local models that do specific tasks and offload the rest to an internet search, you can run that AI model on an iPhone 3GS if so desired.

Not every AI is the same.

Battery empty ... (Score:1)

by angel'o'sphere ( 80593 )

... phone dead.

Siri, I need help!

"Nvidia's encrypted" what? (Score:1)

by Kartu ( 1490911 )

Since when does a hardware manufacturer own the servers? Or which particular Nvidia model is Apple interested in running on? Or why would Google need anything but its own TPU chips, which nowadays can do training, let alone inference? I thought "get paid for mentioning Nvidia" was a conspiracy theory, but here we go again...

Don't want an AI iPhone...... (Score:2)

by bsdetector101 ( 6345122 )

Plus I don't use Siri much. It's bad enough now when you do a Google search and it has a small disclaimer that results may not be accurate!!!!
