How Amazon red-teamed Alexa+ to keep your kids from ordering 50 pizzas
- Reference: 1746119348
- News link: https://www.theregister.co.uk/2025/05/01/amazon_red_teamed_alexaplus_interview/
- Source link:
This is why the e-commerce giant brought in security engineers, including both red teams and penetration testers, to work alongside product developers from the beginning, according to Amazon CISO Amy Herzog.
Their job was to anticipate what could go wrong and ensure safety and security guardrails were in place to prevent Alexa+ from jumping the track.
"It's funny how, having been in both seats, the product engineer thinks about making the intended thing work, and the security engineer thinks about all the ways that you can game that system," Herzog told The Register on the outskirts of the [2]RSA Conference in San Francisco this week.
"Whenever you're talking about a system that can take actions on behalf of someone our immediate [reaction is]: Wouldn't it be good if, like me, as someone who's running this household could just say, This is what I need to go shopping for. These are the dinner reservations I need to make. Go make that happen. Schedule a delivery window," she said.
"And then my kid comes into the kitchen and says, and also 50 pepperoni pizzas for me and my friends."
While developers tend to focus on the product's potential (this is what they could build, and here are all the amazing new things it could do), it's the red team's job to burst that bubble, or at least to point out what could go wrong, so that systems are isolated and security mechanisms are in place to prevent unintended or malicious consequences.
"So having both of those in the same design meeting is really beneficial, because then even the product engineer is like, 'Oh yeah, you're right. I would totally order pizzas.' How do we make sure the system can handle that kind of thing?"
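To make that question concrete, here is a minimal sketch of one guardrail a design meeting like this might land on: ask for extra confirmation when an order is unusually large or does not come from the account owner. Everything in it is an assumption for illustration, including the `OrderRequest` type, the `decide` function, and both thresholds. Amazon has not described how Alexa+ actually handles this.

```python
from dataclasses import dataclass


@dataclass
class OrderRequest:
    """Hypothetical shape of a purchase request reaching the assistant."""
    item: str
    quantity: int
    unit_price: float
    speaker_is_account_owner: bool


# Illustrative limits only; these are not Amazon's real thresholds.
MAX_QUANTITY_WITHOUT_CONFIRMATION = 5
MAX_TOTAL_WITHOUT_CONFIRMATION = 75.00


def decide(order: OrderRequest) -> str:
    """Return "allow" or "confirm" for a purchase request."""
    total = order.quantity * order.unit_price
    # Anyone other than the account owner always needs a second check.
    if not order.speaker_is_account_owner:
        return "confirm"
    # Even the owner is asked to confirm unusually large orders.
    if (order.quantity > MAX_QUANTITY_WITHOUT_CONFIRMATION
            or total > MAX_TOTAL_WITHOUT_CONFIRMATION):
        return "confirm"
    return "allow"


if __name__ == "__main__":
    kid = OrderRequest("pepperoni pizza", 50, 12.00, speaker_is_account_owner=False)
    parent = OrderRequest("pepperoni pizza", 2, 12.00, speaker_is_account_owner=True)
    print(decide(kid), decide(parent))
```

Returning "confirm" rather than "deny" is a deliberate choice in this sketch: a household that really is throwing a party can still place the big order, it just takes one more step.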
Herzog is one of four chief information security officers (CISOs) across the e-commerce giant, and she's responsible for infosec across ads and devices — so Alexa and the next-gen Alexa+ fall under her purview.
The personal assistant, available only to [6]Amazon-approved early testers at this point, is built on top of Amazon's LLMs, and the company claims it can orchestrate actions across tens of thousands of services and devices to do things like control smart-home products, order groceries, play music, and make dinner reservations while remembering friends and family members' dietary restrictions and restaurant preferences.
Amazon says it can also interact with third-party [7]AI agents on behalf of users. In an example used by Herzog, this means that if your oven breaks, Alexa+ can go online, use a system like Thumbtack to find a repairperson in your area, schedule the repair, and then tell you when it's fixed.
"And this kind of a product has, as you might imagine, a number of unique security considerations," Herzog said. "All the same attacks are still there. In that sense, not much has changed. But since these things are non-deterministic, you have to really build in ways to understand its behavior in a different way than if you're working with a deterministic system."
In this case, non-deterministic means you can give Alexa+ (or any other AI assistant) the exact same input, or query, and it may return a slightly different output each time.
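A toy sketch can show why that matters for testing. The "model" below is only a hypothetical table of candidate replies with made-up probabilities (`CANDIDATES`, `reply`, and `distinct_replies` are invented names, and nothing here reflects how Alexa+ works). Sampling from the table gives varying answers to the same prompt, while always taking the most likely reply does not.

```python
import random

# Toy stand-in for a language model: candidate replies and their probabilities.
# The replies and weights are invented for illustration.
CANDIDATES = {
    "Ordering one pepperoni pizza.": 0.6,
    "Sure, one pepperoni pizza coming up.": 0.3,
    "Okay, I'll order a pepperoni pizza.": 0.1,
}


def reply(prompt: str, rng: random.Random, greedy: bool = False) -> str:
    """Return one reply. The toy ignores the prompt text; a real model would not."""
    replies = list(CANDIDATES)
    weights = list(CANDIDATES.values())
    if greedy:
        # Deterministic: always pick the single most likely reply.
        return replies[weights.index(max(weights))]
    # Non-deterministic: draw a reply in proportion to its probability.
    return rng.choices(replies, weights=weights, k=1)[0]


def distinct_replies(prompt: str, runs: int, greedy: bool, seed: int = 0) -> set:
    """Ask the same question `runs` times and collect the distinct answers."""
    rng = random.Random(seed)
    return {reply(prompt, rng, greedy) for _ in range(runs)}


if __name__ == "__main__":
    print(len(distinct_replies("order a pizza", 200, greedy=True)))
    print(len(distinct_replies("order a pizza", 200, greedy=False)))
```

The practical consequence is that a single passing test run says little about a sampled system, so checks tend to be repeated many times and written against properties of the output rather than exact strings.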
That non-determinism can lead to [13]prompt injection, where mischief-makers or attackers craft malicious inputs to trick the LLM into overriding its safety guardrails and doing things it is not supposed to do, or even just combine a series of inputs in a way that causes the LLM to behave in unintended ways.
Plus, Alexa+ talks to a lot of other apps and services, and that requires a ton of interaction with APIs to send and retrieve data, execute commands, and perform other actions to do the things that people want an AI assistant to do.
"We did a lot of testing on pathways between those APIs," Herzog said, adding that the API pathways to turn your house lights on and off are different from those routes required to text your kid or make a dinner reservation. "Which ones should be grouped with which other ones? How do we expect different actions to be taken together?"
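One way to picture the grouping question Herzog raises is a policy table that assigns each API action to a group and lists which groups may appear together in a single request. The sketch below is purely hypothetical: the action names, the groups, the allowed combinations, and the `plan_is_allowed` function are all assumptions made for illustration, not a description of Amazon's design.

```python
# Hypothetical mapping from API actions to groups.
ACTION_GROUPS = {
    "lights.on": "smart_home",
    "lights.off": "smart_home",
    "message.send_text": "communication",
    "restaurant.reserve": "commerce",
    "grocery.order": "commerce",
}

# Hypothetical policy: which sets of groups may be combined in one request.
ALLOWED_GROUP_COMBINATIONS = {
    frozenset({"smart_home"}),
    frozenset({"communication"}),
    frozenset({"commerce"}),
    # e.g. book a table, then text the family about it.
    frozenset({"commerce", "communication"}),
}


def plan_is_allowed(actions: list[str]) -> bool:
    """Check that every action is known and the groups it spans may be mixed."""
    groups = set()
    for action in actions:
        group = ACTION_GROUPS.get(action)
        if group is None:
            return False  # unknown actions are rejected by default
        groups.add(group)
    # An empty plan spans no groups and is therefore not in the allowed set.
    return frozenset(groups) in ALLOWED_GROUP_COMBINATIONS


if __name__ == "__main__":
    print(plan_is_allowed(["restaurant.reserve", "message.send_text"]))
    print(plan_is_allowed(["lights.off", "grocery.order"]))
```

A default-deny table like this is easy to test exhaustively, which matters when the component deciding which actions to chain is itself non-deterministic.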
This approach of bringing the offensive security teams together with product developers from the start of the process is "somewhat unusual," Herzog noted. "Usually, my red teamers like a system to be done before we let them loose on it." ®
[2] https://www.theregister.com/special_features/spotlight_on_rsac/
[6] https://www.amazon.com/dp/B0DCCNHWV5?ref=ods_surl_xaa_us
[7] https://www.theregister.com/2025/04/23/agentic_ai_rsac/
[9] https://www.theregister.com/2025/03/17/amazon_kills_on_device_alexa/
[10] https://www.theregister.com/2024/09/04/error_amazon_alexa_k/
[11] https://www.theregister.com/2024/04/11/genai_amazon_internet/
[12] https://www.theregister.com/2025/04/02/speech_now_streaming_from_brains/
[13] https://www.theregister.com/2024/08/13/who_uses_llm_prompt_injection/
[14] https://whitepapers.theregister.com/
A 100% way
A 100% way to keep the kids from ordering pizza: we don't have an Alexa. Never shall.
Re: A 100% way
Elegant solutions are always elegant.
When the 100% O2 atmosphere caught fire on Apollo 1, killing three astronauts, NASA termed it a failure of imagination.
No Red Team can conjure up what a three-year-old thinks. Just because it can be automated does not mean it should be. (I'm looking at you, Tesla.)
However, some people think that bugging their own home is a good idea, so it's swings and roundabouts really. (Not really, it's a horrifically bad idea, but just trying to placate the woefully ignorant.)
Interesting
So I wonder how they bar ordering 50 pizzas but still let you order pizza... OR have a party. Yes, I've had parties where we ordered 30+ pizzas, so it's an actual possibility.
Right now, my credit union fraud detection shits itself when (for example) I change my Patreon subscriptions, or my salary changes because of tax/overtime/bonus/insurance/etc reasons. It's gotten really frequent and annoying.
Re: Interesting
It could say "hold on, if you want to order 50 pizzas I'm going to need more confirmation" and maybe you have to give it the credit card number used to pay for the Amazon subscription, or something else your kid (hopefully) doesn't have handy.
Sort of like when fraud detection kicks in on your credit card and you need to do something to authorize the charge it has put on hold. I would think it would be better to have overly active fraud detection than not enough - ESPECIALLY if you are talking about your actual bank account (which I assume you are since you mention a credit union) where fraudulent transactions will create 10x more potential issues than on a credit card.
Personally I don't use a debit card for ANYTHING, and avoid direct payment out of my bank account as much as possible.
Re: Interesting
My local bank deactivated my card for "suspicious activity." I'd tried to buy four movie tickets to see "Paddington", so understandable, right? I'd frequented that cinema for almost a decade and luckily I had cash on me. Had to call the bank the next day to get the lock lifted. Three weeks later, the same bank allowed a $2,500+ charge on my card for a SmartTV in a city three thousand miles away. No calls, emails, texts or warnings; until the wife saw the statement.
Yeah but...
What if someone tries to order [1]2 tons of creamed corn?
[1] https://xkcd.com/1807/
True transcript
Visited some friends who had an Alexa:
[Me] Alexa! Tell me a fart joke.
[Alexa] The fart skill is not enabled by default. Would you like to enable it?
[Me] YES! YES! YES!
[Alexa] [tells fart jokes]
After a few glasses of wine:
[Me] Alexa! What is a schlong?
[Alexa] Schlong is a river in Italy.
[Me] Err, I think she got that one wrong...
What am I missing here?
If you have a family and you have enough credit to be able to afford 50 pizzas (as an example), maybe you should put better security on your damn cards.
Just a thought.
Four sides
"the product engineer thinks about making the intended thing work, and the security engineer thinks about all the ways that you can game that system"
The CEO "how much can I ignore the security team so I can sell more shit and make number go up?"
The intelligent home user "Why the fuck would I want this spyware in my house?"
hate to praise Amazon
So I won't but
Go, red team!
Apply logic.
AI is so overhyped and doesn't really exist, as I'm sure this place knows.
So a company actually looking to make a home use of it more than just hype = I ain't bowing to any overlords just yet but I won't insult Amazon over this
Not when there's so many other things.
But yeah go red team.