AWS admits more bits of its cloud broke as it recovered from DynamoDB debacle

(2025/10/21)


Amazon Web Services has revealed that its efforts to recover from the massive mess at its US-EAST-1 region caused other services to fail.

The most recent update to the cloud giant’s [1]service health page opens by recounting how a [2]DNS mess meant services could not reach a DynamoDB API, which led to [3]widespread outages.

Today is when the Amazon brain drain finally sent AWS down the spout [4]READ MORE

AWS got that sorted at 02:24 AM PDT on October 20th.

But then things went pear-shaped in other ways.

“After resolving the DynamoDB DNS issue, services began recovering but we had a subsequent impairment in the internal subsystem of EC2 that is responsible for launching EC2 instances due to its dependency on DynamoDB,” the status page explains. Not being able to launch EC2 instances meant Amazon’s foundational rent-a-server offering was degraded, a significant issue because many users rely on the ability to automatically create servers as and when needed.
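For teams caught on the client side of an impairment like this, the practical defence is to treat instance launches as operations that can fail and to retry them with backoff rather than hammering the API. The sketch below is a minimal illustration in Python with boto3; the AMI ID and the set of error codes treated as retryable are assumptions for the example, not AWS guidance.

```python
# Minimal sketch (not AWS tooling): defensively launching an EC2 instance
# when the control plane is degraded, combining botocore's built-in retry
# modes with an outer backoff loop. AMI_ID and RETRYABLE are placeholders.
import random
import time

import boto3
from botocore.config import Config
from botocore.exceptions import ClientError

AMI_ID = "ami-0123456789abcdef0"  # placeholder, not a real image
RETRYABLE = {"RequestLimitExceeded", "InsufficientInstanceCapacity",
             "InternalError", "Unavailable"}

ec2 = boto3.client(
    "ec2",
    region_name="us-east-1",
    # 'adaptive' mode adds client-side rate limiting on top of retries
    config=Config(retries={"max_attempts": 10, "mode": "adaptive"}),
)

def launch_with_backoff(max_tries: int = 6) -> str:
    for attempt in range(max_tries):
        try:
            resp = ec2.run_instances(
                ImageId=AMI_ID, InstanceType="t3.micro",
                MinCount=1, MaxCount=1,
            )
            return resp["Instances"][0]["InstanceId"]
        except ClientError as err:
            code = err.response["Error"]["Code"]
            if code not in RETRYABLE or attempt == max_tries - 1:
                raise
            # Exponential backoff with jitter so retries from many clients
            # don't all arrive at once and pile onto a recovering backend.
            time.sleep(min(60, 2 ** attempt) + random.uniform(0, 1))
```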

While Amazonian engineers tried to get EC2 working properly again, “Network Load Balancer health checks also became impaired, resulting in network connectivity issues in multiple services such as Lambda, DynamoDB, and CloudWatch.”

[6]Vodafone keels over, cutting off millions of mobile and broadband customers

[7]Microsoft 364 trips over its own network settings in North America

[8]Kubernetes kicks down Azure Front Door

[9]EU’s cyber agency blames ransomware as Euro airport check-in chaos continues

AWS recovered Network Load Balancer health checks at 9:38 AM, but “temporarily throttled some operations such as EC2 instance launches, processing of SQS queues via Lambda Event Source Mappings, and asynchronous Lambda invocations.”

The cloud colossus said it throttled those services to help with its recovery efforts, which The Register takes to mean it chose not to allow every request for resources, as a flood of pent-up jobs would have overwhelmed systems still getting back on their feet.
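Throttling in this sense is a standard rate-limiting move: admit only as much work as the recovering backend can absorb and push the rest back on callers, so queued jobs drain gradually rather than all at once. A token bucket is the textbook way to express the idea; the sketch below is purely illustrative Python, not a claim about how AWS implements its internal limits.

```python
# Illustrative token-bucket throttle (not AWS's internal implementation):
# requests are admitted only while tokens remain, and tokens refill at a
# fixed rate, so a backlog of work drains gradually instead of in a burst.
import time

class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller is rejected and should retry later

# e.g. allow at most 5 launch requests per second, bursting to 10
limiter = TokenBucket(rate_per_sec=5, burst=10)
```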

“Over time we reduced throttling of operations and worked in parallel to resolve network connectivity issues until the services fully recovered,” the post states.

By 3:01 PM, all AWS services returned to normal operations, meaning problems persisted for over a dozen hours after resolution of the DynamoDB debacle.

AWS also warned that the incident is not completely over, as “Some services such as AWS Config, Redshift, and Connect continue to have a backlog of messages that they will finish processing over the next few hours.”

The post ends with a promise to “share a detailed AWS post-event summary.”

Grab some popcorn. Unless you have an internet-connected popcorn machine, which recent history tells us may be one of a horrifyingly large number of devices that stops working when major clouds go down. ®

[1] https://health.aws.amazon.com/health/status

[2] https://www.theregister.com/2025/10/20/amazon_aws_outage/

[3] https://www.theregister.com/2025/10/20/aws_outage_chaos/

[4] https://www.theregister.com/2025/10/20/aws_outage_amazon_brain_drain_corey_quinn/

[6] https://www.theregister.com/2025/10/13/vodafone_outage/

[7] https://www.theregister.com/2025/10/10/microsoft_365_na_outage/

[8] https://www.theregister.com/2025/10/09/kubernetes_azure_outage/

[9] https://www.theregister.com/2025/09/22/eus_cyber_agency_confirms_ransomware/


xanadu42

Another Wobbly Service...

And how unexpected is it that there would be a cascade of additional events related to the first?

What a great idea

ICL1900-G3

To put your business in somebody else's hands, to pay them a premium to manage something you could do yourself, control yourself...

Re: What a great idea

Will Godfrey

Well, It's all about community spirit, doncha no.

Instead of one company having a wobbly, everyone joins in to offer 'support'

Re: What a great idea

abend0c4

The counter argument, that you get a bunch of highly-skilled operations staff working 24/7, a service you could not individually afford, unsurprisingly doesn't seem to have got much of an airing over the last few hours.

The biggest problem, it seems to me, is not the outsourcing per se (there will be a significant body of customers for whom it makes economic sense in principle if the implementation is right), but the small number of providers. The potential international economic effect of such a huge chunk of infrastructure going out simultaneously is a risk that transcends any one particular customer's interests.

These vast enterprises must be broken down into smaller units. Not only for resilience, but to encourage competition, necessitate the development of standards that allow services and data to be migrated and to restore the balance of power between consumer and provider - indeed, between nation state and provider. This is a warning that requires an urgent response.

Caver_Dave

How on Earth did we allow HMRC, with so much personal data about almost everybody in the UK, to be hosted by what is commonly being described under the current leadership as a rouge state?

elsergiovolador

They probably can't see you behind a mountain of brown envelopes.

This needs an investigation. Sounds reckless at very least.

Anonymous Coward

Because every other supplier who answered the request for tender apart from AWS and Azure shat the bed. As one of those suppliers was Fujitsu with their cloud offering (since throttled in its cradle) you can guess how badly the bed linen was soiled. Some of the other suppliers weren't as good as Fujitsu.

Crypto Monad

> what is commonly being described under the current leadership as a rouge state?

I would say more orange than rouge.

TRT

“Over time we reduced throttling of operations and worked in parallel to resolve network connectivity issues until the services fully recovered. By 3:01 PM, all AWS services returned to normal operations, meaning problems persisted for over a dozen hours after resolution of the DynamoDB debacle."

Is that East Coast time?

"No one talks peace unless he's ready to back it up with war."
"He talks of peace if it is the only way to live."
-- Colonel Green and Surak of Vulcan, "The Savage Curtain",
stardate 5906.5.