News: 1763471985


Cloudflare coughs, half the internet catches a cold

(2025/11/18)


Updated Internet services provider Cloudflare is suffering a major outage that has knocked chunks of the web offline – including The Register.

The company acknowledged problems at 1148 UTC on November 18, [1]stating: "Some services may be intermittently impacted." After a long half-hour, it reckoned systems were returning to normal, but "customers may continue to observe higher-than-normal error rates" as engineers continue to investigate and fix the underlying issue.

Cloudflare provides security and infrastructure for a substantial chunk of websites. As such, X (formerly Twitter) and even El Reg were either knocked offline or malfunctioned as the outage continued. Even that stalwart of system uptime, Downdetector, reported "Please unblock [2]challenges.cloudflare.com to proceed" at one point.


Cloudflare has yet to confirm the cause of the outage – we will issue an update when it does – but it follows hot on the heels of problems at AWS and Azure, and is a reminder for enterprises that a service is only as good as the weakest link in the chain... and that weakest link might not reveal itself until it breaks.


The problem appears to be global, and the company was forced to do the equivalent of turning off and on its WARP access in London as engineers worked to deal with the glitch. WARP is similar to a VPN, except it routes traffic through Cloudflare's network. If the network is having a bad day, turning off WARP seems a sensible option.

At 1309 UTC, Cloudflare announced it had identified the root cause and a fix was being implemented. It did not, however, give an estimate for when sites would stop becoming available and then become unavailable again, seemingly at random.


A Cloudflare spokesperson told The Register: "We saw a spike in unusual traffic to one of Cloudflare's services beginning at 1120 UTC. That caused some traffic passing through Cloudflare's network to experience errors.

"We do not yet know the cause of the spike in unusual traffic. We are all hands on deck to make sure all traffic is served without errors. After that, we will turn our attention to investigating the cause of the unusual spike in traffic." ®

Updated to add at 1555 UTC, November 18

A Cloudflare spokesperson told The Register that the incident began at 1120 UTC and was fully resolved at 1430. They said: "The root cause of the outage was a configuration file that is automatically generated to manage threat traffic. The file grew beyond an expected size of entries and triggered a crash in the software system that handles traffic for a number of Cloudflare's services.

"To be clear, there is no evidence that this was the result of an attack or caused by malicious activity.

"We expect that some Cloudflare services will be briefly degraded as traffic naturally spikes post incident but we expect all services to return to normal in the next few hours.


"Given the importance of Cloudflare's services, any outage is unacceptable. We apologize to our customers and the internet in general for letting you down today. We will learn from today's incident and improve."
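The failure mode described in the statement, a generated file growing past the size its consumer expected, is commonly guarded against by validating the file before loading it and keeping the last known good copy. What follows is a minimal sketch of that general idea, not Cloudflare's code: the function name, the `LoadResult` type, and the `MAX_ENTRIES` figure are all hypothetical, and the statement gives no actual limit.

```python
from dataclasses import dataclass

# Hypothetical limit; the real figure is not given in the statement.
MAX_ENTRIES = 200


@dataclass
class LoadResult:
    entries: list[str]
    used_fallback: bool


def load_generated_config(new_entries: list[str], last_good: list[str],
                          max_entries: int = MAX_ENTRIES) -> LoadResult:
    """Accept the new file only if it fits the limit; otherwise keep the old one."""
    if len(new_entries) > max_entries:
        # Oversized input is rejected instead of being passed on to the consumer.
        return LoadResult(entries=list(last_good), used_fallback=True)
    return LoadResult(entries=list(new_entries), used_fallback=False)
```

The design choice here is to degrade to stale data rather than fail outright; whether stale threat rules are acceptable is a judgment that depends on the service.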




[1] https://www.cloudflarestatus.com/

[2] https://challenges.cloudflare.com/





Twitter unaccessible

MrMerrymaker

Always a silver lining

Re: Twitter unaccessible

Anonymous Coward

Can't lie, it's been great for my productivity today

Re: Twitter unaccessible

Anonymous Coward

But somebody on the internet is *still* wrong. You're going to have a busy night.

https://xkcd.com/386/

Let me guess ... DNS

JimmyPage

So no surprise there.

Re: Let me guess ... DNS

Antron Argaiv

We'll find out the true story in next week's "Who, Me?"

Re: Let me guess ... DNS

Anonymous Coward

Cloudflare DNS kept working all through this, so unlikely this time ... their proxy solutions though, not so much

Re: Let me guess ... DNS

Nick Stallman

I'm thinking configuration layer.

Half of my stuff kept running mostly fine, and the other half was totally cactus.

At least we know there'll be a good write up coming soon.

"Breaking Internet services provider Cloudflare"

seven of five

nice one...

Diogenes8080

In other news today, the temporary unavailability of Cloudflare Turnstile reduced phishing attacks by 70% for the duration of the outage.

How do you create a single point of failure for a chunk of the net?

Tron

Have one company gating access to a chunk of the net.

Re: How do you create a single point of failure for a chunk of the net?

Wibble

And DNS?

Re: How do you create a single point of failure for a chunk of the net?

Wizardling

Hell, I don't even use Cloudflare DNS, but the one I do use - ControlD - won't connect with my VPN this morning for any new connections.

So my phone is still online. But I can't get my PC connected after booting it up.

Re: How do you create a single point of failure for a chunk of the net?

Mage

Yes.

https://www.mobileread.com/forums/showthread.php?p=4549332#post4549332

Solar flares are a risk too if strong enough.

To be clear, there is no evidence that this was the result of an attack or caused by malicious activity.

Indeed the hypothesis in "No Silver Lining" is that it's two separate automatic updates rushed out on a Friday evening. Major Internet disasters are likely always stupidity / accident rather than malicious. Except when the next big Solar flare hits. That shouldn't affect the internet, but it will due to people thinking GPS and similar is good for timing: beancounters saving small amounts. GPS should only be used for navigation, because it's vulnerable to space weather and human attack.

Re: How do you create a single point of failure for a chunk of the net?

Alan Brown

GPS time is a good NTP source, but you have to ensure your configuration knows what to do when the source goes insane. The mechanisms are built in: there's an assumption that every clock source will break at some point, and that's why you always check multiple sources in your local server.
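The "check multiples" idea can be sketched as outvoting a misbehaving source. This is a much simplified stand-in for NTP's real source-selection logic, assuming each source reports an offset in seconds; the function name, the source names in the test data, and the 0.5 second tolerance are all made up for illustration.

```python
from statistics import median


def select_sane_offsets(offsets: dict[str, float],
                        tolerance: float = 0.5) -> dict[str, float]:
    """Keep the sources whose offset (seconds) lies within tolerance of the median."""
    if len(offsets) < 3:
        # With fewer than three sources there is no majority to outvote a bad one.
        raise ValueError("need at least three sources to outvote a bad one")
    mid = median(offsets.values())
    return {name: off for name, off in offsets.items()
            if abs(off - mid) <= tolerance}
```

With one GPS source that has jumped by an hour and three network sources that agree with each other, only the three agreeing sources survive the filter.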

Re: How do you create a single point of failure for a chunk of the net?

Alan Brown

People have been warning about this since the tier1s emerged in the 1990s

Rampant capitalism and profits trump everything though

Single point of failure

Dr Who

Routing traffic to your site via Cloudflare has always seemed odd to me. What's the fallback option? Is it easy to switch back to routing requests directly to your service when Cloudflare is glitching or unavailable? If it's as simple as changing an entry or two in your DNS zone then I suppose it's not too much of a problem. Busy sites though may not be able to support the load of doing that if they were using Cloudflare's content distribution.

Re: Single point of failure

Oblivium

Agreed.

Perhaps someone at El Reg can comment - this very site was unavailable for a good while.

Re: Single point of failure

cyberdemon

> Is it easy to switch back to routing requests directly to your service when Cloudflare is glitching or unavailable?

Well, El Reg was up and down synchronously with some other sites for an hour or two; but now it seems to be consistently up, while the other sites are still down. So maybe they did exactly that?

Re: Single point of failure

Anonymous Coward

This is the modern day version of "no-one gets fired for buying IBM". If half of the Internet is in the same boat no-one is in trouble. Having a fallback is expensive and unnecessary.

Re: Single point of failure

Doctor Syntax

If half of the Internet is in the same boat, half the internet is in trouble.

FTFY

Re: Single point of failure

Eye Know

Exactly. You can do the right thing, but it's gonna cost and require expensive skills or you can use as much effort as is affordable, just like everyone else.

Re: Single point of failure

Charlie Clark

Actually, CloudFlare and other CDNs are popular because they reduce the risk of single point of failure. 10 - 15 years ago DDoS launched by script kids were starting to cause so much trouble for sites, and even the data centres they were hosted at, that CDNs were a godsend. And things have only got worse since then. Given CloudFlare's track record and the fact that it's really just a thin proxy and not something that runs your applications, I think that's a reasonable risk/benefit approach.

Re: Single point of failure

Anonymous Coward

I remember the pre-CloudFlare days, DDoS was pretty much a fact of life for every website. I don't host myself, but I hear it's so bad now that if you're not under a big CDN you're basically guaranteed to get buried in DDoS attacks (and now probably AI data harvesters, too). Not so much CloudFlare's fault for existing as it is that there aren't really many other good options.

Re: Single point of failure

Alan Brown

I can think of a few good options - such as declaring DDoS to be a terrorist activity, then treating the culprits accordingly

I wonder how many people

JimmyPage

who have been blagging it as an "expert" will have been uncovered by this, with ChatGPT down for an hour ?

Who, me?

Dave559

Hopefully someone, suitably regomized, will be sending an email for the "Who, me?" column in the next few days or weeks about how they managed to thagomize half of the internet this time…

(Otherwise, they'll need to be keeping an eye out for freshly delivered rolls of carpet and poorly maintained lift doors for quite some time…)

Re: Who, me?

I ain't Spartacus

That's something I'd love to see in a corporate statement. Instead of the boring, "lessons have been learned" in an after hack / downtime / breach report the corporate-speak was changed to, "new procedures have been introduced to avoid future breaches of our system. Including supplies of rolled up carpets prominently on display, underneath signs saying Death to all employees who reveal their passwords!"

Re: Who, me?

Nick Stallman

Cloudflare has always been excellent for their write ups of their issues.

Check their blog in 24 hours time.

Re: Who, me?

KarMann

I wish I could give you a bonus upvote for use of 'thagomi[sz]e'.

Icon: The only dinosaur left amongst the icons.

SPOF

elsergiovolador

CEO: “Why’s the entire company offline?”

CTO: “Cloudflare sneezed.”

CEO: “But you said they were the gold standard.”

CTO: “They are. That’s why everyone uses them. That’s also why when they faceplant, the entire internet turns into Victorian London fog. It’s a feature.”

CEO: “Didn’t we have a risk session where you said something about avoiding single points of failure?”

CTO: “Yeah, yeah, but that’s for the little people running Raspberry Pis in their garage. Real enterprises consolidate all traffic through one giant benevolent megacorp, because… economies of scale. Or something. I skimmed the brochure.”

CEO: “But surely we architected a fallback?”

CTO: “Absolutely. If Cloudflare ever goes down, we… wait for Cloudflare to come back up. Solid plan. Industry standard.”

CEO: “So our customers can’t access anything?”

CTO: “Not unless they enjoy watching 522 errors in different fonts. On the bright side, this is the most distributed downtime we’ve ever had. Global reach. Brand consistency.”

CEO: “Should we reconsider relying on one vendor for DNS, CDN, WAF, analytics, TLS termination, routing, edge compute, zero-trust, VPN…?”

CTO: “Look, we put everything behind Cloudflare because we wanted simplicity. Now we have perfect simplicity. Nothing works, equally, everywhere.”

CEO: “So what do we tell the board?”

CTO: “Say it was a spike in ‘unusual traffic’. That phrase is magic. Makes it sound like we know what we’re talking about.”

Re: SPOF

Cris E

It's not a single point of failure, it's standards compliance.

Re: SPOF

Bebu sa Ware

« CEO : “ Should we reconsider relying on one vendor for DNS, CDN, WAF, analytics, TLS termination, routing, edge compute, zero-trust, VPN…? ”»

If anyone from the C-Suite spouted that I would be checking myself in for urgent psychiatric assessment, as clearly my admittedly tenuous grip on reality had finally failed. As plot dialogue it's not credible. These chaps have trouble working a light switch.

« CEO : “ But you said they were the gold standard. ”»

Clue: the gold standard was obsolete and counterproductive over fifty years ago.

Re: SPOF

vtcodger

"CTO: “Not unless they enjoy watching 522 errors in different fonts.""

Quite the contrary, I found the Cloudflare error screen rather attractive. Of course, I wasn't trying to do anything important like work.

Lazlo Woodbine

I thought the original idea of Darpanet was to automatically route packets around breakages...

Antron Argaiv

Offer void when outage involves backhoe or "protective gateway"

42656e4d203239

>>I thought the original idea of Darpanet was to automatically route packets around breakages...

Routing still does.

Trouble is that the re-routing fails when many routes point to Cloudflare and the target server is behind Cloudflare's infrastructure with no "real world" IP address, so then any re-route attempt will just wind up hitting a Cloudflare gateway... and consequently be, effectively, black-holed.

takno

It was, but the original designers built it that way because they thought the traffic would actually be important. The modern internet has very different characteristics.

Alan Brown

It was (and still is) but the emergence of the backbones (Tier1 providers) in the 90s as telcos took over started concentrating the logical paths across the same physical circuits

Ironically, most telcos had very robust routing systems for voice to ensure that backhoes and friends only ever caused local glitches. That all went out the window when they switched from being service oriented to profit maximisation after the AT&T breakup of 1982. Even non-USA telcos were affected by the change in industry mindset and by the early 1990s the MBAs and quantity surveyors had taken over (always keep your aardvark on a short leash)

Ruined Lunchtime

Anonymous Coward

Couldn't read The Register during lunchtime, which sucked. Had to go highbrow and read Ars Technica instead.

Re: Ruined Lunchtime

werdsmith

It's fixed now, which is disappointing because I was looking forward to an early finish today.

Re: Ruined Lunchtime

Bebu sa Ware

" Had to go highbrow and read Ars Technica instead. "

Umm Arse Technical…Highbrow ?

You are a courageous chap, I give you that.

Eating lunch and taking the risk of reading one of Beth Mole's "challenging" articles and retaining it—your lunch I mean.

Yes I guess Ruined Lunchtime would cover that… quite umm nicely… and your keyboard.

Re: Ruined Lunchtime

Anonymous Custard

Think yourself lucky, it ruined half of my day.

I actually had to do some work as everywhere else was down...

Funny The Register Could Not Stay Up

AnAnonymousCanuck

This morning was the morning we identified which network properties have competent network and system administrators.

I was extremely surprised to find The Register was not in that group.

There was no need for a SPOF 25 years ago. The fact that people choose to have them now, blows my mind

YMMV

AAC

Re: Funny The Register Could Not Stay Up

Anonymous Coward

Interested to hear your suggestion for a backup to Cloudflare, with a similar feature set to cloudflare, that you can economically deploy for this kind of situation

No? Great chat

Re: Funny The Register Could Not Stay Up

Jason Bloomberg

with a similar feature set to cloudflare

Does it need to have similar features to Cloudflare? Would a limited set of features suffice and be better than being dead to the world?

Re: Funny The Register Could Not Stay Up

Alan Brown

That all works fine until cloudflare goes titsup, at which point the competition does too thanks to the extra load

Network connectivity is an odd fish. Unlike analog systems, which get progressively noisier, it works ok right up to the point it doesn't and then it.... doesn't.

Recovery always takes the load dropping back well below the original breaking point too

I've pulled the "I warned you" on a number of occasions when manglement ignored warnings that the system was getting to the cusp of overloading failure, then went into headless chicken mode when what they'd been handwaving away actually happened. It doesn't matter how much money you throw at it, it CANNOT be fixed "right now" and if the previous requests for upgrades had been heeded, it would have cost 1/10 of what it's now going to cost.

The biggest risk from a MBA point of view is X hundred people not being able to do their jobs. If you paint the consequences in those terms they might get the message that spending £100k on upgraded kit is cheaper than losing £500k in an afternoon of people sitting around twiddling their thumbs, let alone any contractual damages that might happen.

Re: Funny The Register Could Not Stay Up

MatthewSt

Either AWS or Azure have equivalent services available on a (nearly) PAYG basis. They can be configured as your backups, switch over to them (maybe even automatically) when Cloudflare has a wobble.
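The automatic switch-over could be decided by something as simple as counting consecutive failed health checks. A rough sketch under stated assumptions: the function and hostnames are hypothetical, the threshold of three is arbitrary, and actually repointing traffic (a DNS update, say, with whatever TTL delay that carries) is not shown.

```python
def choose_target(recent_checks: list[bool], primary: str, backup: str,
                  failures_to_trip: int = 3) -> str:
    """Return backup only if the last `failures_to_trip` health checks all failed."""
    tail = recent_checks[-failures_to_trip:]
    if len(tail) == failures_to_trip and not any(tail):
        return backup
    # Too little history, or at least one recent success: stay on the primary.
    return primary
```

Requiring several failures in a row avoids flapping on a single dropped probe, at the cost of a slower reaction to a real outage.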

Pfffft

Camilla Smythe

I may as well set my home page and new tabs to a Cloudflare Error page.

WARP

Dave559

If they have problems with their warp cores, they clearly need to employ greater numbers of skilful and experienced Scottish engineers (and have a few bottles of whisky available to assist with post-incident recovery processes - the real stuff, none of your synthehol muck, laddie!)…

Re: WARP

David 132

Or pick up the mouse, and use it as a microphone to say “Hello, Computer”?

This is NOT a repeat.