
'Once in a lifetime' IT outage at city council hit datacenter, but no files lost

(2025/03/19)


Nottingham City Council continues to deal with the fallout from last week's freak IT outage, as it confirms its in-house IT specialists managed to prevent any data from going missing.

The council told The Register that despite the outage affecting much of the council's digital infrastructure, some of which remains down today, there is no risk to residents' data.

"Our IT teams have successfully worked around the clock, so we haven't lost any data," a spokesperson said. "This is why we've had to take time to power up systems – to ensure no loss of data as we bring back online [sic]."


The news comes as the council restores a number of its systems following a power outage caused by an electrical fault at Loxley House, its city center headquarters.


A failure was detected within the electrical safety circuit of the building's high-voltage switchgear, a [1]statement read.

"The power outage subsequently impacted the council's central [5]datacenter , which houses computing and networking equipment for the whole council, " the statement went on to say.


"The serious nature of the unprecedented outage meant that all systems went offline, leading to disruptions to our phone lines and systems while we work to test systems before bringing them safely back online."

As of Monday, the council said it was able to restore its phones and other systems but encouraged all residents with emergencies to report them via email in the first instance since they were expecting a high volume of calls.

The outage started on March 13 and continues to affect most online services, according to the council's website.


A banner no doubt penned by the tech team still advises residents that services may seem functional, and forms will load, but they may not work correctly. However, a [3]council spokesperson contradicted this in a statement given to The Reg.

"The majority of our services are running and have been since last week, it's just that some have been working differently as a result of loss of payment systems and contact centers," they said.

Residents can expect normal service to resume later this week, according to the council's website, which also stated it would take some time to carry out the necessary repairs and tests before signing off on a "fully stable" power supply.

Council chief Sajeeda Rose praised staff for working through the disruptions and for adapting at "lightning speed" to the council's business continuity plans.

"Unexpected disruptions like these test our ability to adapt, and I want to take a moment to acknowledge the incredible resilience, patience, and professionalism colleagues have all shown in response to this unexpected challenge," she said.

[4]Hyperoptic customers left in dark as power outage takes down systems

[5]Mystery border control outage causes misery at Malaysia/Singapore frontier

[6]Internet Archive blames 'environmental factors' for overnight outages

[7]Techie saved the day and was then criticized for the fix

"While we are making progress, we appreciate that this situation has been disruptive. We appreciate the public's continued patience and cooperation. We will keep you updated as much as we can."

BBC Nottingham reporter Hugh Casswell shared an image of the council's executive board meeting on Tuesday from Loxley House with the effects of the power outage on display.

Members assembled in the barely lit room, seated around desks with no lighting, the wall-mounted TV plugged in but not operational.

"We're told a risk assessment has been carried out and it was deemed safe to hold the meeting as the room is on the ground floor and lit by natural light," [13]said Casswell. ®




[1] https://www.mynottinghamnews.co.uk/nottingham-city-council-power-outage-update/

[2] https://www.theregister.com/2024/10/15/uk_datacenter_investment/

[3] https://forums.theregister.com/forum/all/2023/01/23/leeds_city_council_erp/

[4] https://www.theregister.com/2025/01/29/hyperoptic_outage_scotland/

[5] https://www.theregister.com/2024/12/10/johor_bahru_singapore_outage_immigration/

[6] https://www.theregister.com/2024/07/08/internet_archive_suffers_a_wobble/

[7] https://www.theregister.com/2024/04/05/on_call/

[8] https://x.com/HughCasswell/status/1901998318772748674



So...

Mentat74

'Same as it ever was'...

Did they try turning it off and on again?

Mishak

Oh, only the "off" worked...

But seriously, why are systems these days so fragile that a power loss means it takes so long to get them back to an operational state?

Re: Did they try turning it off and on again?

A Non e-mouse

You usually test turning off one IT system. It's very rare that you test turning off your entire IT estate and then bringing it all back online. That's when you discover the circular dependencies. The classic is that you can't log in to your SAN/VM farm because authentication goes via your local AD, which is hosted on that very platform, which is offline because you can't log in to turn it back on again.
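A minimal sketch of the kind of cold-start check that catches this before the power ever goes off, assuming you can write down each service and what it needs at boot (the service names and the graphlib-based approach are purely illustrative, not anything Nottingham actually runs):

    # Toy cold-start planner: given "service -> services it needs first",
    # emit a safe power-on order, or report the circular dependency that
    # makes a cold start impossible (here AD runs on the VM farm, and the
    # VM farm needs AD for admin logins).
    from graphlib import TopologicalSorter, CycleError

    dependencies = {
        "storage_san":      [],
        "vm_farm":          ["storage_san", "active_directory"],  # admins log in via AD
        "active_directory": ["vm_farm"],                          # AD is a VM on the farm
        "payments_portal":  ["vm_farm", "active_directory"],
    }

    try:
        order = list(TopologicalSorter(dependencies).static_order())
        print("Cold-start order:", " -> ".join(order))
    except CycleError as err:
        print("Cold start impossible, circular dependency:", " -> ".join(err.args[1]))

With the graph above it reports the AD/VM-farm cycle rather than a boot order; the usual fix is a physical domain controller or a break-glass local account that lives outside the virtualisation platform.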

Re: Did they try turning it off and on again?

Korev

We had a similar thing in a DR practice in an old job. The lorry turned up and we entered with the backup tapes. We'd always assumed that the rest of the Company would be up so stuff like AD would be available. The problem here was that we couldn't connect to the real network as the test recovered systems would likely clash with the real ones. In the end we managed to lay our hands on a DC backup and we were in business.

Re: Did they try turning it off and on again?

0laf

DR site is the next cupboard along

Looks like it affected their DR site too?

Steve K

Looks like it affected their DR site too if they didn't fail over to it?

(Or do they not have a DR site....?)

Re: Looks like it affected their DR site too?

Anonymous Coward

D site only. :-)

Re: Looks like it affected their DR site too?

Pierre 1970

Failing to plan is planning to fail.

Shirley either they don't have a proper DR site or, more probably, they just don't have a DR plan..... or they do have one with an RTO of above 1 week :-)

Re: Looks like it affected their DR site too?

Anonymous Coward

I'm guessing every halfway decent sysadmin thinks about these issues. But at a certain point it's about risk assessment. How big is the chance that site A fails? What is the cost of a DR site (with regular failover testing)? What is the cost of being down for a week if a full disaster hits, and what is the cost of "testing" that scenario?

And who foots the bill? In this specific case I assume the taxpayer. It might very well be that it's actually cheaper to take the risk, even if the RTO is above 1 week.

I have two sites, a generator at the primary site, and I can fail over quickly and (mostly) automatically. But I also know that the setup we have could be much improved; it's just that the cost of doing so outweighs the risks.
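For illustration only, a back-of-the-envelope version of that trade-off, with entirely invented figures (the probabilities and costs below are placeholders, not anything from the council):

    # Crude DR cost/risk comparison; every number here is made up for
    # illustration and should be replaced with your own estimates.
    annual_outage_probability = 0.02      # a "once in a lifetime" event, say 1-in-50 years
    days_down_without_dr      = 7         # roughly the week Nottingham is looking at
    cost_per_day_down         = 150_000   # manual workarounds, lost payments, overtime
    dr_site_annual_cost       = 400_000   # second site, replication, regular failover tests

    expected_annual_loss = annual_outage_probability * days_down_without_dr * cost_per_day_down
    print(f"Expected annual loss without DR: £{expected_annual_loss:,.0f}")
    print(f"Annual cost of a warm DR site:   £{dr_site_annual_cost:,.0f}")
    print("Taking the risk is cheaper" if expected_annual_loss < dr_site_annual_cost
          else "The DR site pays for itself")

On those made-up numbers the commenter's point holds: carrying the risk is cheaper. Double the outage frequency or the daily cost and it flips, which is why the assessment needs redoing as the estate changes.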

Re: Looks like it affected their DR site too?

RockBurner

Put it another way.....

While the systems are down, the council can't SPEND any money either....

Re: Looks like it affected their DR site too?

Phil O'Sophical

But at a certain point it’s about risk assessment.

That's the first part of any business continuity planning - decide what might go wrong, how probable each event is, and how much damage it will do if it happens. That gives you a prioritised list of things to prepare for, and then you can work out what the cost of protecting against each is compared to the business cost of not implementing them.

One of the most irritating things I found about customers looking for DR was how many of them would say "I've bought your DR product, how do I configure it to protect my business", which is completely backwards.

Where do you site your backup power?

Mike 137

" Investigations have now found that the likely cause was a failure within the electrical safety circuit of the high voltage switchgear at the Council’s HQ. This meant that when the power went out, the electricity generated by backup generators couldn’t get back into the system to power anything. " 1

Commiserations to the council are in order. At some point there must be switchover gear between mains and backup, and when that gear fails there's really no way to stay live however you've designed your backup power.

1: [1] Nottingham City Council power outage update


[1] https://www.mynottinghamnews.co.uk/nottingham-city-council-power-outage-update/

Re: Where do you site your backup power?

Anonymous Coward

We've experienced that in our data centre. The switchgear that controls grid vs local generator fails and you're fubared.

It is incredibly hard (if not impossible) to remove all single points of failure from a complex system. (And sometimes that SPoF is a human)

Working differently

Henry 8

"Our services are still running, they're just working differently". I'll have to remmber that one the next I announce unexpected downtime to my users!

Re: Working differently

PerlyKing

This must be up for some sort of newspeak award!

Re: Working differently

RT Harrison

Reminds me of the Morecambe and Wise sketch where Eric Morecambe is explaining to Andre Previn that he is "playing all the right notes but not necessarily in the right order."

[1]Morecambe and Wise - Andre Previn

[1] https://vimeo.com/479336770

Too old for this sh*t

I had an IFA client that was bought out by a much bigger company. The outgoing client had a pretty good DR plan in place, which I helped develop and test over the years, with regular mock testing. However, the incoming company had DR on a whole different level. They actually had a separate office in a different location, fully kitted out and I mean everything, right down to a kitchen. It was just sitting there unused. They tested it by sending staff over; all they needed to take was milk, coffee and lunch.

The amount of money that room must be costing the company 'just in case' must have been eye watering

Steve Foster

"The amount of money that room must be costing the company 'just in case' must have been eye watering"

Presumably, the amount of money lost if things went sideways was even more eye watering (perhaps spending £200m to save £3bn, to pluck random large numbers from the air).

tip pc

A finance place I worked at had a couple of those solutions in place: space for office and call centre staff, not everyone but enough to keep going.

Several times a year some people would go over for a few hours and test connectivity as a tick-box exercise for the DR documentation.

Sometimes all you need is some space in a locked rack in a DR office with an uplink; firewalls and switches for connectivity to the mother ship and local LAN connectivity can be spun up within a few hours.

a risk assessment has been carried out and it was deemed safe to hold the meeting

Howard Sway

You had to conduct a risk assessment to determine whether it was safe to have a meeting in a room where the lights weren't working but which had a big window letting the light in? Some of that risk assessment budget could be better spent on your IT systems, if you ask me....

Re: a risk assessment has been carried out and it was deemed safe to hold the meeting

RT Harrison

They probably do risk assessments on the use of paper. You can get nasty hurty paper cuts if you don't hold the paperwork correctly.

Props to the IT dept

Missing Semicolon

No data loss. The backup system was truly battle-ready. It looks like whilst they could not afford a fail-over system, they spent enough money on backup.

We can defeat gravity. The problem is the paperwork involved.