Linux admin hated downtime so much he schlepped a live UPS during office move
- Reference: 1763707452
- News link: https://www.theregister.co.uk/2025/11/21/on_call/
- Source link:
This week, meet a reader we'll Regomize as "Bobby" who told us about an old friend of his who he suggested we refer to as "Peanut."
Peanut was a Mac tech by trade, and a Linux user by inclination.
Those affiliations combined into a fanatical appetite for unbroken uptime.
Which is how Bobby came into this story.
"The people Peanut worked for were moving to new premises, and he didn’t want to lose the 400-plus days of uptime on his mail server," Bobby explained.
"So he came up with a scheme to move the server and the UPS between buildings." Peanut, who was very strong, got the job of carrying the UPS. Bobby carried the smaller and lighter server.
Peanut's employer didn't want or need the server to remain online. Indeed, during the move it wasn't connected to anything and therefore couldn’t send or receive mail. But the somewhat daft scheme worked and the server did continue running.
"It all went really well and we got the UPS and the mail server into the new building without any mishaps," Bobby told On Call. Peanut therefore preserved his uptime streak.
[5]Developer battled to write his own documentation, but lost the boss fight
[6]Help desk boss fell for ‘Internet Cleaning Day’ prank - then swore he got the joke
[7]Actor couldn’t understand why computer didn’t work when the curtain came down
[8]New boss took charge of project code and sent two billion unwanted emails
But while Peanut kept the server alive during the office move, he lost track of how he managed the entire fleet of boxes under his care.
“He phoned me several days later and explained he had been logged into several Linux boxes using the same CLI. Then he needed to reboot one of them to complete a package update but didn’t realize that was the mail server.”
Peanut therefore shut down the box he’d tried so hard to keep running.
Have you attempted extreme uptime elongation? If so, [10]click here to let us know about your schemes, so we can share them in a future edition of On Call. ®
Get our [11]Tech Resources
[5] https://www.theregister.com/2025/11/14/on_call/
[6] https://www.theregister.com/2025/11/07/on_call/
[7] https://www.theregister.com/2025/10/31/on_call/
[8] https://www.theregister.com/2025/10/24/on_call/
[10] mailto:oncall@theregister.com
[11] https://whitepapers.theregister.com/
Re: Smart, But Also Bloody Stupid
> This is not someone I'd like to employ as I couldn't trust them to do things properly.
You mean he should be an Exim employee?
Re: Smart, But Also Bloody Stupid
I sendmail you my script collection, 'cause only those could do real mail 100% controlled the way I want to. I'm gonna exim the building now.
Re: Smart, But Also Bloody Stupid
What a funny Exchange
Re: Smart, But Also Bloody Stupid
Peanut was a complete Yahoo
Re: Smart, But Also Bloody Stupid
He had a good Outlook though
Re: Smart, But Also Bloody Stupid
Luckily the server stayed up so he didn't need to Postfix it
Re: Smart, But Also Bloody Stupid
Lugging that heavy UPS around, I expect he was a hotmail
Re: Smart, But Also Bloody Stupid
He'll be pine-ing for his uptime now it's gone
Re: Smart, But Also Bloody Stupid
So you never ever had a dumb idea to reach a goal only you cared about, whereas most others (at best) giggled? Or in other words: You were never young?
Re: Smart, But Also Bloody Stupid
He said he'd never want to employ Peanut. That implies you are replying to management. So your first question is redundant, except for the giggling part. And Peanut acted out his own dumb idea, instead of delegating it, which is incomprehensible to some.
Re: Smart, But Also Bloody Stupid
Admittedly not with a server, but I've come across engineers moving their CAD desktops in a similar manner as "I can't shut it down, I'll lose all my stuff".
The conveyance of choice for the UPS and full size desktop tower was of course the humble office chair. Worked fine until someone managed to knock their PC off their chair onto the floor, unluckily being one of the few still running a hard drive instead of an SSD for storage.
The drop also snapped the keyboard and mouse USB plugs (he had left everything connected apart from the monitor) off in the motherboard ports and yanked the UPS power cord out of the back.
Unsurprisingly the PC wouldn't complete POST or boot when plugged back in. As I recall the engineer didn't get given a replacement, but a spare motherboard and HDD were found and stuffed in the now scuffed and dented case as a reminder to be more careful.
Re: Smart, But Also Bloody Stupid
>CAD Monitor…
Depending on when this happened, I would not want to be the one delegated to carry the monitor; people who think a 28-inch flat screen is heavy have either forgotten or never attempted to lift a 19-inch CRT.
This story's nuts!
Those were the innocent days...
...when admins thought that Linux does not need to be updated and rebooted. When security by obscurity still worked. I was among them, long long time ago, in a galaxy far away.
Re: Those were the innocent days...
Yes 400 days uptime probably means 400 days without a kernel patch. Security? Who needs that on a mail server...
Re: Those were the innocent days...
Hold it right there: You don't need to destroy your uptime to update the mail server. You stop the mail daemon/service/pause-reload-crontab/killall -9/whatever, update it, and start it again. Your other stuff, except for the port needed for mail, is not exposed, as every admin who uses a UPS during a move to keep uptime high does it. (In reality: You SHOULDN'T need to reboot, but it does not always work out that way)
Re: Those were the innocent days...
The kernel handles the networking side, so if there's a vulnerability in socket handling or similar, it's not the mail daemon that needs updating. Kernel update basically requires a reboot.
Re: Those were the innocent days...
Kernel patch? This is not windows ;)
(yeah, there's a bunch of CVE for the Linux kernel, not sure about how severe they are. Just thinking about the frequency of new kernels my linux machines get it doesn't seem too bad, I agree, 400 days is probably stretching things)
Re: Those were the innocent days...
What! Are you suggesting MS were ahead of the curve when they shipped W95/98 with an effective maximum up time of 49.7 days…
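For anyone wondering where 49.7 days comes from: the figure is usually attributed to a 32-bit counter of milliseconds since boot wrapping around. A quick sketch of that arithmetic in Python, with the counter width as the stated assumption and the names purely illustrative:

```python
# Assumption: the limit comes from an unsigned 32-bit counter of
# milliseconds since boot; names here are illustrative only.
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 ms in a day


def wraparound_days(counter_bits: int = 32) -> float:
    """Days until an unsigned millisecond counter of this width overflows."""
    return (2 ** counter_bits) / MS_PER_DAY


if __name__ == "__main__":
    print(f"{wraparound_days():.1f} days")
```

A 64-bit counter on the same basis would not wrap for hundreds of millions of years, which is why the problem reads as a relic today.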
Not only patches
But many exploits are not permanent, and go away if you've rebooted - especially if your OS partitions are read only. Now obviously if it isn't patched the same hole can be re-exploited after the reboot, but that isn't automatic (because if it is then you have a simple way to trigger the exploit, making it all too easy to learn how it works so it can be permanently fixed)
> This week, meet a reader we'll Regomize[sic] as "Bobby"
Isn't Bobby an expert on database tables?
Nah, that’s Bobby’s mum!
Well, someone had to do it
[1]Obligatory
[1] https://xkcd.com/705/
Re: Well, someone had to do it
Blast, beaten to it; this is what I get for switching off the 07:00 alarm.
Re: Well, someone had to do it
Yup. Be afraid… very afraid !
As for Hell no fury… scorned women aren't in the race.
Seen it before.
Routers with hundreds of days of uptime, but these days there are so many patches and fixes for machines that most of the network rarely gets more than 6 months before a reboot.
As part of our SOX compliance we need to be on the vendor's gold image or gold minus 1, with the intent of going to gold within three months of an image being released.
However, if you have a dual supervisor switch this sort of uptime record is possible even with software patching, as the chassis will stay up; just the processor modules will swap over control. Some people may argue this is downtime, but if it stays up then I don't.
Re: Seen it before.
That is the old method. Today the cheaper way is to cluster two or more servers, and on top you get: One box can always be subjected to physical issues.
Re: Seen it before.
logged into a 6500 the other day that'd been up 18 years
out of interest, how do you deal with vendors that ship & constantly update multiple trains of code?
like cisco 10.3.x/10.4.x, do you go 10.4 even though you don't need the features?
Re: Seen it before.
I once had a commodity desktop PC repurposed as a firewall/router which had clocked up a runtime in excess of 3000 days before a prolonged power outage caused the UPS to run out of battery and reboot it (to be honest I was surprised the UPS still held a charge after that long..)
[It was a 1998 vintage 350MHz Pentium-II powered Dell Optiplex with 320MB RAM and a 6GB HDD, originally running Windows 95, repurposed to run OpenBSD, with no exposed services and connected to an internal network, just running internal DHCP and packet filtering for two door controllers and an environmental sensor on a legacy bit of network - it had originally done a lot more, but the rest of the building had been demolished.
I kept it running purely to see how long I could until I finally decommissioned it in 2022 after the power outage...]
Re: Seen it before.
We had a PC running in a machine control cabinet continuously for about 7 years, nothing fancy just a bog standard mini-tower, until one day someone hit the shut-down option on the menu (I never did find out why). Unfortunately the HDD decided not to restart* and although not strictly an IT department problem (I hadn't even known about it) I tried all my tricks but nothing.
Me: "Backups?"
Users: "What are they?"
Me: "Software installation disks?"
Users: "Errm, well we had them somewhere!"
When the software disk finally arrived from the US (single floppy but with some odd DRM so it had to be a physical disk not a download) and I found a spare HDD we got it going again for a while until I could be bothered to replace it with a newer PC.
*Not that unusual with machines that hadn't been shut down in a long time.
Insame
Presumably, this machine would have spinning rust disks. They moved a machine, a server no less, while the disks were still spinning? How far? To an adjacent building or using a car/van?
Insanity.
Re: Insame
I was all set to spark off about how much spinning disks love to be jolted, especially with the kind of forces for rack mounting or the weird angles whilst being carried. Then I remembered that it's Linux and right at the back of my head something glowed - HDPARM allows you to park the heads of a disk. *Arguably* you could keep the uptime counter running with the OS in memory by parking the disks and pausing/stopping a load of services. Still not sure I'd do it mind.
Re: Insame
The problem with that approach is that if any disk-based operation fires off, then the first thing it will do is un-park the heads so that it can perform the operation.
So, this would only work if the discs were completely inactive, and that is virtually impossible for a running OS.
You would also assume that if it's a proper server, it would have an array of discs in it.
I'd feel safe if Bobby and Peanut were next to me if I was dying and could only have my life sustained by a life support machine connected to a UPS.
I would draw the line at Peanut interacting with the machine though.
Back in the mid 00's I was a pre-sales tech support at a security equipment distributor; my realm being the nascent world of IP CCTV.
One of my customers was working on a system with a hefty non-disclosure agreement attached, so I couldn't know the end user, or the details of how the system would be installed or used; I only had the basic details of the number of cameras, recording rates etc. From this I had to spec the type of cameras, recording servers, storage, managed PoE switches and a UPS to keep the system alive for 6 hours.
I queried the UPS, suggesting a smaller UPS and generator, but the end user had insisted on using a bank of UPSes.
The system was duly specced, installed and commissioned by my customer and I heard nothing else for 6 months until I got a support call on Monday morning. The customer had experienced a long powercut over the weekend, all was good, the recording servers had run for the full 5 hours the power was down, but they hadn't recorded anything.
My first thought was that the cameras, which were all indoors and thus not fitted with infra-red illuminators, had recorded, but you couldn't see anything because the lights were out.
No, the guy says, the site had emergency lighting, so there should be footage.
As his customer was important to him, my customer paid to fly me to site; which turned out to be a large casino, gentleman's club and a slightly more exclusive gentleman's establishment in Berlin.
I checked the servers, and indeed they'd functioned all weekend, losing connection to the cameras the moment the power to the site dropped, so I checked the core switch, which had also stayed up, but had lost connection to all the PoE switches the moment the power cut.
It seems the installer had only connected equipment in the security suite to the UPS, so all the edge switches dropped with the power, and all the cameras dropped, as they were drawing power from the switches.
I pointed to a paragraph in my response to the tender, where I emphasised the importance of all switches being connected to the UPS, grabbed my coat and hopped on the next flight home...
Mea culpa !
I have put a UPS on a lab trolley and moved a mail server with redundant power supplies—replace one mains lead with UPS lead then the other; then trundle the server to its new location; replug the mains leads. The network is down for the duration of the transfer (a few minutes) but was on the same vlan.
One half of the building had power from one substation and the other half from another substation. Also two large network rooms. When one substation transformer had to be upgraded/replaced at short notice, both the power and networking would be lost for a day. So relocating the mail server to the other side of the building made sense, as the amount of email piling up on the backup MXers would seriously thrash the server. The startup was also pretty slow, so not shutting the server down was also desirable.
The part of this circus I found funniest was riding with the "mail server" down two floors in the lift (elevator.)
If the server had a wifi interface I could have kept the network up too. Next time. ;)
Checkpoints
"This is my current simulation. It's been running for almost three weeks now; I expect it should complete and give me my results in a few days. Thank God I've got an uninterruptable power supply for this thing."
"$@#&+/%%¢£¥∆§π!! Quick, help me pack this thing up!"
"The elevators are automatically locked out during a fire alarm. You want me to help you take this thing down the stairs? ..."
I worked on a mainframe OS which had a thing called "checkpoints". You invoked them from within your program. If something bad happened, you could resume your program from the latest checkpoint (or an earlier one, if you preferred) and not lose all of the computer's previous work on your program.
Re: Checkpoints
I've manually coded in checkpointing at various times over my career for long-running simulations -- "long-running" for me being a week or more; but I've done some that took months to run :-/
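For anyone who hasn't hand-rolled this before, here is a minimal sketch of the idea in Python. It is not the mainframe facility described above, just one common way to do it: periodically pickle the program state to disk (written to a temp file and renamed, so a crash mid-write can't corrupt the last good checkpoint) and resume from that file on restart. The function names, the toy "simulation" and the `stop_after` parameter used to fake an interruption are all made up for illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: dump to a temp file, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic swap; the old checkpoint survives a crash


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)


def run_simulation(path, total_steps, checkpoint_every=100, stop_after=None):
    """Toy long-running job: sums the step numbers, resuming from `path`.

    `stop_after` is a hypothetical knob that simulates an interruption
    (power cut, fire alarm) by returning early without saving.
    """
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < total_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return state  # simulated interruption; work since last save is lost
        state["step"] += 1
        state["total"] += state["step"]
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

Run it once with `stop_after` set, then again without, and the second run picks up from the last saved step rather than from zero. How often to checkpoint is the usual trade-off between I/O overhead and how much work you can bear to redo.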
Ok
I have a headless Raspberry Pi 5 Lite TV recorder, thinned out so it only has what is needed. It is on a UPS and regularly goes for >200 days uptime. I do admit to a slight feeling of disappointment when I reboot it.
Re: Ok
Sounds like the UPS itself would draw more current than the equipment. Both in terms of "generally" and when it actually needs to kick in.
Re: Ok
Possibly: the TV head receiver, WiFi modem router, and a Pi backup server are also on the UPS. Running them all, the UPS runs for >150 minutes, more than long enough to record a program.
Monkeys
Well, they say if you pay Peanuts, you get Monkeys.
I guess that would be helpful when moving a heavy UPS though.
Uptime shmuptime
I do not have Peanut's "fanatical appetite for unbroken uptime."
I reboot a bunch of SQL servers that make up a Data Warehouse weekly, because they enjoy it. It's like a weekly treat for them.
I know I shouldn't anthropomorphise the I.T. hardware though ... they hate that!
Clever but stupid
This entire stunt - unnecessarily endangering the company email server - was pulled by someone desperate to show Linux was better than Windows in terms of uptime, wasn't it?
Uptime is a measure of how long it's been since you last successfully booted.
Although this story really reminded me of [1]these folks who transported a live server, and its UPS, across Hamburg, together with a mobile 3G link to keep it online. On public transport, in the rain, just to make it more fun.
Oh, and the server only had one power connection, so they soldered a second power connection to the board while it was powered on.
[1] https://www.youtube.com/watch?v=vQ5MA685ApE
Ducks for cover
Going by the tone of the comments here, I'm gonna be judged.
I work for a customer that have no dedicated space for their IT kit and also love to have ideas on rearranging their office. Depending on who's doing the rearranging, they'll always elect to move the rack to the room they don't personally care about. So it gets relocated on a semi regular basis.
I've worked out that if I take the doors off their hinges, you can wheel the rack through them so it doesn't need to be stripped down and rebuilt every time.
I've also worked out I can wheel it across the entire ground floor within the runtime of the UPS....
Smart, But Also Bloody Stupid
I can understand why he wanted to maintain uptime on the mail server, so he really did think going down this route was a cunning plan. However, if anything had happened to that server or UPS whilst being moved between sites he would almost certainly have lost the server either through physical damage and/or data loss, so the plan was totally stupid and he should have just not done it. Powering on a server that has been shut down correctly and then moved is always a tense moment, but putting yourself in the position of potentially having to replace hardware or even the server, or rebuilding the server and restoring from backup (you did make one, didn't you?) with all that hassle and impact to the business just so you can say, "Look, the mail server has been up for xxx days now." is completely unwarranted and foolhardy in the extreme. This is not someone I'd like to employ as I couldn't trust them to do things properly.