Final step to put new website into production deleted it instead
- Reference: 1771227009
- News link: https://www.theregister.co.uk/2026/02/16/who_me/
This week, meet a reader we'll Regomize as "Tom" who in 2009 contracted for a major supermarket chain that operated a website to order groceries and decided it needed another one on which to sell general merchandise.
"We worked for about 18 months on the new site and one of my responsibilities was building out the development environments and defining the production deployment processes," Tom told The Register.
Tom found the production environment "a bit of a mess when I got there" due to poorly documented patching. He also found "absolutely massive" deployment scripts.
"One of the things I did was remove 6,000 lines from the scripts to attempt to make them more manageable," Tom wrote. "And the scripts were only part of the process, I had to define all the other steps needed and script what I could, or document what I couldn't."
While Tom tidied things up as best he could, the supermarket wouldn't let him touch production systems – only employees were allowed to make those critical keystrokes.
After months of work, the new site was ready.
"We needed to deploy our general merchandise patch on top of the existing groceries site," Tom explained. "We had carried out multiple dry runs, deployed and rolled back in pre-production a number of times. And we had a four-hour window from 2:00 AM to 6:00 AM when the business would allow the site to be down for this process."
Tom sat next to the employee who was allowed to make the change.
"I had supplied all the steps in detail, and all he really needed to do was cut and paste a few commands," he wrote.
But the employee decided to do it his way: Instead of deploying to each server in turn, he opened PuTTYCS – a tool that can send commands to multiple machines at once – and tried to update every server simultaneously.
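Deploying to each server in turn limits how far a bad command can spread. The following is a minimal sketch of that idea, not Tom's actual procedure: `deploy_in_turn` and its `RUNNER` argument are hypothetical names invented for illustration. It applies one step to one host at a time and stops at the first failure.

```shell
#!/bin/sh
# Hypothetical sketch: run one deployment step on each host in turn,
# stopping at the first failure instead of broadcasting to every host.
# Usage: deploy_in_turn RUNNER STEP HOST...
# RUNNER is any command or function invoked as: RUNNER HOST STEP
deploy_in_turn() {
    runner=$1
    step=$2
    shift 2
    for host in "$@"; do
        echo "deploying to $host"
        if ! "$runner" "$host" "$step"; then
            echo "step failed on $host; stopping before remaining hosts" >&2
            return 1
        fi
    done
}
```

In practice `RUNNER` might be a small wrapper such as `ssh_runner() { ssh "$1" "$2"; }`, though that wrapper is an assumption too. A failure on the first server leaves the others untouched and still serving.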
[6]Tech support chap invented fake fix for non-problem and watched it spread across the office
[7]Techie's one ring brought darkness by shorting a server
[8]ATM maintenance tech broke the bank by forgetting to return a key
[9]Techie banned from client site for outage he didn't cause
The staffer did ask Tom to confirm that the first step in the upgrade process was to remove the contents of one directory.
"Yes, just clear that directory," Tom replied, then watched in horror as the staffer ignored the command in the procedure and instead typed rm -rf * – the "delete everything" command that often gets readers into trouble.
"Because he used PuTTYCS, the command went to every production server at once," Tom pointed out.
This happened at 02:00 AM and by then Tom had been working since 08:30 AM the previous day, after rising far earlier to make the two-hour commute to the supermarket giant's premises.
"Maybe I wasn't sharp enough at that point to catch him before he hit Enter," he mused. "I think I managed to get an anguished 'Nooooo!' just as he hit it."
Time for a sitrep: It's just after 02:00 AM, Tom is exhausted, and the supermarket's entire production environment is gone.
The next four hours of his life involved frantic server rebuilds, application installs, patching, and restoring things to the state in which the infrastructure was ready for the upgrade that should have taken a few minutes.
And it worked.
At 07:00 AM the supermarket chain's e-commerce director arrived and asked if the upgrade went well.
"No issues," said one of the supermarket's staffers.
"They went to get coffee," Tom concluded. "I went to get some sleep in the break area."
Have you failed to follow a procedure and flamed out afterwards? If so, don't make another mistake by not sharing your story with Who, Me? Instead, [10]click here to send us an email so we can share your story on a future Monday. ®
[6] https://www.theregister.com/2026/02/09/who_me/
[7] https://www.theregister.com/2026/02/02/who_me/
[8] https://www.theregister.com/2026/01/19/who_me/
[9] https://www.theregister.com/2026/01/12/who_me/
[10] mailto:whome@theregister.com
He should Tesco and get some sleep!
After working Aldi I'm not surprised he was too tired to stop the mistake.
It was only a Lidl mistake but it had huge consequences!
This wasn't a mistake ..... it was an M&S mistake!
There should have been a Safeway to do this.
The in-house tech should have been more co-operative
I'm pleased Tom Waitrose to the occasion
Definitely, such a Kwiksave
Morrisons to double check before you hit return.
Ah, the old "rm -rf *" command
Very much the nuclear option (hence the icon). I always triple check whether I am logged into the right machine, and in the right directory, and have a back-up of said directory, before hitting the nuke button. And even then I feel nervous about pressing it.
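The "right machine, right directory" part of that triple check can be automated. Below is a minimal sketch, assuming a POSIX shell; `confirm_context` is a hypothetical helper name, and it does not cover the third check (that a backup exists).

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only when we are on the expected
# host AND in the expected directory.
# Usage: confirm_context EXPECTED_HOST EXPECTED_DIR
confirm_context() {
    expected_host=$1
    expected_dir=$2
    actual_host=$(uname -n)   # node name of this machine
    actual_dir=$(pwd -P)      # physical path, symlinks resolved
    if [ "$actual_host" != "$expected_host" ]; then
        echo "refusing: on '$actual_host', expected '$expected_host'" >&2
        return 1
    fi
    if [ "$actual_dir" != "$expected_dir" ]; then
        echo "refusing: in '$actual_dir', expected '$expected_dir'" >&2
        return 1
    fi
}
```

It could gate the dangerous command, for example `confirm_context web01 /var/www/cache && rm -rf ./*`, where the host name and path are made up for illustration.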
Re: Ah, the old "rm -rf *" command
Had a developer screw a system up by writing a script that used a parameter something like:
rm -rf $MYPATH/*
Luckily he wasn't root, but it caused enough problems that it took a while to get everything back up and running.
Only for him to tell me "I only did this" and proceed to DO IT AGAIN.
I've also seen people do the same thing with a chmod, but as root.
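The danger in `rm -rf $MYPATH/*` is that an unset or empty variable turns it into `rm -rf /*`. A common guard is the POSIX `${VAR:?}` expansion, which aborts when the variable is unset or empty. The sketch below shows one way to use it; `clear_dir` is a hypothetical helper, and it is an illustration rather than an exhaustive safeguard.

```shell
#!/bin/sh
# Sketch: clear the contents of a directory, but abort if the argument is
# unset or empty. clear_dir is a hypothetical helper name.
clear_dir() {
    # ${1:?...} makes a non-interactive shell exit with an error when the
    # argument is missing or empty, before rm is ever reached.
    target=${1:?clear_dir: directory argument is required}
    # Refuse the filesystem root outright.
    [ "$target" != "/" ] || { echo "refusing to clear /" >&2; return 1; }
    [ -d "$target" ] || { echo "not a directory: $target" >&2; return 1; }
    # Quoting keeps paths with spaces intact; ${target:?} is a second guard.
    rm -rf -- "${target:?}"/*
}
```

Called as `clear_dir "$MYPATH"`, an empty `MYPATH` stops the script instead of expanding to the root directory. Note that the `*` glob does not match hidden files, so dotfiles in the directory are left alone.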
Re: Ah, the old "rm -rf *" command
Actually I can't remember ever having used it. Early on I received dire warnings about my life expectancy from an experienced dev.
Rename the directory, we are going to need it later. To delete it.
Provided it isn't linked somewhere, although that can be addressed. Otherwise, definitely best practice. Have one of these --->
The secret to intelligent tinkering .... is to keep all the pieces!
Ahhhh
website_backup.tar.gz
website_backup_old.tar.gz
website-backup_old2.tar.gz
website_backup4.tar.bz
website_backup4_dave.tar.bz
You forgot one.
aaa_111_website_backup_new.tar.gz
And these:
new_website_backup.tar.gz
new_website_backup2.tar.gz
Rename the directory, we are going to need it later. To delete it.
tar -cpSzf /somewhere/safe/precious.tar.gz ./precious
mv ./precious ./precious.bak
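The archive-then-rename habit can be wrapped in a function that also confirms the archive is readable before the original is moved. This is a minimal sketch assuming GNU or BSD tar (for the `-z` and `-C` options); `retire_dir` is a hypothetical name.

```shell
#!/bin/sh
# Sketch: archive a directory, verify the archive can be read back, and
# only then rename the original out of the way. Nothing is deleted.
# Usage: retire_dir DIR ARCHIVE   (retire_dir is a hypothetical name)
retire_dir() {
    dir=${1%/}
    archive=$2
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    [ ! -e "$dir.bak" ] || { echo "$dir.bak already exists" >&2; return 1; }
    [ ! -e "$archive" ] || { echo "$archive already exists" >&2; return 1; }
    parent=$(dirname "$dir")
    name=$(basename "$dir")
    # -C stores paths relative to the parent directory.
    tar -czpf "$archive" -C "$parent" "$name" || return 1
    # Listing the archive forces a full read of it.
    tar -tzf "$archive" >/dev/null || return 1
    mv -- "$dir" "$dir.bak"
}
```

For example, `retire_dir ./precious /somewhere/safe/precious.tar.gz` leaves both the archive and `./precious.bak` behind. Listing an archive shows it is readable, not that its contents are correct.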
My motto: "Make your mistakes slowly, keep them small and easily reversible, if possible."
Applies equally to powertools which is why I eschew them for their manual predecessors.
Qui festinat res destruit.
dd
dd if=/dev/zero of=/dev/sda bs=16384 [Enter]
Time passes ... "Hmmm .... that's taking a long time for such a low-capacity flash drive." ... I idly noted the furious blinking of an access LED inside the hard drive bay ... inside the hard drive bay!
^C^C^C^C^C
Too late. I did not have a backup of the now-zeroed boot/OS drive.
Fortunately, I had, elsewhere, a copy of the response file and scripts I used to install the OS.
Afterward, I built a PC with dual DVD burners, and some removable drive bays, just for such operations. The OS ran from a DVD, to minimize the consequences of such future mistakes.
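On Linux, a cheaper precaution is to check whether the kernel flags the target as removable before writing to it. The sketch below reads the `removable` attribute under `/sys/block`; `is_removable` is a hypothetical helper, and the optional second argument exists only so the check can be tried without real hardware.

```shell
#!/bin/sh
# Sketch (Linux only): report whether a block device is flagged removable
# in sysfs before anything is written to it.
# Usage: is_removable DEVICE [SYSFS_ROOT]
# Returns 0 if removable, 1 if not, 2 if the device is unknown.
is_removable() {
    dev=${1#/dev/}            # accept either "sdb" or "/dev/sdb"
    sysroot=${2:-/sys}
    flag_file="$sysroot/block/$dev/removable"
    [ -r "$flag_file" ] || { echo "unknown device: $dev" >&2; return 2; }
    [ "$(cat "$flag_file")" = "1" ]
}
```

A wipe could then be written as `is_removable sdb && dd if=/dev/zero of=/dev/sdb bs=16384`, with `sdb` standing in for whatever name the flash drive really has. The flag is only a hint: some USB-attached hard drives report 0, so it complements a look at the device list rather than replacing it.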
Re: dd
When I used to nuke old HDDs, I always used to unplug the new discs and boot from a CD.
Re: dd
When dd-ing an image onto a USB drive there's always a heart-stopping second or ten (at least that's what it feels like) between pressing return and the LED on the drive starting to blink.
A very angry customer phoned our MD to say our software had caused a major incident; machinery had failed to operate and the consequences were serious. Not life-threatening but the local MP was involved. We were summoned immediately to site to meet with the end-user, a representative from the EA, the consultant, the main contractor and the leader of the Parish Council..... As the person considered best to offer as a sacrifice, (longest in the tooth), I was sent to investigate and download our logging/monitoring data. Tempers had eased by the time I got there but the tension was palpable.
Our logging device was simple but the data was not easily retrieved. Once in the software, the keystrokes to save or delete were easily confused, as I knew from experience. Under the watchful eye of the consultant, I carefully downloaded the machine history and gave him a copy of the unabridged data.
It was blindingly obvious what had happened; the machines had not been switched to 'Auto' and so were effectively switched 'Off'. Cue huge embarrassment to all (others) involved. We thought it prudent not to invoice for our call-out.
I still worry what would have happened if I had...... No I don't want to think about it ----->
Delete
Never delete.
There should be absolutely no need to delete. Ever.
You rename, copy it somewhere else, or you move it (being careful that you're not moving it over the top of something else).
You can even move it off to backup / temporary storage.
But there's no need to delete.
No need to delete emails. No need to delete anything running in production. Ever.
The only place you ever delete from is your years-old backups of things that aren't even used any more because you've been moving them further and further and further from your production systems and live backups until by then you're certain they will never be needed, and you've already copied them to some archival-type backup. Those are the only things you ever delete. Things you haven't touched in 10 years.
Honestly, I think the delete command should just be removed from users. We kind of did this with Recycle Bin, etc. but there's no need for them to actually delete anything on a managed system. And there's no need for me, the person managing that system, to delete anything that's in production, ever.
You just keep moving it around until it's clear that it's NOT used by anything (because it would have broken a dozen times already by then), then you do a final move off the system.
You don't ever need to delete.
Definitely not with a wildcard. Definitely not with a -f. Definitely not with some dumb tool to execute the same command across dozens of servers.
And if your production system and its various storage, etc. "doesn't have room" for something... well, that's a problem in itself if it ever comes to restoring those systems, because you just don't have the elbow-room to manoeuvre and verify as you go.
Deleting is setting things on fire. Moving them is putting them on the side, then putting them in the loft, then taking them out of the loft and putting them in storage, then putting that stuff that's been in storage for years untouched into a bin, then putting that bin into the outside bin, then actually letting the bin be taken away.
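The "put it on the side" step can be sketched as a small function that moves something into a holding area and refuses to land on top of anything already there. `set_aside` and the holding directory are hypothetical names. The existence check and the move are separate steps, so this is a courtesy check rather than protection against concurrent writers.

```shell
#!/bin/sh
# Sketch: move a file or directory into a holding area instead of deleting
# it, refusing to overwrite anything already there.
# Usage: set_aside PATH HOLDING_DIR   (both names are hypothetical)
# Prints the new location on success.
set_aside() {
    src=${1%/}
    holding=$2
    [ -e "$src" ] || { echo "nothing to move: $src" >&2; return 1; }
    mkdir -p -- "$holding" || return 1
    # A timestamp suffix keeps successive versions apart.
    dest="$holding/$(basename "$src").$(date +%Y%m%dT%H%M%S)"
    [ ! -e "$dest" ] || { echo "refusing to overwrite $dest" >&2; return 1; }
    mv -- "$src" "$dest" && echo "$dest"
}
```

Running `set_aside ./site /var/holding` prints where the directory went, so copying it back is a single command.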
Honestly, so many people who administer systems have such a blasé attitude to actually handling data, it's taken me years to drum it into those people who work with/under me.
"I'll just delete..."
"No you won't. You'll rename it and move it out of the way."
"WHY!?!"
"Do you see why now? Now just copy back what you had there originally and do it again. And COPY it back. Keep that clean copy clean and out of the way of what you're doing."
"But don't we have backups?"
"Yes. And I hope never to ever use them, and you should have zero reliance on them for what you're doing. Even before you started this, *I* copied that folder somewhere where you can't see, in case it went wrong, purely because I don't ever want to have to restore from an official backup."
Re: Delete
TLDD - (Too Long, Don't Delete!)
Tom Asda be lucky it came up again so quickly