How NOT to f-up your security incident response
- Reference: 1741610539
- News link: https://www.theregister.co.uk/2025/03/10/incident_response_advice/
Like if you completely and utterly stuff up the incident response investigation, and that snafu adds millions of dollars in damages to the overall bill.
In one such incident, Jake Williams, VP of research and development at cybersecurity consulting biz Hunter Strategy, says he was called in to clean up a client's hot mess of a forensics report, prompting a [1]frustrated social media post "imploring" companies: "This is NOT something you can just DIY."
The mishaps made in this investigation "are easily a seven-figure mistake," he added.
Williams, probably best known by his social media moniker MalwareJake, used to work as a US National Security Agency hacker and also serves as an IANS Research faculty member.
The errors made during the incident response, and revealed in the subsequent forensic report, stem from "a big issue of confirmation bias," Williams told The Register. "The report reads like they formed a theory about what happened, and then spent a bunch of time going and searching for evidence that supported their conclusions."
He declined to name the targeted organization, confirming only that it was a Fortune 1,000 company, so "big enough that I would have expected a bit more rigor in the forensic analysis."
Both the CISO and CIO were fired over the security incident, during which digital intruders exploited a combination of SQL injection and directory traversal bugs to break in and compromise a number of servers. One of these was internet-facing.
Confirmation bias
"Occam's Razor says that [the internet-facing device] probably is patient zero, and they scoured the logs on this public-facing server until they found something that they thought was related to, or evidence of, compromise," Williams said.
"The unfortunate reality is it was not patient zero, even though it was internet-facing," he continued. "It was one the threat actor laterally moved to after having been in the network for over a month."
Williams says it took him a couple of hours of analysis to determine the internet-facing server wasn't the initial access point. "Looking at their report, it was pretty obvious that they were trying to cherry-pick a piece of data and say that is evidence of exploitation."
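The check Williams describes boils down to comparing earliest-compromise timestamps across hosts: if an internal machine shows attacker activity weeks before the "patient zero" candidate, the theory collapses. A minimal sketch of that comparison, with all hostnames and dates invented for illustration:

```python
from datetime import datetime

# Hypothetical earliest indicator of compromise per host, e.g. pulled
# from EDR telemetry or parsed log files (all names and dates invented).
earliest_ioc = {
    "web-dmz-01": datetime(2025, 2, 20, 3, 14),   # the internet-facing server
    "app-int-07": datetime(2025, 1, 12, 22, 5),   # an internal app server
    "db-int-02": datetime(2025, 1, 15, 4, 41),
}

# The real candidate for patient zero is the host with the EARLIEST
# attacker activity, not the most exposed one.
patient_zero, first_seen = min(earliest_ioc.items(), key=lambda kv: kv[1])
print(f"likely patient zero: {patient_zero}, first activity {first_seen.isoformat()}")
# -> likely patient zero: app-int-07, first activity 2025-01-12T22:05:00
```

In this toy data the internet-facing box was hit more than a month after the internal server, mirroring the case Williams describes.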
This particular incident highlights one of the most common mistakes that organizations make: not allocating enough time to an investigation, and failing to incorporate new evidence, according to Williams and other incident responders interviewed by The Register .
And, of course, every organization and executive reacts differently upon realizing that someone has broken into their systems.
"Many customers behave similar to patients who have just received a sobering medical diagnosis," Microsoft's Director of Incident Response Ping Look told The Register.
Their reaction depends on "how prepared they are to receive the diagnosis — how much experience they may have and how much information they have available," she added. "Many organizations understandably do not know what to do or where to start."
The first challenge that most orgs face, according to Mandiant Consulting CTO Charles Carmakal, involves "not properly scoping out the investigation and being too narrowly focused."
Stop, drop and scope
This narrow focus may be a result of an exec or insurance company trying to minimize investigation costs. Or it could be due to an incorrect theory — about how or where the intruders gained access to the network, for example.
"Maybe you think the incident is limited to a particular system or environment," Carmakal told The Register. "But it could be broader. And the risk of not properly scoping out an incident is not finding backdoors or credentials that were stolen by an adversary that could be used to re-compromise the environment."
Immediately after a cyberattack, when every second counts, and teams are scrambling to understand what happened while also getting vital systems back up and running, companies commonly rush to remediate, said James Perry, CrowdStrike VP of global digital forensics & incident response.
When this happens, it's easy to miss or fail to preserve key evidence. This is understandable. "Getting back to business operations is critical," Perry told The Register. "But rebooting systems, wiping machines or making changes too quickly can erase crucial forensic data. Without it, determining the full scope of a breach becomes significantly harder, if not impossible."
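Perry's warning can be reduced to a simple discipline: copy and hash evidence before anyone touches the machine. A minimal, hypothetical sketch of that habit (the function name and paths are invented, not any vendor's tooling):

```python
import hashlib
import shutil
from pathlib import Path

def preserve(src: str, evidence_dir: str) -> str:
    """Copy a log file into an evidence directory before remediation,
    preserving file timestamps, and record its SHA-256 so later
    tampering or corruption is detectable. Returns the hex digest."""
    dest = Path(evidence_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = dest / Path(src).name
    shutil.copy2(src, copied)  # copy2 keeps mtime/atime metadata
    digest = hashlib.sha256(copied.read_bytes()).hexdigest()
    # Append to a simple chain-of-custody manifest next to the evidence.
    with open(dest / "MANIFEST.sha256", "a") as manifest:
        manifest.write(f"{digest}  {copied.name}\n")
    return digest
```

It is a sketch, not a forensic suite, but it captures the order of operations the responders stress: preserve first, remediate second.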
Create a timeline
And don't skip out on doing an incident report, either. Not having this detailed timeline laid out in front of you, in writing, makes it really hard to identify gaps in your understanding of the attack.
"As you start to write things down on paper, you create a timeline, you create an access propagation diagram that shows how the attacker went from system A to system B to system C, all the way down to system Z," Carmakal said.
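At its core, the access-propagation diagram Carmakal describes is a time-ordered edge list: which system the attacker moved from, which they moved to, and when. A toy sketch, with all events invented for illustration:

```python
from datetime import datetime

# Hypothetical lateral-movement events: (timestamp, source, destination).
events = [
    (datetime(2025, 1, 15, 4, 41), "app-int-07", "db-int-02"),
    (datetime(2025, 1, 12, 22, 5), "vpn-gw", "app-int-07"),
    (datetime(2025, 2, 20, 3, 14), "app-int-07", "web-dmz-01"),
]

# Sorting by timestamp turns raw observations into the written timeline
# the responders recommend: A -> B -> C in the order it actually happened.
for ts, src, dst in sorted(events):
    print(f"{ts.isoformat()}  {src} -> {dst}")
```

Even this trivial ordering step surfaces gaps: a hop whose source never appears as an earlier destination is a hole in your understanding of the attack.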
Additionally, sometimes the people who are directly involved in the incident response aren't the same ones doing the hands-on remediation activities.
"Then you have a lost-in-translation situation, where, if the guidance was verbal and not written on paper, you're very likely to miss some of the important nuances," Carmakal added.
Ransomware has entered the building
Ransomware attacks present their own unique challenges and can be especially taxing for organizations struggling to recover their systems while also containing and mitigating the infection.
"The pressure is immediate. Systems are down, operations are disrupted, and leadership wants answers and to get back up and running quickly," CrowdStrike's Perry said, adding that many companies don't have tested response plans or decision-making frameworks for the aftermath of these data-encrypting events.
"This often leads to rushed or poorly coordinated actions, such as attempting partial restorations without fully understanding the scope of the compromise, which can cause reinfection or data loss," he explained.
Visibility can also be an issue, because most ransomware groups steal sensitive data before they lock it up, and then extort the organization to prevent it from being leaked.
"Many organizations lack the forensic capability to determine what data was exfiltrated, when, and by whom," Perry said. "Log retention issues, incomplete network monitoring, and the disabling of security tools by attackers further complicate investigations."
"Without a solid grasp of the full attack chain, companies struggle to assess the true impact, notify affected stakeholders, and meet compliance requirements — all while trying to restore business operations under intense pressure," he added.
A common worst-case scenario for organizations is when ransomware becomes destructive, "meaning everything within the organization's environment comes to a screeching halt," according to Microsoft's Look. "In situations like these organizations cannot conduct any business internally, nor can their customers access their accounts."
Infected companies usually don't want to pay the ransom demand, since doing so pumps more money into the criminal ecosystem. Plus, responding to these incidents typically takes longer.
"This can keep the organization in a disrupted, frozen state that is stressful for employees, customers and investors," Look said. "In those scenarios, the organization might realize they are understaffed and do not have an updated cyber resilience plan."
IR teams don't work in a vacuum
It's also important to realize that IR teams "do not have the luxury of working in a vacuum." Boards of directors, insurance companies, the media, regulators, and even law enforcement all want status updates.
"That type of pressure creates a lot of chaos if organizations do not account for it in crisis planning and it can really impact teams' ability to prioritize where to focus their energies," Microsoft's Look said.
All of the experts interviewed for this story stressed the importance of maintaining an up-to-date and well-rehearsed cyber resilience plan when asked about the top advice they dole out to companies on how not to f-up your incident response.
They also emphasized calling in professionals and not relying on an existing IT or managed services provider if hit by a major attack.
Yes, this is self-serving, as these are all top-tier incident responders who are regularly called to investigate nation-state attacks and major ransomware infections — or to clean up messes made by earlier IR firms.
Still, there's something to be said for bringing in the big guns should you find yourself in the middle of a really bad breach. And seeing as they are experts in the field, their advice on how not to screw up an incident response (IR) is solid.
'Be IR ready'
"Top advice from Microsoft Incident Response: Be IR ready," Look said, adding this means "having a current incident response plan that is both regularly rehearsed, and able to be updated. You also want to have an incident response retainer already in place, so that services you may need to navigate the incident and any potential legal, insurance and other fallout are available to you on-demand."
If your company is receiving security services from more than one vendor during an incident, ask them to share information and work together.
"Some companies think they are protecting themselves by keeping all the vendors apart, but security is a team sport," Look told us. "The more knowledge sharing and collaboration, the faster and more effective an investigation can typically take place."
She also offers some secondary advice: if you are using old, outdated systems that no longer receive security updates and vendor support, develop a plan to invest in modernization. Yes, this is expensive, but it will reduce your organization's attack surface, making it harder for digital intruders to break in, and it will save money down the road.
Also, "never waste an opportunity to think about rebuilding your systems," Hunter Strategy's Williams said. "It's fairly rare for CISOs and CIOs today to get fired over a single incident, unless there's broad incompetence that led to it."
However, it is fairly common for organizations to find themselves compromised a second time, only to discover that the intruders were never fully kicked off their systems after the first incident. That reflects far more seriously on the security team that handled the initial incident response and mitigation, to say nothing of the bad press and loss of brand reputation that follow two data breaches in short succession.
"I always try to make it personal for folks and say, hey, look, you're taking a huge risk, a personal risk and a career risk, by not rebuilding," Williams says. "People try to clean malware off of systems rather than rebuilding systems. But you just can't ever deem a system clean once a threat actor has been on it."
CrowdStrike's Perry said the most important advice he gives to companies following an incident is "slow down and take a methodical approach — even under pressure. It's natural to want to jump straight into remediation, but without a structured response, you risk destroying critical forensic evidence or missing key indicators that could reveal the full scope of the attack."
Before making any changes, be sure to capture volatile data, preserve logs and document everything, he added.
And if you don't have an IR plan in place, there's no time like the present. Just make sure it doesn't sit in a corner and gather dust. As Perry noted: "Practice makes perfect." ®
[1] https://bsky.app/profile/malwarejake.bsky.social/post/3liu4iu235c2u
Or, as Arthur Conan Doyle has his famous detective, Sherlock Holmes, put it:
"When you have eliminated the impossible, whatever remains, however improbable, must be the truth."
The thing is that doing an investigation means you have to look at everything, not merely what you want to examine. I cannot help feeling that in far too many disasters, 'mangelement' tries to blame 'other people' for the decisions they made that resulted in the breach / disaster / whatever.
I did a few investigations some decades ago, and it was interesting that blaming the actual 'management' culprit was rarely an option. One had to merely specify what had occurred, and not apportion blame. So, the senior manager who left his (unencrypted) company laptop on the back seat of his company Range Rover while he went for an evening meal in a restaurant, returning 2 hours later to find a broken window and an absence of said laptop (and other sensitive company papers) was, of course, not punished in any way (a few years later he was promoted). If I had done that, I'd have been sacked.
We suspect the laptop was wiped and later re-sold, as there was no ransom demand, and the sensitive client data does not appear to have been used against the company. And, hopefully, the replacement was encrypted before issue to him.
In the right order
"The correct approach in forensic science is to form a theory and then look for evidence to disprove it."
But only after gathering all possible evidence without any preconceptions. Forming a theory too soon is a pitfall for the unwary. The fundamental problem, though, is that IR 'forensics' are not laboratory forensics, let alone forensic science -- they're practical investigation primarily directed towards damage limitation rather than abstract analysis. So we're not really discussing forensics (a heavily misused term¹) here. The forensics come later, once the incident is under control.
¹ Strictly, forensics is the gathering and presentation of evidence "pertaining to, connected with, or used in courts of law" [OED]
"having a current incident response plan that is [...] regularly rehearsed"
Several organisations (both international corporate and UK govt.) I have consulted with conducted "rehearsals" as sit-down sessions with an external consultant who talked the executive through some elementary scenario and asked them how they might respond. In one classic case, the scenario was "how do we evacuate the building and get staff working from home when a UXB is discovered in the next street?", no other possible incidents being even mentioned during the session despite my attempt to make this happen (which was actually considered disruptive).
Many IR plans I've encountered have been restricted to addressing a limited list of 'expected' incidents, and none have been actually live tested at all. One such plan, even after multiple notional reviews (as indicated by dates on the front cover), still contained an action flow chart with an infinite loop triggered by a branch early in the decision sequence. Apparently, nobody had ever noticed this. When I suggested to this client that there should be at least an annual unannounced incident simulation, I was told that it would annoy the notional first responders to be called out at 3 AM without warning. When I gritted my teeth and further suggested that, in aid of realism, confusion should be intentionally injected into the simulation, I became seriously unpopular.
Finally, no IR plan review panel I have encountered has included any technical staff -- it's always been the executive and senior non-technical management. So, taking all this into account, it's not surprising that incident response generally remains abysmal, as nobody seems to take it seriously until too late.
Re: "having a current incident response plan that is [...] regularly rehearsed"
This is because organisations only want what they have: Sufficient pantomime to claim they have a DR/IR plan, and to claim that it has been tested.
Very, very few companies want the disruption and pain of proper DR testing, because that will reveal things that don't work and need expensive fixing. And there's a simple test as to whether these organisations actually care: Do they keep vast amounts of rarely needed but easily pilferable data on hot servers in the first place? The answer's almost always yes.
Both the CISO and CIO were fired over the security incident
The classic executive response to an incident: find a scapegoat and CYA. Now, they may well have been incompetent and deserved their fate, but the main reason for there being no comprehensive recovery plan is nearly always an unwillingness to pay for something seen as very expensive with not enough benefit for the cost.
the subsequent forensic report stem from "a big issue of confirmation bias ... The report reads like they formed a theory about what happened, and then spent a bunch of time going and searching for evidence that supported their conclusions."
The correct approach in forensic science is to form a theory and then look for evidence to disprove it. The harder you look and fail, the more likely the theory is to be correct.