The Aftermath of CrowdStrike: Observations and Lessons Learned
CrowdStrike, a global leader in endpoint security, incident response and cybersecurity, recently deployed an update to its Falcon sensor for Microsoft Windows systems. This update, designed to improve detection of new threats, inadvertently caused significant disruptions to the Windows operating system, leading to widespread crashes and system instability.
Notably, Mac and Linux operating systems were not affected by this problem.
VP for Portfolio and Product Strategy, Instructor & Author at Infosec.
What happened?
Despite the concerns, it is important to clarify that this incident was not the result of a hack, a security breach, or a malicious attack. Here are three key factors that led to the CrowdStrike chaos:
Faulty internal update: The problem was caused by an internal update error and not by external manipulation.
Elevated privileges: As security software, CrowdStrike Falcon runs with high privileges and integrates with the Microsoft Windows kernel, so a fault in it can crash the entire operating system.
Global impact: The impact was particularly severe because CrowdStrike’s software is deeply integrated into the critical infrastructure of major corporations and government agencies.
While this integration is essential for detecting and neutralizing high-level threats, it also meant that the faulty update caused immediate and widespread disruption as soon as it was rolled out.
The impact
CrowdStrike is widely used by businesses and state, local, and federal government agencies, so the scale of the disruption was enormous. For example, Delta Air Lines has retained high-profile attorney David Boies as it faces potential losses of more than $300 million as a result of the incident. While many other organizations of similar size recovered within hours, Delta experienced extended operational disruptions that lasted several days, sparking industry debate over whether the failure was due to CrowdStrike’s update or Delta’s recovery plan and preparedness.
This incident was perhaps the largest technology failure on record caused by a misconfiguration or bug, with damage estimated in the billions, a figure that continues to rise. The consequences were enormous: thousands of flights were delayed or canceled, reservation systems worldwide went down, and a cascade of global disruptions followed. At least 8.5 million computers were affected, leading to unprecedented operational chaos.
It is indeed ironic that CrowdStrike, a company known for its expertise in incident response, found itself at the center of such a major episode. This event underlines the complexities and challenges that even the most reputable companies can face, as well as the importance of recovery plans and preparedness to respond.
Response from CrowdStrike
In response to this unprecedented incident, CrowdStrike acted swiftly and decisively. The company quickly issued a fix for the problem and subsequently released a statement with a series of commitments to prevent a recurrence. While the list of actions was thorough, much of it aligned with existing industry-standard practices. However, CrowdStrike has notably pledged to overhaul its update deployment processes, a crucial change that is expected to improve the reliability and security of future updates.
Observations and lessons learned
The CrowdStrike outage serves as a reminder for organizations of all sizes to review their processes and ensure steps have been taken to limit the impact of future incidents. Don’t just have a plan; test it to confirm that it works.
Among the action steps organizations should take are:
1. Have robust backup and disaster recovery plans: It seems simple, but it’s critical to have well-defined backup, business continuity, and disaster recovery plans. Equally important is regularly testing these plans through actual walkthroughs to ensure they function effectively when needed.
2. Be careful with privileged software: Any software with privileged access to your systems has the potential to cause significant disruption. While this incident was not a security breach, it is a stark reminder that security tools, like any software, can themselves be a source of vulnerabilities or downtime.
3. Ensure increased vigilance during disruptions: Large-scale outages create an attractive opportunity for attackers. Amid all the noise and disruption, malicious actors can easily slip in unnoticed and steal data. It is essential to maintain a heightened security awareness during such events to prevent opportunistic exploitation.
4. Avoid knee-jerk reactions: While the instinct may be to switch suppliers after an incident like this, it’s important to proceed with caution. Rapid, unplanned changes can lead to even bigger problems. Any transition to a new supplier should be approached as a phased project, not an overnight swap. This is especially critical for organizations that process sensitive data, such as those involved in national security.
In conclusion, the CrowdStrike incident highlights the importance of robust systems, prudent planning, and the willingness to respond to even the most unexpected challenges.
It is a reminder that in cybersecurity, even the leaders in the field are not immune to significant disruptions, nor to causing them. Being prepared for when they occur can be the difference between a quick fix and loss of business.
This article was produced as part of TechRadarPro’s Expert Insights channel, where we profile the best and brightest minds in today’s technology industry. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc.