CrowdStrike Conducts External Investigation After Global Outage
CrowdStrike, the American cybersecurity firm, triggered a global outage on July 19 when a faulty update caused Windows laptops and desktops to crash and get stuck in a boot loop. The outage lasted for several hours and affected several industries, including airlines, healthcare and IT. After fixing the issue, the company published a post-incident report stating that an artificial intelligence (AI) system called the ‘Falcon Sensor’ was at fault. Now, the company has published a detailed report following a third-party review to explain what exactly went wrong.
CrowdStrike Publishes External Assessment Report
In the report, the cybersecurity firm said it had discovered that the Falcon sensor used an incorrect template type string that affected Windows interprocess communication (IPC) mechanisms. The analysis was titled “External Technical Root Cause Analysis – Channel File 291.”
According to CrowdStrike, Falcon runs machine learning models that automatically identify and remediate the latest and most advanced malicious threats. Just prior to the July 19 outage, the detection functionality pushed a new “template type” to millions of customers’ Falcon installations running version 7.11.
However, this is where things went wrong. The report highlighted that the IPC template type defined 21 input parameter fields, but “the integration code calling the Content Interpreter with the Template Instances of Channel File 291 only provided 20 input values to match.” This mismatch had not caused problems before, because the AI system had never chosen an input outside the 20 that were provided.
But on July 19 the sensor was asked to inspect the 21st input parameter. Since no integration code supplied that value, the attempt to access the 21st input parameter resulted in an out-of-bounds memory error and a system crash.
The report highlighted mitigation steps and claimed that CrowdStrike had developed a patch for the Sensor Content Compiler that validates the number of inputs provided by a Template Type. This went into production on July 27. The company said that it has also focused on more testing and validation before releasing an update. It also stated that all future updates will be rolled out in phases to minimize potential bugs.
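The two checks below illustrate the mismatch described above (21 declared fields, 20 provided values, content that references the 21st) and the kind of validation the report says was added to the content compiler. This is an added example, not CrowdStrike's code: the structures, the names `validate_content` and `lookup_input`, the 0-based index 20 standing in for the 21st parameter, and the sample content are all constructed here for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the report's three parties: a template type declares
 * how many input fields exist, the integration code provides an array of
 * values, and content (template instances) refers to inputs by index. */
#define MAX_FIELDS 32

typedef struct {
    const char *name;
    size_t declared_fields;   /* fields the template type defines (21 in the report) */
} template_type_t;

typedef struct {
    const char *values[MAX_FIELDS];
    size_t provided;          /* values the integration code really passes (20 in the report) */
} input_set_t;

typedef struct {
    size_t param_index;       /* 0-based input the instance wants to inspect */
    const char *pattern;      /* hypothetical match pattern; its contents do not matter here */
} template_instance_t;

/* Compiler-side check: every instance must reference an input that exists in
 * BOTH the declared type and the set of provided values, so content that names
 * a declared-but-unprovided field is rejected before it ships.
 * Returns 0 on success; otherwise -1 and stores the offending instance's
 * position in *bad_instance. */
int validate_content(const template_type_t *tt, size_t provided,
                     const template_instance_t *insts, size_t count,
                     size_t *bad_instance)
{
    size_t usable = provided < tt->declared_fields ? provided : tt->declared_fields;
    for (size_t i = 0; i < count; i++) {
        if (insts[i].param_index >= usable) {
            if (bad_instance) *bad_instance = i;
            return -1;
        }
    }
    return 0;
}

/* Runtime defence in depth: the lookup refuses any index that is not backed by
 * a provided value instead of trusting the declared field count. */
const char *lookup_input(const input_set_t *in, size_t idx)
{
    if (idx >= in->provided || idx >= MAX_FIELDS) return NULL;
    return in->values[idx];
}

int main(void)
{
    /* Constructed sample mirroring the report's numbers: 21 declared, 20 provided. */
    template_type_t ipc = { "IPC", 21 };
    input_set_t inputs = { {0}, 20 };
    for (size_t i = 0; i < inputs.provided; i++) inputs.values[i] = "sample";

    template_instance_t good[] = { { 0, "a" }, { 19, "b" } };   /* 1st and 20th inputs */
    template_instance_t bad[]  = { { 0, "a" }, { 20, "c" } };   /* 21st input */
    size_t which = 0;

    printf("good content: %s\n",
           validate_content(&ipc, inputs.provided, good, 2, &which) == 0 ? "accepted" : "rejected");
    if (validate_content(&ipc, inputs.provided, bad, 2, &which) != 0)
        printf("bad content: rejected, instance %zu references input %zu but only %zu provided\n",
               which, bad[which].param_index, inputs.provided);
    printf("runtime lookup of input 20: %s\n", lookup_input(&inputs, 20) ? "value" : "refused");
    return 0;
}
```

The design point is that the compiler check and the runtime check are independent: the first keeps bad content from being published at all, while the second means that even content which slips past publishing cannot turn an index mismatch into an out-of-bounds read.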
It is notable that no details have been provided about the external vendors who carried out the evaluation.