
West Monroe's take on the CrowdStrike-Microsoft outage

July 19, 2024

A recent CrowdStrike update caused widespread disruptions, particularly for organizations running Microsoft Windows systems. The incident affected services including point-of-sale systems, authentication, and communication technologies. Organizations that depended on Microsoft systems without appropriate redundancy faced significant challenges.

Scale & Impact

The disruption affected many industries that rely on Microsoft systems, hitting critical services such as point-of-sale devices, authentication systems, and core communications. Systems that depended on Microsoft infrastructure without proper redundancy were the most vulnerable.
The issue originated from a CrowdStrike update that prevented affected Windows systems from booting normally.
Azure's Central US region was specifically reported as impacted; organizations with workloads deployed across multiple Azure regions were less affected.
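
The point about multi-region deployments can be illustrated with a minimal sketch of region failover. Everything here is an assumption made for illustration: the function name, the priority list, and the health data are hypothetical, and a real deployment would rely on its platform's own traffic management and health probes rather than this loop.

```python
from typing import Callable, Optional, Sequence


def pick_healthy_region(
    regions: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first region, in priority order, that passes its health check."""
    for region in regions:
        try:
            if is_healthy(region):
                return region
        except Exception:
            # A probe that raises is treated the same as an unhealthy region.
            continue
    return None


# Hypothetical priority list and health data, for illustration only.
REGIONS = ["centralus", "eastus2", "westus2"]
status = {"centralus": False, "eastus2": True, "westus2": True}

active = pick_healthy_region(REGIONS, lambda region: status[region])
print(active)  # eastus2
```

A workload pinned to a single region has no second entry to fall back to, which is the exposure described above.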

Response & Mitigation

Larger organizations with robust IT teams and automated tools were able to address the issue more swiftly. Smaller organizations, and those without sufficient IT resources, are facing prolonged disruptions.
CrowdStrike provided guidance on removing a specific file to resolve the issue, which larger organizations could implement quickly.
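
The file-removal step can be sketched as a small script. This is a hedged illustration, not CrowdStrike's tooling: the directory and file pattern below are assumptions drawn from CrowdStrike's public remediation guidance rather than from this article, the function name is hypothetical, and on affected machines the step generally had to be performed from Safe Mode or the Windows Recovery Environment because the system could not boot normally.

```python
from pathlib import Path
from typing import List

# Assumptions based on CrowdStrike's public guidance, not stated in this article.
DEFAULT_DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
DEFAULT_PATTERN = "C-00000291*.sys"


def remove_matching_files(directory: Path, pattern: str, dry_run: bool = True) -> List[Path]:
    """Find files matching `pattern` in `directory`; delete them unless dry_run is set."""
    if not directory.is_dir():
        return []
    matches = sorted(p for p in directory.glob(pattern) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Dry run by default: report what would be removed without deleting anything.
    for path in remove_matching_files(DEFAULT_DRIVER_DIR, DEFAULT_PATTERN, dry_run=True):
        print(f"would remove: {path}")
```

Defaulting to a dry run lets a team confirm the match list on one machine before running the same script across a fleet with `dry_run=False`.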

West Monroe’s Long-Term Recommendations

The incident underscores the vital importance of robust software quality testing and meticulous release processes. A balanced approach between fast releases and thorough testing is essential to prevent significant disruptions.
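
One common way to balance release speed against testing is a staged rollout, where an update reaches a small group of machines first and proceeds only if they stay healthy. The sketch below is an illustration under stated assumptions, not a description of any vendor's release process: the ring names, the 99% health threshold, and the `deploy` and `healthy_fraction` callbacks are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Ring = Tuple[str, Sequence[str]]  # (ring name, hosts in that ring)


def staged_rollout(
    rings: Sequence[Ring],
    deploy: Callable[[str], None],
    healthy_fraction: Callable[[Sequence[str]], float],
    min_healthy: float = 0.99,
) -> List[str]:
    """Deploy ring by ring; stop at the first ring whose health falls below the threshold."""
    completed: List[str] = []
    for name, hosts in rings:
        for host in hosts:
            deploy(host)
        if healthy_fraction(hosts) < min_healthy:
            break  # Halt before the update reaches any later ring.
        completed.append(name)
    return completed


rings = [
    ("canary", ["c1", "c2"]),
    ("early", ["e1", "e2", "e3"]),
    ("broad", ["b1", "b2", "b3", "b4"]),
]
touched: List[str] = []

# Simulate a bad update: every host that receives it fails its health check.
result = staged_rollout(rings, touched.append, lambda hosts: 0.0)
print(result)   # []
print(touched)  # ['c1', 'c2']
```

In the simulated failure only the two canary hosts receive the update; the other seven are never touched.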

Enhance Redundancy. Ensure systems have appropriate redundancy and are not solely dependent on a single provider or technology.
Strengthen IT Infrastructure. Invest in robust IT infrastructure and teams capable of responding to widespread issues swiftly. Utilize automated tools for managing and deploying updates across systems.
Prioritize Quality Assurance. Implement comprehensive software quality testing and meticulous release processes. Focus on customer satisfaction and reliability rather than solely on cost-saving measures.
Plan for Incident Response. Develop and maintain an incident response plan that includes steps for quickly addressing and mitigating disruptions. Ensure all endpoints and distributed systems can be managed efficiently during such events.
Collaborate and Communicate. Maintain clear communication channels with technology providers like CrowdStrike to receive timely updates and guidance. Foster collaboration between IT teams and service providers to address issues effectively.
