CrowdStrike, a vendor that provides computer security monitoring services, pushed what turned out to be a buggy update to its software on Microsoft's Cloud.
And the entire world went *BOOM*.
From the article: "Millions of people outside the IT industry are learning what CrowdStrike is today, and that's a real bad thing. Meanwhile, Microsoft is also catching blame for global network outages, and between the two, it's unclear as of Friday morning just who caused what.
After cybersecurity firm CrowdStrike shipped an update to its Falcon Sensor software that protects mission-critical systems, blue screens of death (BSODs) started taking down Windows-based systems. The problems started in Australia and followed the dateline from there.
TV networks, 911 call centers, and even the Paris Olympics were affected. Banks and financial systems in India, South Africa, Thailand, and other countries fell as computers suddenly crashed. Some individual workers discovered that their work-issued laptops were booting to blue screens on Friday morning. The outages took down not only Starbucks mobile ordering, but also a single motel in Laramie, Wyoming.
Airlines, never the most agile of networks, were particularly hard-hit, with American Airlines, United, Delta, and Frontier among the US airlines overwhelmed Friday morning."
Airlines and airports around the world were hit, with over 2,000 flights delayed or cancelled in the USA from 4am to noon Eastern time today. Hotels. 911 and emergency services phone numbers. Hospitals and medical practices. I'm sure some government operations too, local, state, and federal. News broadcasters. There are probably people buying cloud services from third parties who had no idea those services were tied to CrowdStrike and Microsoft, and who are now down.
Here's the thing. From the NBC article: "CrowdStrike, which provides cybersecurity services and software for many large corporations that use Microsoft systems..." This is a single software vendor that tons of other software vendors rely upon, all of them providing service through Microsoft's Cloud. Microsoft does not scan the operations of vendors on its cloud to see whether their software services work correctly; that's an impossible task. It can watch for things like a particular machine maxing out its CPU or network connections, which indicates a problem, then throttle that machine or shut it down and notify the people who bought the service. I don't think there's much that MS could have done in this situation.
A patch fixing the bug has been pushed, and it has recovered some systems. But whenever something like this happens, some systems cannot recover on their own and need hands-on work by a tech, and in some cases computers or servers crash in horrible ways and need serious work to get going again. Or a system might have had one or more marginal components that were just waiting for such a crash to fail utterly, and it will have to be serviced or replaced. And if that is a critical system, you know there will be major hair pulling.
It's going to be a very bad day in IT Land today.
Never forget: all 'The Cloud' means is somebody else's servers. There's nothing magic about it. It is quite capable of having tremendous security problems and, as shown, program bug problems.
https://arstechnica.com/information-technology/2024/07/major-outages-at-crowdstrike-microsoft-leave-the-world-with-bsods-and-confusion/
https://www.nbcnews.com/news/us-news/mass-cyber-outage-airports-businesses-broadcasters-crowdstrike-rcna162664