The devastating IT issue caused by a faulty update pushed to Windows PCs by cybersecurity company CrowdStrike is continuing to wreak havoc days later. The problem first came to global attention early Friday morning, knocking companies and banks offline and grounding planes at various airports. While the initial bug was identified and fixed that same day, the knock-on effects are continuing.
CrowdStrike is used by businesses worldwide, including banks and airports, and the fault occurred due to an update error involving its Falcon Sensor software. When deployed automatically to millions of PCs around the world, it inadvertently put them into a recovery boot loop. The resulting Blue Screen of Death (BSOD) began to appear worldwide and knocked countless systems offline.
While CrowdStrike implemented a fix for the fault, this only stopped more machines from crashing. It couldn't help those already affected, which, according to Microsoft, amounted to about 8.5 million devices worldwide.
Now the cleanup operation is in full swing as businesses look to recover their lost systems and airlines try to get customers to their final destinations.
Our live blog below shows the updates as they happened through Friday (July 19) and into the aftermath of what's being called the biggest IT outage in history.
Global Windows IT issue: What we know
A substantial global IT issue knocked Windows PCs offline for banks, broadcasters, airlines, health clinics and more.
- Australian companies first reported the error after broadcasters and banks could not load their Windows machines. Greeted by the Blue Screen of Death (BSOD), the devices failed to boot properly, knocking them offline.
- As Europe began its work day, it became clear this was a widespread global issue affecting multiple industries. Across the world, flights were grounded due to the error. In the U.K., broadcaster Sky News was unable to broadcast its news bulletin, and clinics were unable to book patient appointments.
- A faulty update pushed out by cybersecurity firm CrowdStrike caused the issue, leaving affected devices unable to load correctly.
- At 1:20am ET on Friday, July 19, CrowdStrike issued a support note saying it had identified and reverted the issue. However, this only prevents more machines from hitting the BSOD and can't recover those already affected.
- Microsoft, one of the largest affected companies, appeared to have suffered a separate outage that mainly affected Microsoft 365 apps and services due to a configuration change in its backend Azure settings. The company says it has now fixed these.
- CrowdStrike CEO George Kurtz posted to X at 5:45am ET/10:45am BST that the issue was caused by a "single update for Windows hosts" and that the "issue has been identified, isolated and a fix has been deployed." However, the knock-on effects are continuing to cause mass disruption.
- Microsoft's Satya Nadella commented on the issue stating that Microsoft is aware of the situation. "We are aware of this issue and are working closely with CrowdStrike," reads his X post.
- CrowdStrike posted a technical breakdown of what happened. It's a lot of information, but if you want to know more than "it was a software update," it's worth reading.
- Airlines are trying to recover, but passengers are reporting long wait times at airports to get on their flights.
- Thousands of flights were canceled on Saturday worldwide, and more than 1,000 flights have been canceled as of 8:30am ET.
- As of Monday, July 22, Microsoft estimates that as many as 8.5 million Windows PCs were affected by the faulty update.
What is CrowdStrike?
CrowdStrike is a global cybersecurity company that proudly declares in its X profile, "We Stop Breaches." It offers threat intelligence and protection from cyber attacks to a range of large companies, including Microsoft and many large airlines.
Founded in 2011, the publicly traded company has also led several high-profile investigations into cyber espionage attacks, including against Sony Pictures and the Democratic National Committee.
It produces security software for Windows servers and machines that is designed to detect and prevent attacks, including its Falcon Endpoint and identity protection platform.
This comprises various modules, including ones that track system vulnerabilities and others that sandbox malware. Falcon is also widely used on public sector infrastructure and in data centers such as those powering Microsoft Azure and Microsoft 365 services.
Falcon Sensor is one of the modules in the Falcon platform designed to prevent cyber attacks, and an update to this module triggered the global outage.
Microsoft working to fix "Service Degradation"
Despite the chaos unfolding across the world, Microsoft is working quickly to fix the "service degradation" it notes on its cloud status page. An update at the top of the page reads: "Users may notice that some of the affected users are seeing relief as we continue to mitigate the impact."
According to Microsoft, the following services should be working normally.
- Microsoft Defender
- Microsoft Defender for Endpoint
- Microsoft Defender Experts
- Microsoft Intune
- Microsoft OneNote
- OneDrive for Business
- SharePoint Online
- Windows 365
- Viva Engage
- Microsoft Purview
CrowdStrike update takes out large parts of the web
Updates from cybersecurity company CrowdStrike are the most likely cause of the global IT outage that has taken parts of Microsoft Azure and 365 offline, leaving individuals and companies unable to offer services.
We’ve seen hits to the NHS in the U.K., TV news stations including some Fox affiliates and Sky News in the U.K. and Australia taken off air and banks unable to provide services. There have also been transport issues with flights unable to take off and trains facing delays.
Microsoft says it has applied fixes to Azure and other platforms and things are starting to return to normal, but says some users will experience disruption throughout the day.
Flights grounded due to CrowdStrike fault
The Federal Aviation Administration says all flights from United, American Airlines and Delta have been grounded due to a "communication issue" which Delta and United have confirmed is linked to the global outage.
A United spokesperson said in a statement: "While we work to restore those systems, we are holding all aircraft at their departure airports. Flights already airborne are continuing to their destinations."
The Microsoft / CrowdStrike outage has taken down most airports in India. I got my first hand-written boarding pass today 😅 pic.twitter.com/xsdnq1Pgjr July 19, 2024
Berlin Airport in Germany is warning of major delays and Ryanair, Europe's largest airline, says a global third-party IT outage has caused disruption across the entire network. Delhi Airport in India has gone completely manual, writing out baggage tags and boarding passes.
CrowdStrike has a fix for Blue Screen of Death
Here is the solution for the @CrowdStrike Issue guy !! #csagent #bsod #crowdstrike #windowsissues #Windowsdown pic.twitter.com/XmajoqQpFl July 19, 2024
On the consumer side of things, Windows computers were being served a blue screen of death due to a global CrowdStrike issue. These crashes were due to a “Falcon Sensor” issue — ironically this is the software that’s supposed to defend computer systems from crashing due to cyber attacks.
In the past hour, CrowdStrike came out with a resolution if you’re still seeing this issue.
That should fix it, but if you’re still seeing issues, pipe up in the comments! Now for the world’s businesses…
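For the technically minded, here is a minimal Python sketch of the file clean-up at the heart of the widely reported workaround. This post doesn't spell out the steps, so treat the details as assumptions: the driver directory and the `C-00000291*.sys` pattern are taken from CrowdStrike's publicly posted guidance, and `remove_faulty_channel_files` is a hypothetical helper name. On an affected PC you would first need to boot into safe mode or the Windows Recovery Environment, and the sketch defaults to a dry run that only lists what it would delete.

```python
from pathlib import Path

# Assumptions: directory and file pattern as given in CrowdStrike's public
# workaround guidance; neither is spelled out in this post.
DEFAULT_DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
FAULTY_PATTERN = "C-00000291*.sys"


def remove_faulty_channel_files(driver_dir=DEFAULT_DRIVER_DIR, dry_run=True):
    """Find files matching the faulty pattern and delete them unless dry_run is set.

    Returns the sorted list of matching paths either way.
    """
    matches = sorted(Path(driver_dir).glob(FAULTY_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()  # remove only the matching file(s), nothing else
    return matches
```

Calling `remove_faulty_channel_files()` with no arguments just reports the matches; pass `dry_run=False` to actually delete them, then reboot normally.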
911 emergency response affected in the United States
For most of the companies impacted, the outage amounts to an annoying inconvenience, such as Xbox Live being down for a bit (it’s back up) and Microsoft 365 disruption.
But there are some really scary consequences of this too. Namely, 911 emergency response is being hit hard across the US. According to Down Detector, we’re seeing big dropouts in the following places:
- New York
- Washington
- Atlanta
- Florida
- Texas
- Arizona
- California
- Missouri
- Michigan
- Illinois
Fortunately, this outage seems to be easing, as the number of people reporting problems has declined. But we’ll keep a close eye on this.
U.K. health service impacted by outage
People in the U.K. are finding their National Health Service (NHS) unable to take appointments due to system problems caused by the faulty update.
So far, we're seeing clinics in Yorkshire, Cheshire, the West Midlands and Chorley unable to take any appointments. The NHS has been affected by system crashes before and, along with the immediate impact, there's often a backlog in the aftermath.
An NHS spokesperson said: “The NHS is aware of a global IT outage and an issue with EMIS, an appointment and patient record system, which is causing disruption in the majority of GP practices.
“The NHS has long standing measures in place to manage the disruption, including using paper patient records and handwritten prescriptions, and the usual phone systems to contact your GP. There is currently no known impact on 999 or emergency services, so people should use these services as they usually would."
DownDetector gives eye-opening view of affected services
Everybody is talking about this being a global IT issue, but to get a true view of just how many services are affected, head over to Down Detector and just look at those spikes!
Here is just a snippet of the companies seemingly impacted by this global IT outage (outside of Microsoft):
- BetMGM
- Amazon
- Xfinity by Comcast
- Delta Airlines
- Bank of America
- Visa
- United Airlines
- Apple Support
- PlentyOfFish
'Impossible to simulate the size and magnitude of the issue'
Cybersecurity experts have warned that while this isn’t a cyber attack, it does highlight the potential risks to the global economy as well as the impact on individual lives in the event of a major IT outage.
Jake Moore, Global Cybersecurity Advisor at ESET and a former Police Head of Digital Forensics in the U.K., told Tom’s Guide that people are often quick to suspect a cyberattack, but that this adds to the confusion, highlighting “the importance of these services and the millions of people they serve.”
He told us: “Businesses must test their updates and infrastructure and have multiple failsafes in place, however large the company is. But as often it is with the case, it is simply impossible to simulate the size and magnitude of the issue in a safe environment without testing the actual network.”
Moore says the impact and inconveniences seen during this recent outage to services for thousands of people “serves as a reminder of our dependence on Big Tech in running our daily lives and businesses. Upgrades and maintenance can make systems and networks more vulnerable to small errors, which can have wide-reaching consequences as demonstrated today.”
JFK airport affected by outage
Passengers at JFK airport are currently being kept waiting due to the ongoing IT issues. According to one of my Tom's Guide colleagues, while he was able to check his luggage that's as far as he got — and is simply "standing in a queue with lots of other people."
Screens at the airport show the Windows recovery message that the system failed to load properly. While the airport isn't really busy yet, due to the hour, it could be a very different story in a few hours' time. There are already lines forming at the American Airlines bag check.
And, unfortunately for travelers, there's no telling when operations will be back to normal.
CrowdStrike CEO issues statement
George Kurtz, CEO of CrowdStrike, has issued a statement to say his company is working with customers to restore systems.
Kurtz wrote: "CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts. Mac and Linux hosts are not impacted. This is not a security incident or cyberattack. The issue has been identified, isolated and a fix has been deployed.
"We refer customers to the support portal for the latest updates and will continue to provide complete and continuous updates on our website. We further recommend organizations ensure they’re communicating with CrowdStrike representatives through official channels. Our team is fully mobilized to ensure the security and stability of CrowdStrike customers."
CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts. Mac and Linux hosts are not impacted. This is not a security incident or cyberattack. The issue has been identified, isolated and a fix has been deployed. We… July 19, 2024
This one isn't easy to fix
We've had plenty of internet outages in recent years, but fixing this one will take a long time.
System administrators warn this won't be an easy problem to fix and will require a "human visit to every machine". Anonymous X account SwiftOnSecurity, run by a former helpdesk engineer, says fixing it will require technicians to take a USB stick to reboot every machine including those being used by remote workers.
It is likely companies will just send out new laptops to some employees as it will be quicker than trying to fix the existing ones. So, even after they get core services restored, the disruption could continue for some time.
Just to be clear, fixing this CrowdStrike issue will require basically a human visit to every machine. Some of the machines will not be able to get into the recovery environment, and require a USB stick boot. Centrally fixing this is not possible it happens before anything loads. July 19, 2024
AWS is also affected
The repercussions of the outage are spreading to other platforms with Amazon Web Services (AWS) also reporting issues.
"We continue to work on resolving the connectivity issues and reboots of Windows Instances, Windows Workspaces and Appstream Applications related to a recent update to the Crowdstrike agent (csagent.sys), which is resulting in a stop error (BSOD) within the Windows operating system," the company wrote.
The company recommends three different ways for customers to attempt to resolve the issue, including restoring EC2 instances from "a snapshot or image taken before 9:30 PM PDT".
However, it says that its own products remain stable, "AWS services and network connectivity continue to operate normally," the company said.
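To make the snapshot advice concrete, here is a rough Python sketch of picking the most recent backup taken before the cutoff. It works on plain `(snapshot_id, created_at)` pairs rather than calling any AWS API, the snapshot IDs are made up, and the July 18 date on the cutoff is our assumption (AWS's note only gives the time, 9:30 PM PDT).

```python
from datetime import datetime, timedelta, timezone

PDT = timezone(timedelta(hours=-7))
# Assumption: the quoted guidance gives only "9:30 PM PDT"; the date
# (July 18, 2024, the evening before the outage surfaced) is inferred.
CUTOFF = datetime(2024, 7, 18, 21, 30, tzinfo=PDT)


def latest_snapshot_before(snapshots, cutoff=CUTOFF):
    """Return the newest (snapshot_id, created_at) pair created before cutoff.

    `snapshots` is an iterable of (id, timezone-aware datetime) pairs.
    Returns None when nothing predates the cutoff.
    """
    eligible = [snap for snap in snapshots if snap[1] < cutoff]
    return max(eligible, key=lambda snap: snap[1], default=None)


if __name__ == "__main__":
    # Hypothetical snapshot records with UTC timestamps.
    demo = [
        ("snap-a", datetime(2024, 7, 18, 10, 0, tzinfo=timezone.utc)),
        ("snap-b", datetime(2024, 7, 19, 4, 0, tzinfo=timezone.utc)),
        ("snap-c", datetime(2024, 7, 19, 5, 0, tzinfo=timezone.utc)),
    ]
    print(latest_snapshot_before(demo))
```

Because the timestamps are timezone-aware, snapshots recorded in UTC compare correctly against a cutoff expressed in PDT.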
Global fitness firm F45 affected
Although not as critical as some of the other businesses impacted by today’s outage, booking systems for fitness centers — including global workout brand F45 — have been taken down too. According to a statement posted to Instagram by F45 Clapham Junction, the London-based studio plans to work around the booking system troubles by allowing anyone who wants to attend to drop in.
However, it doesn’t expect to be overwhelmed by demand as, before the outage, there were no waiting lists for any of today’s classes. But that probably won’t be the case at many of the brand’s over 2,000 studios, which are all independently-run franchises.
Here's the advice from James Frew, Fitness Editor here at Tom's Guide: “The problems we’ve seen today at F45 will affect many fitness centers and gyms, but if you can’t make it to your usual class, you do still have options. It’s not a like-for-like alternative, but the workout app Fiit offers free access to all of its virtual group classes, and you can even join with friends, so it’s a good option if you still want to train."
Reddit provides some insight
Thousands of system administrators have (predictably) flocked to Reddit to share woes of tackling the ongoing IT outage.
A highlight of the thread is one user stating: “Posting here to be part of this historic thread. The day that Crowdstrike took out the internet!”
The thread gives an indication of why this is such a big problem with another user talking about the need to restore thousands of devices and connections even after a fix is issued.
“I am sure even the most knowledgeable and resourceful hacking groups couldn't cause a disruption and damage of this magnitude,” a user wrote. “We have hundreds of Windows servers and thousands of Windows workstations affected by this.”
No-fly-zone
Airports around the world are taking no chances and are continuing to ground and delay flights while engineers try to recover their affected systems. Meanwhile, passengers are forming ever-longer lines waiting for a resolution.
The FAA listed a "communication issue" as a reason for stopping flights from Delta, United and American Airlines. Meanwhile, airports in New York, Berlin, London and Delhi are reporting delays but continue to say customers should arrive at their scheduled check-in time.
Amusingly, Delhi Airport in India has gone completely manual, writing out baggage tags and boarding passes.
"Biggest IT fail ever"
The scale of today's problem needs no introduction, but we're still a long way from finding out exactly how bad the long-term ramifications are.
SpaceX and X CEO Elon Musk called it the "biggest IT fail ever".
Biggest IT fail ever. July 19, 2024
Other business leaders say this is an important lesson in researching and vetting the cybersecurity solutions they employ.
"CrowdStrike’s platform approach, which relies on a single agent focused on detection, might seem good at first glance, but as we can see, it can create significant issues," said Al Lakhani, CEO of IDEE.
"For instance, agents require installation and maintenance of software on multiple different OSes, adding layers of complexity and potential points of failure. Moreover, agents can become a single point of failure, as a bad update can compromise the entire network, as seen with the SolarWinds attack."
Could AI have prevented this?
The single biggest trend across the tech industry over the last year and a half has been AI, and CrowdStrike is no exception. The company has several AI solutions in place, including a generative AI for cybersecurity called Charlotte.
CrowdStrike's sensor platform takes data from devices across a network and uses machine learning to identify threat activities. In this case an update to the sensor software seems to have taken some of the largest networks offline.
So will more AI involvement in the future stop this kind of thing from happening again? Here's what Ryan Morrison, Tom's Guide's AI Editor said: "While software bugs or bad code are nothing new, and can cause significant problems for a company they are becoming easier to spot before deployment.
"AI coding tools make testing and simulating different scenarios faster and cheaper, and it could be deployed to spot issues in a live environment before the code is too widely spread.
"If they don't already, I suspect CrowdStrike, Microsoft and others will be exploring ways to use AI to monitor for unexpected behaviour in tools like Falcon Sensor and any other update to code, flagging a takedown and stopping the update before too many machines are impacted.
"In this case though, it seems human involvement performed that task, with CrowdStrike spiking the update relatively quickly — just not fast enough."
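The idea of catching a bad update before it spreads too widely can be sketched as a staged rollout. To be clear, this is an illustration of the general technique, not a description of how CrowdStrike or Microsoft ship updates: `deploy` and `is_healthy` are hypothetical callbacks, and the wave sizes and 1% failure threshold are arbitrary assumptions.

```python
def staged_rollout(hosts, deploy, is_healthy,
                   wave_sizes=(10, 100, 1000), max_failure_rate=0.01):
    """Push an update in growing waves and halt if a wave looks unhealthy.

    `deploy(host)` applies the update; `is_healthy(host)` reports whether the
    host came back up afterwards. Both are hypothetical callbacks.
    Returns (updated_hosts, halted).
    """
    updated = []
    remaining = list(hosts)
    sizes = list(wave_sizes)
    while remaining:
        # Use the configured wave sizes first, then take everything left.
        size = max(1, sizes.pop(0)) if sizes else len(remaining)
        wave, remaining = remaining[:size], remaining[size:]
        for host in wave:
            deploy(host)
        updated.extend(wave)
        failures = sum(1 for host in wave if not is_healthy(host))
        if failures / len(wave) > max_failure_rate:
            return updated, True  # stop before the next, larger wave
    return updated, False
```

With an update that crashes every machine, this sketch stops after the first wave of 10 hosts instead of reaching all of them, which is the whole point of the approach.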
What is CrowdStrike?
The business at the epicenter of today's global outage is cybersecurity firm CrowdStrike, which produces security software for Windows servers. We've got a full explainer here on what the company is and what it does.
CrowdStrike proudly declares in its X profile, "We Stop Breaches." A faulty update sent to its Falcon Sensor software (specifically designed to prevent attacks on a machine) is what triggered the outage. It counts Microsoft and many of the big airlines among its clients.
MacOS and Linux unaffected by outage
Despite the worldwide problems caused by today's outage, not everyone will be affected. As confirmed by CrowdStrike's CEO, the issue was caused by a "single update for Windows hosts" — and therefore only affects Windows PCs. That means any company operating on Apple's macOS or, in fact, a Linux distribution like Ubuntu, won't have been caught out.
And while it's unlikely that organisations around the world will suddenly drop Microsoft's OS in favor of Apple's, today's events aren't a good look for the Windows brand.
"We're deeply sorry"
George Kurtz, co-founder and CEO of CrowdStrike, has apologised for the damage caused by today's outage during an interview with NBC News.
"We're deeply sorry for the impact that we've caused to customers, to travellers, to anyone affected by this, including our companies," Kurtz told the broadcaster.
"It could be some time for some systems that just automatically won't recover, but it is our mission... to make sure every customer is fully recovered."
Microsoft: "several reboots" may be required to fix
Although Microsoft was quick to point out that today's crash was caused by a "third-party", the company is obviously in damage control mode. Microsoft has been affected not only by the CrowdStrike issue but also by a separate problem affecting Azure, which took out the likes of Microsoft 365 apps.
According to the Azure status page, customers have told the company that repeatedly rebooting virtual machines can be an effective troubleshooting step.
"We have received reports of successful recovery from some customers attempting multiple Virtual Machine restart operations on affected Virtual Machines," the page states.
"We've received feedback from customers that several reboots (as many as 15 have been reported) may be required, but overall feedback is that reboots are an effective troubleshooting step at this stage."
So, if in doubt, turn it off and turn it on again. Fifteen times.
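If you'd rather script the "keep rebooting" advice than count on your fingers, the logic boils down to a bounded retry loop. The sketch below is only that loop: `reboot` and `is_healthy` are hypothetical callbacks you would have to wire up to your own tooling, and the cap of 15 simply mirrors the figure customers reported to Microsoft.

```python
def reboot_until_healthy(reboot, is_healthy, max_attempts=15):
    """Reboot up to max_attempts times, stopping as soon as the machine is healthy.

    `reboot()` and `is_healthy()` are hypothetical callbacks supplied by the
    caller. Returns the number of the attempt that succeeded, or None if the
    machine never recovered.
    """
    for attempt in range(1, max_attempts + 1):
        reboot()
        if is_healthy():
            return attempt
    return None
```

Capping the attempts matters: a machine that never recovers should be handed over for a manual fix rather than rebooted forever.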
How did this happen?
We've all heard of updates introducing bugs and issues to our gadgets, but it's exceptionally rare to see something of this scale. If you're not familiar with CrowdStrike, it's a big player in the cybersecurity field with an extensive list of clients. Add to that the fact Windows is still the most-used OS across the world and you can see where this is going.
Because new cyber threats are emerging all the time, products like the Falcon Sensor are given auto-update privileges across organisations. They need to be able to push new updates to PCs without having human oversight. Furthermore, they have broad-reaching control over machines in order to detect and mitigate risks. So if something goes wrong, it can effectively shut down the entire machine.
I don’t think it’s too early to call it: this will be the largest IT outage in history. July 19, 2024
Delta issues travel waiver for passengers
Delta says it has resumed some flight departures but delays and cancelations are inevitable following the impact to its global flight schedule.
The airline says the delays are likely to continue well into the day and has issued a travel waiver for all customers with booked flights departing today, Friday, July 19. The waiver lets passengers manage their own travel changes via delta.com and the Fly Delta app.
"The fare difference for customers will be waived when rebooked travel occurs on or before July 24, in the same cabin of service as originally booked," Delta said. "If travel is rebooked after July 24, any difference in fare between the original ticket and the new ticket will be collected at the time of booking."
How to boot Windows 10 to safe mode
If you're still using a Windows 10 machine and are having trouble with getting it to boot, one thing you'll want to know is how to boot into Windows 10 safe mode.
Safe mode is a basic state, which uses only a small set of files and drivers. It's an ideal way to get into your system and repair the problem when other methods have failed.
Here's how to do it:
- Open Settings from the Start Menu or by pressing Windows + I. Click the Windows logo in the corner of the taskbar, then look for the little cog symbol. Alternatively, press Windows + I to bring up the Settings Menu.
- Select Update & Security from the Settings Menu. If you can’t find Update & Security, there’s a handy search bar that you can use to locate it.
- Open the Recovery tab on the Update & Security Menu. You’ll find the Recovery tab in the left-hand column. If you’re struggling to locate it, make use of the search bar.
- Under Advanced startup, select Restart Now. Make sure you’ve saved anything you were working on beforehand though.
- Select Troubleshoot. After your device restarts, you’ll be faced with a ‘Choose an option’ menu, and Troubleshoot is the one you want.
- Select Advanced options.
- Select Startup Settings.
- Select Restart. This will restart your device again.
- Press F4 from the Startup Settings menu. After your Windows 10 device restarts, you’ll be faced with a numbered list of options, and you want number 4. This will boot your PC into safe mode. If you need networking capabilities in safe mode (i.e. the ability to connect to the internet), press F5 instead.
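There is also a command-line route to the same place, which can be handy if you manage several machines. It is an alternative to the menu steps above, not part of them: Windows' `bcdedit` tool can flag the next boot as safe mode. The Python sketch below only builds the commands; the helper names are our own, `{current}` refers to the running boot entry, and actually running the commands requires an administrator prompt on Windows. Remember to clear the flag afterwards, or the PC will keep booting into safe mode.

```python
import subprocess
import sys


def safe_mode_command(networking=False):
    """Build the bcdedit command that makes the next boot a safe mode boot."""
    mode = "network" if networking else "minimal"  # "network" is the F5 equivalent
    return ["bcdedit", "/set", "{current}", "safeboot", mode]


def clear_safe_mode_command():
    """Build the bcdedit command that returns the PC to normal booting."""
    return ["bcdedit", "/deletevalue", "{current}", "safeboot"]


def run(command):
    """Run a bcdedit command; only meaningful on Windows with admin rights."""
    if sys.platform != "win32":
        raise RuntimeError("bcdedit is only available on Windows")
    subprocess.run(command, check=True)
```

For example, `run(safe_mode_command())` followed by a restart boots into safe mode, and `run(clear_safe_mode_command())` undoes it once the repair is done.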
Major hospital halts surgeries
One of the biggest hospitals in the U.S., Mass General in Boston, has announced it is halting all non-urgent surgeries as a result of the outage.
"Due to the severity of this issue, all previously scheduled non-urgent surgeries, procedures, and medical visits are cancelled today", the hospital said in a statement posted to X.
A major worldwide software outage has affected many of our systems at Mass General Hospital, as well as many major businesses across the country. Due to the severity of this issue, all previously scheduled non-urgent surgeries, procedures, and medical visits are cancelled today. pic.twitter.com/AdZwhPNi2Y July 19, 2024
New York State Chief Cyber Officer statement
The Chief Cyber Officer for New York State, Colin Ahern, has put out a statement regarding the ongoing outage issues.
“We are aware of an issue affecting Windows computers running a third party security software tool that is impacting systems and services worldwide. It is not a security incident or cyberattack.
“We are working with our agencies, local governments, and the third party service provider to resolve any issues on impacted systems. Our priority is to ensure all 911 systems across New York are operational and able to address emergency response needs. The third party has identified a fix for the underlying issue and the New York State Office of Information Technology Services is actively working with other state agencies on a resolution. We do not yet have a timeline for full restoration.
“Governor Hochul is closely monitoring impacts to critical infrastructure, including finance and transportation. We recognize the impact this is having on services, not only across New York but also globally.”
How's this for irony?
As well as banks, airlines, media companies and hospitals, the high-octane world of Formula 1 has been brought to a standstill by the ongoing CrowdStrike chaos.
Engineers for the Mercedes F1 team (which boasts superstar Lewis Hamilton as its lead driver) have been scrambling to fix PCs broken by the update in preparation for Sunday's Hungarian Grand Prix. And one particularly poetic image has been doing the rounds on social media today.
It shows a pair of Mercedes team members staring at the Blue Screen of Death whilst wearing shirts emblazoned with the CrowdStrike logo. To quote Morpheus: "Fate, it seems, is not without a sense of irony."
Y2K for real?
Readers of a certain vintage may remember a lot of concern in the lead up to the turn of the millennium that the date change to the year 2000 would throw the world's IT infrastructure into a tailspin.
The dreaded "Y2K" never came to pass. But the references to that panic are coming thick and fast today.
This is basically what we were all worried about with Y2K, except it's actually happened this time ☠️ July 19, 2024
The Crowdstrike issue might be the largest IT outage in history. It's like Y2K, except it actually happened this time. Wild times! pic.twitter.com/cim15V1Do1 July 19, 2024
UPS warning over deliveries
UPS has stated there's a potential for delivery delays to occur as a result of today's outage.
In a statement posted on its website, the delivery firm said it was doing what it could to ensure shipments remained on track.
"While the UPS network is operating and delivering in all areas, there is a potential for delivery delays due to a global technology outage," the company wrote.
"Contingency plans are in place to help ensure that shipments arrive at their final destinations as quickly as possible."
Problems with Apple Pay?
Despite the CrowdStrike outage specifically affecting Windows PCs and not Apple hardware, that hasn't precluded Cupertino's services also being affected. It seems that taking Apple Pay payments isn't a viable option for shops right now, due to a reliance on Windows on the backend.
According to a report from AppleInsider, supermarkets are struggling to accept mobile payments from Apple Pay and other providers because their Windows-running terminals are, currently, out of order. The site rightly points out that we don't yet know how widespread this is or how many users are affected. But it goes to show the huge knock-on effects of one error on our interconnected technological infrastructure.
I guess it's back to cold, hard cash for the time being.
"Not a security or cyber incident"
CrowdStrike CEO George Kurtz has posted a second statement on X explaining that he understands the "gravity of the situation". However, he maintains that the events of today were not the result of a "security or cyber incident".
"Today was not a security or cyber incident. Our customers remain fully protected," he wrote.
"We understand the gravity of the situation and are deeply sorry for the inconvenience and disruption. We are working with all impacted customers to ensure that systems are back up and they can deliver the services their customers are counting on. As noted earlier, the issue has been identified and a fix has been deployed. There was an issue with a Falcon content update for Windows Hosts."
So there you have it — no malicious actors behind the catastrophic events of today, it was simply an IT blunder. And CrowdStrike's stock price is certainly feeling the effects. The price has plummeted today and, at time of writing (12.11pm ET), is down 9% — admittedly not as bad as it was earlier in the day.
Today was not a security or cyber incident. Our customers remain fully protected. We understand the gravity of the situation and are deeply sorry for the inconvenience and disruption. We are working with all impacted customers to ensure that systems are back up and they can… July 19, 2024
FedEx and UPS having service disruptions
Delivery services like FedEx and UPS rely heavily on all sorts of infrastructure to connect packages with their destinations. The CrowdStrike problem is causing some issues for both delivery companies.
FedEx's status page says it's dealing with "Active service disruptions." The company wasn't shy about pointing out what's causing the disruptions, citing a "global IT outage experienced by a third-party software vendor."
UPS is having similar problems, with its Service Alerts page saying, "A third-party software outage is impacting some UPS computer systems. While the UPS network is operating and delivering in all areas, there is a potential for delivery delays. Contingency plans are in place to help ensure that shipments arrive at their final destinations as quickly as possible."
Of course, FedEx isn't going to sit there and do nothing, and it says it has "activated contingency plans to mitigate impacts," much like UPS said in its statement above. Despite its best efforts, FedEx says, "potential delays are possible for package deliveries with a commitment of July 19, 2024." If you're expecting something important today, be prepared for the possibility that it won't arrive on time.
You can use FedEx's tracking system to see where your package is and if it will be delayed. UPS notes that its "UPS Service Guarantee does not apply to shipments affected by this event."
Microsoft's Satya Nadella responds to CrowdStrike situation
Yesterday, CrowdStrike released an update that began impacting IT systems globally. We are aware of this issue and are working closely with CrowdStrike and across the industry to provide customers technical guidance and support to safely bring their systems back online. July 19, 2024
After a long day of problems with no end in sight, Satya Nadella, Chairman and CEO at Microsoft, took to X to share his thoughts on the situation. While it's great that he addressed the problem, his post doesn't offer much in the way of new information.
He shared what we all know: "Yesterday, CrowdStrike released an update that began impacting IT systems globally."
As far as what Microsoft will do, the post is pretty vague. "We are aware of this issue and are working closely with CrowdStrike and across the industry to provide customers technical guidance and support to safely bring their systems back online," said Nadella.
The entire internet is holding its breath, waiting for something to be resolved. Nadella's post does little to ease the stress of the situation, but at least Microsoft knows what's happening and is on it.
Responding to Nadella, Elon Musk pointed out issues with the automotive industry in his own X post. He said, "This gave a seizure to the automotive supply chain," but didn't elaborate on what, specifically, is happening.
TechRadar's Lance Ulanoff on CNN
A bit of me on @CNN this morning talking about the #CloudStrike outage pic.twitter.com/0tckiXxxuj July 19, 2024
TechRadar's Editor-at-Large, Lance Ulanoff, joined CNN to discuss the issue and how it's affecting airlines and other companies. You can see a snippet of his appearance in the X post above.
During the appearance, Ulanoff discussed what CrowdStrike is, the wide range of organizations affected, and how different sectors are dealing with the outage.
The video is just under four minutes long, and it'll give you a great recap of what's happening so far if you've been out of the loop. Whether you plan on flying or just want to pay for stuff with your smartphone, this outage will probably touch your life somehow.
The lighter side of the CrowdStrike outage
First day at Crowdstrike, pushed a little update and taking the afternoon off ✌️ pic.twitter.com/bOs4qAKwu0 July 19, 2024
This is a bad thing that happened to CrowdStrike and, by extension, Microsoft and tons of other companies. The outage adds a lot of extra work for people and wastes a tremendous amount of time.
But that's not stopping social media from going off with some pretty funny commentary on the CrowdStrike outage. I'm particularly fond of Vincent Flibustier's X post, which is embedded above. The post implied he was responsible for the update on his first day. It's satire and pretty well done. He followed up the initial post, saying he was fired. Sure, it's making light of someone's terrible day, but it's funny.
The real story behind Windows outage 🤣#Crowdstrike pic.twitter.com/ceb7v6nqxL July 19, 2024
Another X user, this time It's FOSS, posted a video claiming to show what happened at CrowdStrike (it's not what happened at all), but it sure is hilarious.
As a further reminder of how much work this is going to make for people, X user Trung Phan posted a video of a sad guy walking down a hallway with the blurb, "Every IT worker walking into work this Friday knowing that the global Crowdstrike BSOD global IT meltdown means they’ll have to cancel all weekend plans and work non-stop for the next 72 hours." It sounds terrible and completely relatable, with me managing the live blog related to the outage.
There's a wealth of great content on X related to CrowdStrike, Microsoft, and this situation. This one from Pooja Bishnoi springs to mind. And as much as I'd love to post funny videos all day, there's actual reporting to be done, and I must return to it.
All of CrowdStrike continues to work closely with impacted customers and partners to ensure that all systems are restored. I’m sharing the letter I sent to CrowdStrike’s customers and partners. As this incident is resolved, you have my commitment to provide full transparency on… July 19, 2024
CrowdStrike's George Kurtz posts blog
While we've heard from George Kurtz on X regarding what happened with the outage, the statement was pretty brief. The CEO has taken to the company's blog for a more detailed explanation, though it's a lot of standard corporate speak.
In fact, much of what was already reported is confirmed through the blog post, but it's good to hear it directly.
He started with an apology, as you might expect. "I want to sincerely apologize directly to all of you for today’s outage. All of CrowdStrike understands the gravity and impact of the situation. We quickly identified the issue and deployed a fix, allowing us to focus diligently on restoring customer systems as our highest priority," said Kurtz in the post.
He reiterated that this wasn't a cyber attack and that Linux and Mac hosts weren't impacted. Kurtz discussed what the firm plans to do: "We are working closely with impacted customers and partners to ensure that all systems are restored, so you can deliver the services your customers rely on."
It sounds like the whole company is on it. He said, "We have mobilized all of CrowdStrike to help you and your teams."
As far as what's happening in the future, Kurtz said, "We know that adversaries and bad actors will try to exploit events like this. I encourage everyone to remain vigilant and ensure that you’re engaging with official CrowdStrike representatives. Our blog and technical support will continue to be the official channels for the latest updates."
Unsurprisingly, CrowdStrike really wants to keep its customers going forward. "You have my commitment to provide full transparency on how this occurred and steps we’re taking to prevent anything like this from happening again," Kurtz said to round out the blog post.
Most average internet users probably didn't know what CrowdStrike was before today, so this massive issue could become the only thing people know about the company. That could be terrible for the company's reputation, so it makes sense for the CEO to try to smooth this over as much as he can, even if it doesn't undo what happened today.
Things are getting back to normal at Union Pacific Railroad
We're getting well into the afternoon/evening here in the U.S., and it sounds like at least some companies affected by the CrowdStrike outage are getting back on track.
Case in point: Union Pacific Railroad representatives have told CNBC that the "vast majority" of the railroad's freight engines are up and running.
“The vast majority of our customers’ freight is moving and full fluidity is returning to our network after this morning’s CrowdStrike software outage,” a railroad representative told CNBC. “In response to the outage our teams swiftly implemented protocols and communication plans, which allowed us to safely keep our trains running.”
However, companies and businesses around the world are still dealing with the after-effects of this global outage.
Border crossing between U.S. and Mexico impacted by outage
Sounds like folks attempting to cross the border between the U.S. and Mexico have run into unexpected delays because U.S. Customs and Border Protection is operating at reduced capacity due to the CrowdStrike outage.
According to a post published by the U.S. CBP Twitter account a few hours ago, the organization is working to remedy this but has not given an estimated timetable for when that will happen.
On July 19, U.S. Customs and Border Protection (CBP) is experiencing processing delays due to the global technology outage. We will continue our work to restore our systems to full capacity and provide updates as they become available. pic.twitter.com/mPjMdByNjp July 19, 2024
CrowdStrike releases Falcon fix blog
Late in the day on Friday, CrowdStrike released a post with tips and IT suggestions to help resolve the issue.
It's an attempt to get agencies and businesses back online by reverting the CrowdStrike Falcon platform to an earlier version, one before the update that caused all the crashes.
If you didn't know, Falcon is the company's core product suite. It combines antivirus, threat detection, hack prevention, cloud protection, ID protection and other features. It's like a souped-up McAfee or Bitdefender, but for large-scale operations.
The recovery instructions were partly written with the Claude 3.5 AI model, which might be the first time I've seen that in a communique from a company.
Check out their post if you're curious about how the fixes work and how to implement them.
For the most part, it appears that things are coming back online but the backlog of delays created by the crash is still being worked through across the globe.
Technical details of the CrowdStrike outage
As CrowdStrike continues to work with customers and partners to resolve this incident, our team has written a technical overview of today’s events. We will continue to update our findings as the investigation progresses. https://t.co/xIDlV7yKVh July 20, 2024
Following the blog article CrowdStrike made earlier explaining that it's sorry for what happened and that it's working with partners, the company put out another piece breaking down the technical details of what happened.
CrowdStrike's CEO shared the article on X and said, "As CrowdStrike continues to work with customers and partners to resolve this incident, our team has written a technical overview of today’s events. We will continue to update our findings as the investigation progresses."
It's been well-covered that a software update was the root cause of the blue screen of death issues and subsequent outages, but this blog post gets into the details in a way we haven't seen yet.
"On July 19, 2024 at 04:09 UTC, as part of ongoing operations, CrowdStrike released a sensor configuration update to Windows systems. Sensor configuration updates are an ongoing part of the protection mechanisms of the Falcon platform. This configuration update triggered a logic error resulting in a system crash and blue screen (BSOD) on impacted systems," reads the intro of the post.
While the issue has been resolved on CrowdStrike's end, it could take some time for the impacted companies to get everything back online and working again. Thankfully, for those that need a little extra help, the company said, "customers may have specific support needs and we ask them to contact us directly."
Air travel is still a mess
While the actual outage is technically over, the aftermath will be felt for a long time as everyone tries to recover. Airlines were some of the most impacted by the outage, and they're still trying to get customers on flights to their destinations.
According to a report from Sky News, the Port of Dover is dealing with "hundreds of displaced" passengers. Reports suggest long delays for passengers and lost baggage as airlines scramble to get back on track.
Some experts have warned that it could take weeks for systems to fully recover from the global IT outage, which means anyone with a flight scheduled may want to leave some extra time and be prepared for possible delays, lost luggage or any of the other issues that have been reported.
An ABC News report suggests that airlines now have many of their planes and crews in the wrong places, which makes for a logistical nightmare.
One thing is clear: this will be a painful experience for everyone involved. Whether you work for an airline or want to travel somewhere, it will be more difficult than usual.
Microsoft details how many PCs were affected by CrowdStrike's outage
We've talked about the wide-reaching impact of the CrowdStrike outage. Despite how many systems it affected, the actual number of PCs receiving errors was relatively low.
According to Microsoft, 8.5 million Windows PCs worldwide were affected by the issue. And while that sounds like a considerable number, it's actually less than 1% of all Windows machines.
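As a quick sanity check on that percentage: if 8.5 million devices really are less than 1% of all Windows machines, there must be more than 850 million Windows machines in total. Here's a minimal Python sketch of that arithmetic. It uses only the two figures Microsoft gave, and the function name is ours, made up for illustration.

```python
def implied_minimum_total(affected: int, share_percent: float) -> float:
    """Return the total at which `affected` would be exactly `share_percent` of it.

    If the real share is *less* than `share_percent`, the real total must be
    *larger* than the value returned here.
    """
    return affected * 100 / share_percent


affected_devices = 8_500_000  # Microsoft's estimate of affected devices
share_percent = 1             # "less than one percent"

minimum_total = implied_minimum_total(affected_devices, share_percent)
print(f"Implied total: more than {minimum_total:,.0f} Windows machines")
```

In other words, the outage looks small as a percentage only because the number of Windows machines in use is so large.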
"We currently estimate that CrowdStrike’s update affected 8.5 million Windows devices, or less than one percent of all Windows machines," wrote David Weston, Microsoft's Vice President, Enterprise and OS Security.
However, the computers that used CrowdStrike and had issues were often part of critical systems like airlines, 911 operators, mass transit, banking and health services. Being such critical systems means that the issue impacts more people than just those using the PCs.
Sure, only the airline employees have their hands on the crashing computers, but their inability to do their job means no one can fly. If 911 operators can't access their computers, lives can't be saved.
If yours was one of the 8.5 million PCs with 0x50 or 0x7E error codes resulting in the Blue Screen of Death (BSOD), Microsoft has a handy guide that'll show you how to fix the problem and get your system back online.
Airline delays and cancelations are still happening
Even though the CrowdStrike issue is resolved, airlines are still backed up and trying to recover. According to FlightAware (via CNN), about 5,400 flights in the U.S. were canceled Friday and Saturday. Of flights that got off the ground, more than 21,300 flights were delayed between the two days.
The worldwide numbers show that 2,869 flights were canceled and 34,926 were delayed. Considering the number of people on each flight, that means a lot of people are sitting around in airports, waiting for their flight and hoping to get to their destination.
That's already continuing today as airlines attempt to recover from the downtime. So far, 880 flights within, into or out of the U.S. have been canceled, and 992 have been delayed. Worldwide, 1,274 flights have been canceled, and 12,408 flights have been delayed.
We'll continue to monitor the airlines to see if things improve, but based on the number of flights already canceled this early in the day, it's not looking promising.
CrowdStrike offers guidance
In response to the widespread issues, CrowdStrike has published a Remediation and Guidance Hub designed to help people get their systems back online if they suffer from a BSOD error on their machine.
"CrowdStrike is actively assisting customers affected by a defect in a recent content update for Windows hosts. Mac and Linux hosts were not impacted. The issue has been identified and isolated, and a fix has been deployed. This was not a cyberattack," reads the page.
It's a compilation of the blog posts and technical details page, all combined in one easy-to-navigate page. You can scroll through the individual pages or click the quick links to jump to the section you need.
These are the sections you can find:
- Statement From Our CEO
- Technical Details
- How Do I Identify Impacted Hosts?
- How Do I Remediate Impacted Hosts?
- How Do I Recover Bitlocker Keys?
- How Do I Recover Cloud–Based Environments?
- Third Party Vendor Information
- Additional Resources
Hopefully, having the first-hand info from CrowdStrike in one place will help people get the information they need to solve the problem.
Microsoft has released a new tool designed for IT admins to get machines up and running again after CrowdStrike’s faulty update. Workers in IT departments worldwide are probably breathing a huge sigh of relief thanks to this more straightforward repair method.
Microsoft's new tool creates a bootable USB drive that admins can use to help recover impacted machines without too much hassle.
All kinds of fixes have been reported, including restarting the impacted machine multiple times. However, having this bootable drive is a surefire way to recover a machine that has crashed due to the faulty CrowdStrike update.
This tool doesn't require booting into Safe Mode or having an admin account. It's just accessing the disk without booting into the local copy of Windows. From there, it will prompt for the BitLocker recovery key and then continue to fix the CrowdStrike update.
It sounds easy enough that even people not in a company's IT department could handle it. Still, given how critical the systems knocked offline by this issue are, it's probably better to leave it to the professionals.
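For a sense of what the fix involves under the hood: CrowdStrike's public guidance described the manual workaround as deleting any file matching "C-00000291*.sys" from the C:\Windows\System32\drivers\CrowdStrike folder. The Python sketch below illustrates only that file-matching step. It is not Microsoft's tool, the function name is our own, and the file names in the demo are made up. It runs against a throwaway temporary folder rather than a real Windows install, and it only reports matches unless you explicitly turn off the dry run.

```python
import tempfile
from pathlib import Path

# File pattern named in CrowdStrike's public remediation guidance.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"


def remove_faulty_channel_files(directory: Path, dry_run: bool = True) -> list[Path]:
    """Find (and, if dry_run is False, delete) files matching the pattern."""
    matches = sorted(directory.glob(CHANNEL_FILE_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


# Demonstration in a temporary folder standing in for the CrowdStrike directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "C-00000291-00000000-00000032.sys").touch()  # made-up name, matches
    (demo_dir / "C-00000290-00000000-00000001.sys").touch()  # made-up name, must stay

    removed = remove_faulty_channel_files(demo_dir, dry_run=False)
    remaining = sorted(p.name for p in demo_dir.iterdir())

    print("removed:", [p.name for p in removed])
    print("kept:", remaining)
```

On a real machine, the hard part isn't the deletion but getting to the disk at all, which is why the recovery key prompt and the bootable drive matter.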
Microsoft: 8.5 million Windows devices affected
CrowdStrike's update on Friday caused worldwide chaos and crashed an estimated 8.5 million Windows devices, according to Microsoft.
The company released a statement on Saturday explaining how it was working to help affected customers recover their systems quickly. But the company was also careful to point out that this was "not a Microsoft incident."
"We’re working around the clock and providing ongoing updates and support. Additionally, CrowdStrike has helped us develop a scalable solution that will help Microsoft’s Azure infrastructure accelerate a fix for CrowdStrike’s faulty update," the company wrote.
"While software updates may occasionally cause disturbances, significant incidents like the CrowdStrike event are infrequent. We currently estimate that CrowdStrike’s update affected 8.5 million Windows devices, or less than one percent of all Windows machines. While the percentage was small, the broad economic and societal impacts reflect the use of CrowdStrike by enterprises that run many critical services."
Microsoft has issued another free tool meant to help people affected by the CrowdStrike update that caused an IT nightmare for folks over the weekend. As Forbes reports, the tool in question is for IT administrators and exists to help them recover from the BSOD (blue screen of death) boot loop.
If you’re a regular person who doesn’t work in IT, then this tool (found here) likely won’t be useful as you’ve either not been affected by the CrowdStrike update or are in no position to effect change (especially if you’re stuck at an airport because of what’s happened). Still, the fact Microsoft is actively trying to help folks in IT fix this mess is a good thing.
CrowdStrike CEO called to testify
CrowdStrike CEO and co-founder George Kurtz has been called to testify in front of the House Homeland Security Committee about the events of the last few days.
“Recognizing that Americans will undoubtedly feel the lasting, real-world consequences of this incident, they deserve to know in detail how this incident happened and the mitigation steps CrowdStrike is taking,” wrote Homeland Security Chair Mark Green and Cybersecurity and Infrastructure Protection Subcommittee Chair Andrew Garbarino in a letter to CrowdStrike dated July 22.
The letter asks that Kurtz schedule a hearing with the subcommittee by the end of day Wednesday, as CrowdStrike continues to help with the cleanup operation. CrowdStrike spokesperson Kevin Benacci said in a statement the company “is actively in contact with relevant Congressional Committees.”
Asking AI to explain what happened
As you can tell from the posts on this blog, there's been a lot of information to get a handle on since this crisis began. But can artificial intelligence help make sense of it? Or, even, help suggest ways to mitigate such an event happening in the future?
Using AI chatbots to make sense of live (or very recent) news events is a tricky proposition because you need a model with access to the internet and to be aware of the very real dangers of bias creeping into the answers.
Still, my colleague Ryan Morrison attempted this task by asking Google Gemini and Microsoft Copilot a series of targeted questions about the CrowdStrike outage. And you can read about how the two AI platforms compared right here.
Delta still suffering
While many U.S. airlines have started to find their feet following the outage, Delta is still suffering. The operator has canceled or delayed more flights than any other airline and is now facing a federal investigation from the U.S. Department of Transportation's Office of Aviation Consumer Protection.
The company says it is co-operating with officials, but CEO Ed Bastian has also admitted it will take a few more days to straighten out.
"The issue impacted the Microsoft Windows operating system. Delta has a significant number of applications that use that system, and in particular one of our crew tracking-related tools was affected and unable to effectively process the unprecedented number of changes triggered by the system shutdown," Bastian wrote in a statement this week.
Comment from the forums
MikeBY: I don't understand how an update ends up deployed to production in so many major customers without being tested first in a representative test group by these customers.
Cloudstrike itself recommends running production no newer than N-1 (one iteration back) and safer N-2.
You then pilot a small representative group on N-1 and an IT test group runs on N.
THIS IS SOP for Cloudstrike.
What happened? Did they break their own protocols?
Did they find a vulnerability so severe that it demanded violating all risk management protocols?
Explain how is it possible to break your product so severely and bypass all these steps that are in place to protect production?
This should never happen.
There is more to this story that needs to be told.
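The staged rollout described in the comment above (an IT test group on the newest version N, a pilot group on N-1, and production on N-1 or the safer N-2) can be sketched in a few lines of Python. This is an illustrative sketch only: the ring names, the function and the version numbers are all made up, and it is not a description of CrowdStrike's actual update mechanism.

```python
def assign_versions(releases: list[str], production_lag: int = 2) -> dict[str, str]:
    """Map each deployment ring to a version.

    `releases` is ordered oldest to newest, so releases[-1] is version N.
    `production_lag` is how far production trails N: 1 for N-1, 2 for N-2.
    """
    if production_lag not in (1, 2):
        raise ValueError("production_lag must be 1 (N-1) or 2 (N-2)")
    if len(releases) <= production_lag:
        raise ValueError("not enough releases to apply the requested lag")
    return {
        "it_test": releases[-1],                      # newest version, N
        "pilot": releases[-2],                        # one back, N-1
        "production": releases[-1 - production_lag],  # N-1 or N-2
    }


# Hypothetical version numbers, oldest first.
releases = ["7.13", "7.14", "7.15", "7.16"]
plan = assign_versions(releases, production_lag=2)
print(plan)
```

The point of a policy like this is that a bad release should reach a small test group first and most machines last.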