Gone in Minutes, Out for Hours: Outage Shakes Facebook
When apps used by billions of people worldwide blinked out, lives were disrupted, businesses were cut off from customers — and some Facebook employees were locked out of their offices.
SAN FRANCISCO — Facebook and its family of apps, including Instagram and WhatsApp, were inaccessible for hours on Monday, taking out a vital communications platform used by billions and showcasing just how dependent the world has become on a company that is under intense scrutiny.
Facebook’s apps — which include Facebook, Instagram, WhatsApp, Messenger and Oculus — began displaying error messages around 11:40 a.m. Eastern time, users reported. Within minutes, Facebook had disappeared from the internet. The outage lasted over five hours, before some apps slowly flickered back to life, though the company cautioned the services would take time to stabilize.
Even so, the impact was far-reaching and severe. Facebook has built itself into a linchpin platform with messaging, livestreaming, virtual reality and many other digital services. In some countries, like Myanmar and India, Facebook is synonymous with the internet. More than 3.5 billion people around the world use Facebook, Instagram, Messenger and WhatsApp to communicate with friends and family, distribute political messaging, and expand their businesses through advertising and outreach.
Facebook is used to sign in to many other apps and services, leading to unexpected domino effects such as people not being able to log into shopping websites or sign into their smart TVs, thermostats and other internet-connected devices.
Technology outages are not uncommon, but to have so many apps go dark from the world’s largest social media company at the same time was highly unusual. Facebook’s last significant outage was in 2019, when a technical error affected its sites for 24 hours, in a reminder that a snafu can cripple even the most powerful internet companies.
This time, the cause of the outage remained unclear. It was unlikely that a cyberattack was the culprit because a hack generally does not affect so many apps at once, said two members of Facebook’s security team, who spoke on the condition of anonymity. Security experts said the outage most likely stemmed instead from a problem with Facebook’s server computers, which were not letting people connect to its sites like Instagram and WhatsApp.
Facebook eventually restored service after a team got access to the server computers at a data center in Santa Clara, Calif., three people with knowledge of the matter said. Then they were able to reset them.
The company apologized for the outage. “We’re sorry,” it said on Twitter after its apps started becoming accessible again. “Thank you for bearing with us.”
The outage added to Facebook’s mounting difficulties. For weeks, the company has been under fire over disclosures by a whistle-blower, Frances Haugen, a former Facebook product manager who amassed thousands of pages of internal research. She has since distributed the cache to the news media, lawmakers and regulators, revealing that Facebook knew of many harms that its services were causing, including that Instagram made teenage girls feel worse about themselves.
The revelations have prompted an outcry among regulators, lawmakers and the public. Ms. Haugen, who revealed her identity on Sunday online and on “60 Minutes,” is scheduled to testify on Tuesday in Congress about Facebook’s impact on young users.
“Today’s outage brought our reliance on Facebook — and its properties like WhatsApp and Instagram — into sharp relief,” said Brooke Erin Duffy, a professor of communications at Cornell University. “The abruptness of today’s outage highlights the staggering level of precarity that structures our increasingly digitally mediated work economy.”
When the outage began on Monday morning, Facebook and Instagram users quickly used Twitter to lament and poke fun at their inability to use the apps. The hashtag #facebookdown also started trending. Memes about the incident proliferated.
But a real toll soon emerged, because many people worldwide rely on the apps to conduct their daily lives.
“With Facebook being down we’re losing thousands in sales,” said Mark Donnelly, a start-up founder in Ireland who runs HUH Clothing, a fashion brand focused on mental health that uses Facebook and Instagram to reach customers. “It may not sound like a lot to others, but missing out on four or five hours of sales could be the difference between paying the electricity bill or rent for the month.”
Samir Munir, who owns a food-delivery service in Delhi, said he was unable to reach clients or fulfill orders because he runs the business through his Facebook page and takes orders via WhatsApp.
“Everything is down, my whole business is down,” he said.
Douglas Veney, a gamer in Cleveland who goes by GoodGameBro and who is paid by viewers and subscribers on Facebook Gaming, said, “It’s hard when your primary platform for income for a lot of people goes down.” He called the situation “scary.”
Inside Facebook, workers also scrambled because their internal systems stopped functioning. The company’s global security team “was notified of a system outage affecting all Facebook internal systems and tools,” according to an internal memo sent to employees and shared with The New York Times. Those tools included security systems, an internal calendar and scheduling tools, the memo said.
Employees said they had trouble making calls from work-issued cellphones and receiving emails from people outside the company. Facebook’s internal communications platform, Workplace, was also taken out, leaving many unable to do their jobs. Some turned to other platforms to communicate, including LinkedIn and Zoom as well as Discord chat rooms.
Some Facebook employees who had returned to working in the office were also unable to enter buildings and conference rooms because their digital badges stopped working. Security engineers said they were hampered from assessing the outage because they could not get to server areas.
Facebook’s global security operations center determined the outage was “a HIGH risk to the People, MODERATE risk to Assets and a HIGH risk to the Reputation of Facebook,” the company memo said.
A small team of employees was soon dispatched to Facebook’s Santa Clara data center to try a “manual reset” of the company’s servers, according to an internal memo.
Several Facebook workers called the outage the equivalent of a “snow day,” a sentiment that was publicly echoed by Adam Mosseri, the head of Instagram.
In Facebook’s early days, the site experienced occasional outages as millions of new users flocked to the network. Over the years, it spent billions of dollars to build out its infrastructure and services, spinning up enormous data centers in cities including Prineville, Ore., and Fort Worth.
The company has also been trying to integrate the underlying technical infrastructure of Facebook, WhatsApp and Instagram for several years.
John Graham-Cumming, the chief technology officer of Cloudflare, a web infrastructure company, said in an interview that Monday’s problem was most likely a misconfiguration of Facebook’s servers.
Computers convert website names such as facebook.com into numeric internet protocol addresses through the Domain Name System, which is often likened to a phone’s address book. Facebook’s issue was the equivalent of removing people’s phone numbers from under their names in their address book, making it impossible to call them, he said. Because Cloudflare directs traffic to Facebook, it became aware of the outage early on and saw the incident’s scope.
“It was as if Facebook just said, ‘Goodbye, we’re leaving now,’” Mr. Graham-Cumming said.
By Mike Isaac and Sheera Frenkel
Source: The New York Times