The internet is now an essential household utility for many Americans, maybe even on the same footing as running water.
It turns out the internet’s pipes can spring a leak, too.
In the past three weeks, two major outages at Amazon’s cloud computing services have led to widespread disruptions at other online services. In November, a problem at Comcast, one of the largest internet service providers in the U.S., led to widespread outages. (Comcast owns NBC News.) And in June, websites around the world were temporarily knocked offline when Fastly, a cloud computing service provider, dealt with “service configuration” issues.
The steady drumbeat of issues underscores that the internet, despite all it’s capable of, is sometimes fragile.
“It’s expected to be like your power or your water, and they sometimes go down,” said Steve Moore, the chief security strategist at the cybersecurity firm Exabeam.
The latest disruption occurred Wednesday, when customers of websites including DoorDash and Hulu complained they couldn’t connect. The problems were traced to Amazon Web Services, the most widely used cloud services company, which reported that outages in two of its 26 geographic regions were affecting services nationwide.
A similar disruption took place Dec. 7, crippling video streams, halting internet-connected robot vacuum cleaners and even shutting down pet food dispensers in a series of reminders of how much life has moved online, especially during the pandemic. AWS published an unusually detailed description of what went wrong, along with an apology.
The incidents helped to explode the illusion, reinforced by decades of steadily improving internet speed and reliability, that everyday consumers can rely on online services to be available without fail.
It used to be that online video meant watching “a low-res video for five minutes,” said Robert Blumofe, executive vice president and chief technology officer at Akamai Technologies. The company sells security services as well as “edge computing” capabilities, a kind of distributed technology that doesn’t rely as much on centralized data centers.
“Now, there’s a very strong expectation that you could watch an entire movie in high-res,” Blumofe said. “There’s a recency bias. We remember the immediate and the now more than we remember the way things were in the past,” when outages were frequent, he said.
In other words, some Americans who enjoy reliable internet access may have become a little spoiled.
Experts in computer science and security said that the interruptions don’t really call into question the fundamental design of the internet, one of whose founding ideas was that a distributed system can mostly keep functioning even if one piece goes down.
But they said the problems are rooted in the uneven development of the internet, as certain data centers are more important than others, cloud businesses run by Amazon, Google and Microsoft are concentrating more power, and corporate customers of cloud services don’t always want to pay extra for backup systems and staff.
Sean O’Brien, a lecturer in cybersecurity at Yale Law School, said the outages call into question the wisdom of relying so much on big data centers.
“‘The cloud’ has never been sustainable, and is merely a euphemism for concentrated network resources controlled by a centralized entity,” he said, adding that alternatives like peer-to-peer technology and edge computing may gain favor. He wrote after last week’s outage that the big cloud providers amounted to a “feudal” system.
Cloud service providers make money by selling server space to other businesses on flexible terms and with specialized expertise, reducing the need for companies to manage their own servers. And although they rarely fail, they get attention when they do. Another AWS outage in November 2020 affected clients such as Apple.
“There are many points of failure whose unavailability or sub-optimal operation would affect the entire global experience of the internet,” said Vahid Behzadan, an assistant professor in computer science at the University of New Haven.
Some of those points of failure — such as the AWS “us-east-1” region — have become notorious among tech workers who share their experiences with outages on industry message boards.
“The fact that we’ve had repeated outages in a short period of time is a cause for alarm,” Behzadan said, noting that U.S. businesses have staked a lot on the assumption that cloud services are resilient.
But he also said that if outages become more common or publicly visible, corporate clients are likely to respond by spending more on backup systems to ensure they’re resilient in case of a breakdown, for example by having contracts with both Google and Amazon. There’s now a rekindled industry debate on whether to go “multicloud,” CNBC reported, and companies across sectors are increasing their spending on edge computing tools.
“The internet will not die any time soon. But whatever won’t kill the internet makes it stronger,” Behzadan said.
Moore, from Exabeam, said that the tightening labor market nationwide might also be having an effect on cloud services and internet reliability, as any increase in churn reduces the experience level of the people in charge.
“We’re coming off unprecedented times where people are incredibly stressed and the expectations for cloud infrastructure have been higher than ever,” he said. “Organizations are playing catch-up.”