If you are part of the majority of people who rely on Google services for things like search, email, online meetings, file management, and more, you know about Google’s outage last Monday, the 14th of December 2020. No, it was not your computer or internet connection: millions of people (if not billions) suffered the consequences of Google’s global downtime.
Normally, when a service we work with is down, we tend to overlook it. It might be frustrating, but it’s no cause for panic. However, when it came to Google, even though the outage lasted only 45 minutes, alarms sounded worldwide, reminding us of our dependence on its digital services.
It is also a stark reminder that a large part of the population, especially during a pandemic, is teleworking and needs Google and all of its services and platforms more than ever before.
So what exactly happened – what was the cause of it all?
Google Downtime Causes a Global Outrage
When a company like Google goes down, it causes a “global crisis”. Although the outage was global, it seemed to have a greater impact in Europe, especially in the United Kingdom and the Netherlands, but also in Germany, Spain, Portugal, Hungary, and Poland. The other countries notably affected were the United States (mainly the East Coast) and India.
This is not the first outage Google has faced this year. Back in August, there was a problem with Gmail video calls and attachments that lasted for seven hours. But it wasn’t as noticeable as this one. This time, despite its brevity, the outage took down all the major services we constantly use for about 45 minutes.
The tech giant recovered all its servers in less than an hour. As much as the outage affected the lives of many people, it was also an example of our lack of patience with technology and our dependence on absolute immediacy.
What services were affected by Google’s outage?
The services that reported incidents were Gmail, Blogger, Drive, Google Duo, Meet, YouTube, Hangouts, Maps, Calendar, Classroom, and Google Play – the official application download platform for Android devices.
As is the norm nowadays, social networks filled with messages from confused and angry users criticizing the service, since many of them saw their work affected. The word “Google” became a trending topic on Twitter within minutes, as users shared their complaints and the problems caused by the crash.
What really happened to Google?
The official explanation of why this happened is that an authentication system had an outage for no more than 45 minutes due to an internal storage quota issue. This means that if you tried logging into your Google account to use any of their services, you wouldn’t be able to.
This is also the reason why the search engine, and any service that can be used without signing in (for example, in an incognito window), still worked just fine. In fact, even YouTube worked if accessed as an anonymous user. This raises many questions, as Google is blaming the outage on an internal storage quota problem. Did Google just run out of space on its servers? Google can’t just run out of space… Can it?
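To make the difference between signed-in and anonymous access concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not a description of Google’s actual systems: the `AuthService` class, the `watch_video` function, and the token and user name are all made up for this example.

```python
class AuthServiceDown(Exception):
    """Raised when the (hypothetical) central authentication service is unavailable."""


class AuthService:
    """Toy stand-in for a central authentication backend."""

    def __init__(self):
        self.available = True
        self._sessions = {"token-123": "alice"}  # made-up session data

    def verify(self, token):
        if not self.available:
            raise AuthServiceDown("authentication backend unavailable")
        return self._sessions.get(token)


def watch_video(auth, token=None):
    # Anonymous request: the authentication service is never contacted.
    if token is None:
        return "playing video (anonymous)"
    # Signed-in request: it can only succeed if authentication answers.
    try:
        user = auth.verify(token)
    except AuthServiceDown:
        return "error: cannot sign in"
    if user is None:
        return "error: invalid token"
    return f"playing video for {user}"


auth = AuthService()
auth.available = False                 # simulate the outage
print(watch_video(auth))               # anonymous path does not touch auth
print(watch_video(auth, "token-123"))  # signed-in path depends on auth
```

The video itself is never the problem in this sketch. The only thing that fails is the step that checks who you are, which is why everything behind a login stopped at once.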
What is clear is that until Google publishes a detailed account of those 45 minutes, when a huge part of the world just stopped along with our Gmail, Google Drive, Maps, and calendars, we won’t know with total certainty what went wrong.
However, Google was quick and clear in pointing out one thing: this was not a cyberattack, but simply an internal problem. This is an extremely important point from Google’s standpoint, and a huge relief for all of us who pump data into Google’s services every day.
Let’s face it, massive hacks seem to be a normal thing these days, so an attack was the first thing many people thought of. Luckily, this was not the case here. And we can be fairly sure about that: given the nature of the failure, it is unlikely to have been an intentional attack, because Google’s services are highly fragmented.
So, in the case of an actual attack, only part of the services would be affected, not all of them at the same time.
Is Google still secure?
When it comes to cybersecurity, the outage does not pose any problem for the safety of the data of those who use Google’s cloud services. The only real risk was the loss of data that a user was working on in one of Google’s platforms at the moment the failure occurred.
It is great news that Google confirmed all its servers are still secure, especially since cyberattacks have increased as a result of the pandemic.
A breach would have caused giant losses for companies that have switched to teleworking, often without being well prepared for it. Many left default settings in place for programs and services and exposed themselves publicly on the Internet, which has opened many doors to cybercriminals who take advantage of any opportunity to carry out their attacks.
Due to the lack of awareness on the part of users, social engineering attacks continue to work today. The user is the weakest link in the chain, and these attacks are becoming increasingly sophisticated.
Was this a costly failure?
The answer is no, not really. While some users might be skeptical or try to find a replacement for Google, the company, like the other branches of its parent company, relies heavily on the advertising business.
This means that the more plays and views users give to the advertisements displayed on sites, the faster the technology money-making machine runs. In the event of a service outage, even one concentrated in specific areas of the globe, as in this case, the machine slows down.
In this situation, the failure was in the user authentication service, meaning that Google’s advertising and navigation services remained active. So there was an economic impact on the advertising side, but it was not as high as it could have been.
Was this the cause and effect of a bad system infrastructure?
When it comes to storing and processing data, in today’s world we rely on three types of systems – centralized, decentralized, and distributed. However, while systems like blockchain use a decentralized network, this kind of setup is not yet affordable, scalable, or practical enough for giant companies like Google.
Simply put, it’s just too expensive to build, grow, and maintain a blockchain network for the number of users and the amount of data Google processes every second. So, what is a company like Google to do?
One way to increase performance and security is to use a distributed system, in which a set of independent computers works as one (at least in the eyes of the user), significantly increasing the capacity and speed of processing and storage. It also reduces the risk of hackers breaking into the system and illegally acquiring data, as the data itself is distributed across the entire system.
In other words, distributed systems are better than traditional centralized systems because they avoid a single central point of failure. But although Google’s infrastructure is distributed, with multiple servers around the world, these servers are controlled and upgraded centrally with shared software.
Or, put simply, Google uses a semi-distributed network that still has a single point of failure for some of its services, as we saw during those longest 45 minutes of our recent lives.
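A quick back-of-the-envelope calculation shows why one central component matters so much. The Python sketch below uses the standard availability formulas for independent components: components that must all be up multiply their availabilities, while replicas where one survivor is enough fail only if all of them fail. The 99% figure per node and the three replicas are assumptions chosen for illustration, not Google’s real numbers.

```python
def serial(*availabilities):
    """Availability when every component must be up (dependencies in a chain)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel(a, n):
    """Availability when at least one of n independent replicas must be up."""
    return 1 - (1 - a) ** n


node = 0.99  # assumed availability of one node (illustrative only)

replicated_service = parallel(node, 3)   # a service spread over three replicas
central_auth = node                      # a single, non-replicated login component

# A signed-in request needs BOTH the login component and the service.
signed_in = serial(central_auth, replicated_service)

print(f"replicated service alone:  {replicated_service:.6f}")
print(f"behind one central login:  {signed_in:.6f}")
```

Under these assumptions, the replicated service on its own is almost always available, but as soon as every request has to pass through one central component, the whole chain can never be more available than that component.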
Pros and Cons of a distributed system
The main difference between a fully centralized system and a distributed one is the way the architecture and workload are dispersed. Basically, a centralized system is like a group of people carrying one big rock. Remove one of them and the whole balance is ruined: the others are overloaded and drop the rock.
In contrast, a distributed system is like a group of people each carrying a small rock of their own. They’re working together, each carrying their fair share of the total weight. But if you remove one of the people in the group, the balance is not affected. They might be one rock short of completing their project, but they will still reach their destination just fine.
This is because a distributed architecture has a greater tolerance to failures than a centralized one: it gives us greater control, and when one node fails, its information can still be found in other nodes. In our metaphor, each person with their rock knows what their rock is meant for, and will be able to add their contribution to the system even when one of them is removed.
Distributed systems are much more robust than centralized ones thanks to this fault tolerance, which lets a failure pass without affecting processes or data. Moreover, when it comes to security, distributed systems are more secure than centralized ones because they distribute the data. So, if one node is breached, the information stored in the other nodes is still safely tucked away.
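The fault tolerance described above can be sketched in a few lines of Python. This is a deliberately simplified toy, not how any production system is built: the `Node` and `ReplicatedStore` classes, the placement rule, and the choice of two copies per key are all assumptions made for the example.

```python
class Node:
    """One machine in the toy cluster."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}


class ReplicatedStore:
    """Keeps each key on several nodes so a single failure loses nothing."""

    def __init__(self, nodes, replicas=2):
        self.nodes = nodes
        self.replicas = replicas

    def _owners(self, key):
        # Deterministic placement: pick a starting node from the key's bytes,
        # then take the next `replicas` nodes around the ring.
        start = sum(key.encode()) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def put(self, key, value):
        for node in self._owners(key):
            node.data[key] = value

    def get(self, key):
        # Try each copy in turn and skip nodes that are down.
        for node in self._owners(key):
            if node.alive:
                return node.data[key]
        raise RuntimeError(f"all replicas for {key!r} are down")


cluster = [Node("a"), Node("b"), Node("c")]
store = ReplicatedStore(cluster, replicas=2)
store.put("report", "quarterly numbers")

store._owners("report")[0].alive = False   # one node fails
print(store.get("report"))                 # still served from the second copy
```

Each node also holds only part of the data, so in this sketch a node that is compromised exposes the keys placed on it rather than the whole store.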
This is not to say that there aren’t security risks. On the contrary, it is necessary to adopt specific security measures for this type of architecture so that the risk of potential attacks and their effects can be mitigated.
However, going back to Google and other companies like it, the centralization and distribution of servers are not mutually exclusive strategies. In simple terms, Google and others like it use a semi-distributed system, meaning that some services are distributed for security and performance reasons.
Others, like user authentication, are centralized. This is why the outage we saw last Monday affected users around the entire world, while platforms that did not rely on the authentication service were not affected.
The impact Google’s outage had on businesses
Another major issue, aside from the 45-minute inconvenience for users relying on Google’s services to do their work, is the effect the outage had on businesses. As users of third-party services, we don’t really think about the connection between our favorite services and Google. But let’s take a step back and do just that.
Many third-party services, Spotify among them, let users sign in with their Google accounts. This means that part of their user authentication relies on Google’s. No Google authentication service, no Spotify. It’s as simple as that. But while some of these services remain available to us as anonymous (unauthenticated) users, companies like Uber, for example, might rely on Google Maps to operate their app.
No Google Maps, no Uber.
And the list goes on and on. We rely on third-party APIs or microservices because they are safer, more stable, more affordable, and faster to implement. But when they fail, we all fail. So, while Google will come out and say only a percentage of its users were affected, that is only one perspective.
On top of that, hundreds of thousands of users of third-party services were affected by Google’s outage without even knowing they were relying on the giant’s services.
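One common way for an app to soften this kind of dependency is to fall back to something else when the external provider does not answer. The Python sketch below illustrates the pattern only. The provider functions, the cached address, and its coordinates are invented for the example, and it does not describe how Uber, Spotify, or any other company actually handles such failures.

```python
class ProviderUnavailable(Exception):
    """Raised when a (hypothetical) external provider cannot be reached."""


def primary_geocoder(address):
    # Stand-in for a call to an external maps API.
    # It always fails here to simulate the provider being down.
    raise ProviderUnavailable("maps provider is down")


def cached_geocoder(address):
    # Stand-in for a local cache of earlier lookups (made-up data).
    cache = {"1 Main St": (40.0, -75.0)}
    return cache.get(address)


def geocode(address, providers):
    """Try each provider in order and return the first usable answer."""
    for provider in providers:
        try:
            result = provider(address)
        except ProviderUnavailable:
            continue  # this provider is down, try the next one
        if result is not None:
            return result
    return None  # nobody could answer: degrade gracefully instead of crashing


print(geocode("1 Main St", [primary_geocoder, cached_geocoder]))
print(geocode("Unknown Rd", [primary_geocoder, cached_geocoder]))
```

A fallback like this does not remove the dependency. It only decides what the user sees while the dependency is down: a reduced service instead of none at all.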
Google was down for less than an hour and yet we thought the world was ending. The dependency on the company’s servers and services is astonishing. We’ve reached a point where our lives rely on these networks performing correctly, and when they stop working, we do too.
The outage of Google filled us with questions about the stability, security, and functionality of the Internet. Noting all these questions and trying to answer them as accurately as possible, we arrive at one conclusion: the risk of a centralized system.
It’s like Murphy’s law: if something can go wrong, it will. In Google’s case, its semi-distributed network still relied, at least in this situation, on a centralized system to power the user authentication service.
So when Google failed, the world stopped. Not just Google users, but companies big and small who relied on Google microservices for their daily activities. This is probably the best example of a centralized system failure and how it affects our lives. Yes, Google might use a semi-distributed system, and we might use our own.
But at the end of the day, we are all connected in some way or another with Google’s services. We are the centralized system and our point of failure was Google.