Google says Gmail outage was due to a bug in load balancing software, not DDoS attack

Millions of Gmail users around the world likely found themselves unable to log into their accounts on Monday. Amid speculation and reports that Gmail servers were under attack, Google said the problem was on its own servers and denied that it was suffering a Distributed Denial of Service (DDoS) attack.

The outage was short-lived, about an hour by some accounts (Google put it at only 18 minutes), but many of the company’s services were affected. Google Drive, Google Chat and the Chrome web browser were also hit, prompting the company to issue a somewhat vague statement that did not go into detail about what the problems were.

“We are currently experiencing an issue with some Google services. For everyone who is affected, we apologize for any inconvenience,” a Google spokesperson confirmed during Monday’s outage.

It wasn’t until two hours after the outage that Google engineers were able to provide details of what had happened on its servers and issue a fuller explanation of why the incident occurred.

“On Monday, 10 December 2012, we experienced an issue with Gmail and some users experienced slow performance or errors. For everyone who was affected, we apologize – we know you count on Google to work for you, and we worked hard to restore normal operation for you. Although our engineering team is still fully engaged on investigation, we are confident we have established the root cause of the event and corrected it. Our current best estimate is that a significant subset of users’ Gmail web queries were affected for an aggregate of 18 minutes, from ~08:54 – ~09:00 and then from ~09:04 – ~09:16 Pacific Time.”
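The two windows quoted in the statement can be checked against the 18-minute figure. The short Python sketch below does only that arithmetic; the start and end times are the approximate ones from the statement, and the helper name window_minutes is made up for this illustration.

```python
from datetime import datetime

# Approximate outage windows from Google's statement (Pacific Time).
WINDOWS = [("08:54", "09:00"), ("09:04", "09:16")]


def window_minutes(start, end):
    """Length of one window in whole minutes (same-day HH:MM times)."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Add up the two windows to get the aggregate outage time.
total_minutes = sum(window_minutes(start, end) for start, end in WINDOWS)
print(total_minutes)
```

The first window is 6 minutes and the second is 12, which matches the 18-minute aggregate Google reported.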

In short, a bug in Google’s load balancing configuration software caused the outage, which affected many of the company’s services. A small configuration change started to “throttle” traffic when it wasn’t supposed to. Because the change was made at the core of Google’s infrastructure, many services were affected, and Tim Steele, a Google engineer, said the problem was not specific to Gmail. In fact, Chrome Sync users were the first to experience the problem.
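As a rough illustration of how a single throttle setting can turn healthy traffic into errors, here is a minimal Python sketch. It is not Google’s load balancing software: the admitted function, both limits and the traffic figure are hypothetical values chosen only to show the effect.

```python
def admitted(requests, limit_per_tick):
    """Return how many requests a simple throttle lets through in one tick."""
    return min(requests, limit_per_tick)


# Hypothetical settings, not taken from Google's report.
INTENDED_LIMIT = 1000       # limit the operators meant to apply
MISCONFIGURED_LIMIT = 10    # limit applied after a bad configuration change

incoming = 800  # hypothetical requests arriving in one tick

served_before = admitted(incoming, INTENDED_LIMIT)
served_after = admitted(incoming, MISCONFIGURED_LIMIT)
rejected_after = incoming - served_after

print(served_before, served_after, rejected_after)
```

With the intended limit, every request in this toy example is served. With the mistaken limit, almost all of them are turned away even though nothing is wrong with the service behind the load balancer, which is why a change at this layer can affect several products at once.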

To learn more about the issue and what the engineers did to address it, visit Google’s dev forums here. A detailed report can also be found here.