Saturday August 04, 2012

Microsoft Offers Explanation for Azure’s Downtime

It took Microsoft about a week to fully analyze the problem and formulate a response to last week’s Azure cloud service outage, which lasted more than two hours in Western Europe. Simply put, Microsoft said a safety valve shut down the cloud service to prevent a further, catastrophic cascading outage. Microsoft reports that the initial problem and the resulting glitches have now been corrected.

The resulting surge in traffic brought on by those messages triggered other bugs and pushed the CPU usage of some machines in the cluster to 100 percent.