As most of you are probably aware, Windows Azure suffered a worldwide Azure Storage outage on Friday, February 22, 2013. The outage was caused by an expired Microsoft SSL certificate. It impacted Azure Storage, Azure Websites, Service Bus, Media Services, ACS, and the Azure Management Portal, and lasted approximately ten hours.
AzureWatch monitors remained active during the outage. Customers who monitored their storage accounts with the "Alert on Failure" option turned on began receiving alerts at approximately 8:32 PM UTC, well in advance of any outage notice on Microsoft's Azure Dashboard.
It is important to point out that AzureWatch's monitoring was itself impacted, because key metrics located within our customers' Azure deployments were inaccessible. Furthermore, the AzureWatch Management Portal was unavailable because it currently relies on Azure Storage. To help mitigate our own outage, our engineers were available via our recently implemented online chat interface to provide extra support to customers trying to log in to our portal.
Overall, this outage once again underscores the importance of monitoring. No cloud provider will have a perfect uptime record, so the faster you know that something is wrong, the faster you can react with a contingency plan. Even if you do not require all the sophisticated features of AzureWatch, we suggest that you at a minimum use AzurePing, our simple but effective free monitoring utility, which can send you alerts when it is unable to access your Azure resources.
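To illustrate the general idea of "alert when a resource cannot be reached", here is a minimal Python sketch. It is not how AzurePing or AzureWatch is implemented; the function names (`probe_url`, `check_all`) and the placeholder URL are hypothetical, and for simplicity the sketch assumes that any HTTP error status counts as a failure. A TLS problem such as an expired certificate would typically surface as an "unreachable" result.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Optional


def probe_url(url: str, timeout: float = 10.0) -> Optional[str]:
    """Return None if the URL responds successfully, else a short failure description."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return None
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        # Assumption: this sketch treats every such status as a failure.
        return f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, or TLS/certificate error.
        return f"unreachable: {exc}"


def check_all(
    targets: Dict[str, str],
    probe: Callable[[str], Optional[str]] = probe_url,
    alert: Callable[[str], None] = print,
) -> Dict[str, str]:
    """Probe every named target and send one alert per failure.

    Returns a mapping of target name to failure description.
    """
    failures: Dict[str, str] = {}
    for name, url in targets.items():
        error = probe(url)
        if error is not None:
            failures[name] = error
            alert(f"ALERT: {name} ({url}) failed: {error}")
    return failures


# Example call (placeholder URL, replace with a resource you own):
# check_all({"storage": "https://myaccount.example/health"})
```

Run on a schedule from a machine outside the cloud being monitored, a check like this keeps working even when the provider's own dashboard is slow to report a problem. The `alert` parameter is a plain callable, so it can be swapped for an email or SMS sender.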