AzureWatch alerts and other email communication are down (RESOLVED)

20. September 2015 00:04 by author Igor Papirov

AzureWatch's email provider (Amazon SES) has suspended our account without warning. At this time, AzureWatch is not sending out any emails. The Portal, monitoring, auto-scaling, and other automation services continue to run.

CloudMonix email services were also impacted but have already been restored, because less complexity was involved.

 

UPDATE: On 2015/09/20, email services were restored approximately 24 hours after the initial outage.



AzureWatch Portal - Outage

8. June 2015 02:24 by author Igor Papirov

AzureWatch experienced an outage of its Portal/Website. Monitoring and auto-scaling services were not impacted. The outage lasted approximately 6 hours.



Azure and AzureWatch Outage

19. November 2014 15:18 by author Igor Papirov

Outages:

  • Major: AzureWatch, alongside Azure, suffered a major outage that lasted approximately 1.5 hours (from 6:48pm to 8:23pm Central time, 11/18/2014). The outage was related to general network timeouts within Azure data centers and prevented AzureWatch from performing monitoring properly.


AzureWatch Outage: SQL Azure issues

19. September 2013 04:53 by author Igor Papirov

Production Issue:

  • Severe: We're experiencing connectivity issues with our back-end databases hosted in SQL Azure. At this time, monitoring and access to the AzureWatch Management Portal are impacted. We apologize for this issue and have contacted Microsoft regarding a resolution.
  • Outage duration: ~1hr


Service Degradation: Auto-scaling

27. August 2013 00:37 by author Igor Papirov

Production Issue:

  • Severe: A number of customers are reporting a lack of expected auto-scaling actions. We've tracked the problem down to a misconfigured deployment of AzureWatch monitors. Total time of the service degradation: approximately 10 hours.


AzureWatch: Production issue, intermittent outages resolved

15. August 2013 00:06 by author Igor Papirov

Production Issue:

  • Moderate: We have put a temporary fix in place that limits the number of connections to our SQL Azure databases in order to prevent the throttling that was occurring. Over the next days and weeks we will be working on a more permanent fix.


AzureWatch: Production issue, Intermittent timeouts & outages

14. August 2013 21:13 by author Igor Papirov

Production Issue:

  • Moderate: We are currently experiencing intermittent outages with our SQL Azure environment. As a result, we're seeing occasional timeouts or slowdowns on our AzureWatch Management Portal, as well as occasional monitoring cycle errors. We apologize for this issue and are working with Microsoft to mitigate the problem.


AzureWatch Monitoring: Production issue, False Azure Storage Outage alerts

11. May 2013 11:18 by author Igor Papirov

Production Issue:

  • Moderate: A production-level issue has caused one of our servers to begin sending false alerts regarding outages in Azure Storage.  The error has been corrected.  Duration of the issue: 33 minutes.  Start: 05/10/2013 8:39pm CST, End: 05/10/2013 9:11pm CST.  We apologize for the inconvenience.


AzureWatch Monitoring - Planned upgrade rolled back

1. April 2013 18:23 by author Igor Papirov

A planned upgrade to the AzureWatch Main website, Management Portal and Monitoring Service has been rolled back after 10 minutes of running in production due to a discovered issue. After the upgrade, a number of customers began receiving false alerts as AzureWatch attempted to scale deployments that no longer existed but were still configured for scaling. Prior to the upgrade, AzureWatch simply ignored deployments that were no longer present, even if it was configured to monitor and scale them.

While the discovered issue had no bearing on production use and monitoring, the number of false alerts sent to customers' mailboxes every minute was deemed inappropriate, and the upgrade has been rolled back until a fix can be applied.

We apologize for the inconvenience.



AzureWatch Monitoring - Azure-related Outage

23. February 2013 06:09 by author Igor Papirov

We are seeing worldwide errors connecting to Azure Storage. The problems appear to be caused by an expired Azure SSL certificate.

This Azure-related outage is impacting AzureWatch's ability to monitor customers' subscriptions.

 



AzureWatch Monitoring - Services restored

22. February 2013 23:58 by author Igor Papirov

All Azure-related services have been restored approximately 10 hours after the Azure Storage-related outage started. AzureWatch was sending alerts to customers who monitor their Storage Accounts and who have enabled the "Alert on Failure" option. During the outage, the AzureWatch Management Portal was unavailable.

All AzureWatch-related services are now running normally.



AzureWatch Daily Reports - Re-enabled

21. December 2012 00:01 by author Igor Papirov

Outage Resolution:

  • Moderate: The data generation issue with Daily Performance Reports has been resolved. Report delivery will resume on 12/21/2012 at 1am Central.


AzureWatch Daily Reports - Disabled until further notice

20. December 2012 22:53 by author Igor Papirov

Outage:

  • Moderate: We've discovered an issue with data generation for Daily Performance Reports. Report generation is disabled for all customers until the issue is resolved. We apologize for the inconvenience.