We are pleased to introduce a number of exciting new features to AzureWatch!

Support for multiple Azure subscriptions

Users can now instrument a single AzureWatch account to monitor and auto-scale any number of Azure Subscriptions (previously the limit was one).  Furthermore, users no longer need to transfer all of their diagnostic data into a single storage account.  AzureWatch will automatically detect which storage account a particular Role is transferring its diagnostic data to.  This should simplify account management, billing and configuration steps for AzureWatch users.

 

Support for alerts based on conditions of individual servers

In addition to receiving alerts based on metrics aggregated across all servers within a compute Role, AzureWatch users can now receive alerts based on conditions of individual servers.  This helps support personnel quickly pinpoint problematic servers and take appropriate action.

 

Support for self-healing!

In addition to generating alerts for individual servers, AzureWatch can now automatically reboot or re-image servers!  This is particularly useful for servers that are leaking memory, disk space, or other resources.

 

Enhanced support for multiple mailboxes

AzureWatch can now send alerts to multiple mailboxes at an account level.  In addition, each individual alert can be configured to go to additional mailboxes if needed.  This lets AzureWatch users send alerts of different priority levels or areas of responsibility to different contacts.

 

In Detail

Each Rule now has a flag (1) that allows it to inspect metric data either server-by-server or aggregated across all servers within a Role.  If a Rule inspects metric data across all servers within a Role (the default scenario), it can auto-scale servers within that Role and send out alerts based on the general condition of all servers within the Role.  However, if users choose to evaluate a particular Rule on an individual-server basis, it can reboot or re-image servers instead (2).  This is useful for conditions such as memory leaks, fragmented or leaking disk space, or accumulating "stuck" IIS requests.  AzureWatch uses the Azure Service Management API to request reboots and re-images of servers.  Users should exercise caution and consider putting appropriate thresholds or throttling in place to avoid inappropriate reboots.
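To make the difference between the two evaluation modes concrete, here is a minimal Python sketch. It is not AzureWatch's implementation: the Rule class, the evaluate and throttle functions, the use of a simple average for aggregation, and the 25% cap are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Rule:
    # Hypothetical rule model; field names are illustrative, not AzureWatch's.
    metric: str
    threshold: float
    per_server: bool = False  # False = aggregate across the Role (the default)


def evaluate(rule: Rule, samples: Dict[str, Dict[str, float]]) -> List[str]:
    """Return what the rule matched.

    samples maps instance name -> {metric name: latest value}.
    Aggregated mode returns ["<role>"] when the Role-wide average (an assumed
    aggregation) exceeds the threshold; per-server mode returns the names of
    the instances that exceed it.
    """
    values = {name: m[rule.metric] for name, m in samples.items() if rule.metric in m}
    if not values:
        return []
    if rule.per_server:
        return sorted(name for name, v in values.items() if v > rule.threshold)
    return ["<role>"] if mean(values.values()) > rule.threshold else []


def throttle(candidates: List[str], total_instances: int,
             max_fraction: float = 0.25) -> List[str]:
    """Cap how many instances may be rebooted in one pass (at least one)."""
    limit = max(1, int(total_instances * max_fraction))
    return candidates[:limit]


samples = {
    "web_0": {"mem": 95.0},
    "web_1": {"mem": 40.0},
    "web_2": {"mem": 92.0},
    "web_3": {"mem": 30.0},
}
per_server_hits = evaluate(Rule("mem", 90.0, per_server=True), samples)
aggregate_hits = evaluate(Rule("mem", 90.0), samples)
to_reboot = throttle(per_server_hits, total_instances=len(samples))
```

In this example the Role-wide average stays below the threshold, so the aggregated Rule matches nothing, while the per-server Rule singles out the two instances that are running high. The throttle then limits how many of them would be rebooted at once, which is one way to avoid taking too much of a Role offline in a single pass.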

When performing any action based on a Rule (scaling, rebooting, or simply alerting), AzureWatch will send a notification to every mailbox specified at the account level.  Additional mailboxes can be configured at the individual Rule level (3).
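The way account-level and Rule-level mailboxes combine can be sketched in a few lines of Python. This is only an illustration: the recipients function and the example addresses are hypothetical, and de-duplicating addresses is an assumption rather than documented AzureWatch behavior.

```python
from typing import Iterable, List


def recipients(account_mailboxes: Iterable[str],
               rule_mailboxes: Iterable[str] = ()) -> List[str]:
    """Merge account-level and Rule-level mailboxes, keeping first-seen order.

    Addresses are compared case-insensitively so the same mailbox listed at
    both levels receives only one notification (an assumed behavior).
    """
    seen = set()
    merged = []
    for address in list(account_mailboxes) + list(rule_mailboxes):
        key = address.strip().lower()
        if key and key not in seen:
            seen.add(key)
            merged.append(address.strip())
    return merged


everyone = recipients(
    ["ops@example.com"],
    ["dba@example.com", "OPS@example.com"],
)
```

Account-level mailboxes always come first, and a Rule can only add to that list, never remove from it, which matches the description above.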

 

Remarks

When Azure reboots a server (role instance), it will take it offline, restart the underlying operating system for that server, and bring the role instance back online. Any data that is written to the local disk is persisted across reboots. Any data that is in-memory is lost.

When Azure reimages a server (role instance), it will take it offline and write a fresh installation of the Windows Azure guest operating system to the virtual machine. The instance is then brought back online. Windows Azure attempts to maintain data in any local storage resources when the server is reimaged; however, in case of a transient hardware failure, the local storage resource may be lost.  Any data that is written to a local directory other than that defined by the local storage resource will be lost when the instance is reimaged.