Outage Management Command Center
Dealing with an outage, especially of a high-value system, can be similar to fighting a raging wildfire. Unfortunately, many operations teams try to fight the fire with only rudimentary equipment and tools. Our goal is to fully equip your team to jump into action at the first sign of smoke and to operate as effectively as possible until the fire is out.
During an Outage
The first thing you need when faced with an unexpected outage is as much information as possible. The Outage Management Command Center (OMCC) provides a consolidated outage interface in the control panel that gives an overview of the outage and any related outages. It provides crucial information to assess the situation: the pre-outage server state, detailed check results, and a clear picture of your resources.
The second thing you need is tools to help your team communicate. The OMCC is a centralized place for your team to meet during an outage and coordinate a response. The OMCC brings together several forms of communication, including:
- An outage-specific email list, where all messages are automatically broadcast to everyone involved in the outage. Outage notices are automatically tied into the mailing list: simply reply to the outage notice and your response is routed to everyone involved in the outage.
- A web-based chat room where team members can discuss the situation. This is a pure web tool, with no software to install and configure.
All email and chat messages are added to the Outage Log so that people joining in can quickly get up to speed on what's been done.
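To make the idea concrete, here is a minimal Python sketch of how messages from both channels might feed a single outage log. It is an illustration only, not the OMCC's actual implementation or API: the names OutageLog, LogEntry, post, and catch_up are hypothetical, and real delivery of email and chat messages is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LogEntry:
    timestamp: datetime
    channel: str  # "email" or "chat"
    author: str
    text: str


@dataclass
class OutageLog:
    """Hypothetical model of a shared outage log (an assumption, not the OMCC API)."""
    participants: set = field(default_factory=set)
    entries: list = field(default_factory=list)

    def post(self, channel, author, text, timestamp=None):
        """Record a message and return the participants it would be broadcast to."""
        timestamp = timestamp or datetime.now(timezone.utc)
        # Anyone who posts becomes part of the outage's audience.
        self.participants.add(author)
        self.entries.append(LogEntry(timestamp, channel, author, text))
        # Everyone involved except the sender would receive the message.
        return sorted(self.participants - {author})

    def catch_up(self):
        """Return every entry in chronological order for someone joining late."""
        return sorted(self.entries, key=lambda entry: entry.timestamp)
```

In this sketch, a reply to an outage notice would be recorded with post("email", ...) and a chat message with post("chat", ...). Someone joining partway through would call catch_up() to read both channels as one timeline.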
These tools make it easy for your operations team to respond to outages in a timely, efficient manner. And, because this infrastructure is hosted on our redundant servers and not on your local network, it is not affected by the very outages you are attempting to manage.
After an Outage
Once you've resolved the source of the outage, it's important to make sure that the same problem does not occur again. Often, many details of what was done in the heat of the moment are lost, and the same type of outage repeats again and again. Ending this cycle is a crucial step toward improving the overall availability of your online presence.
Fortunately, with your team working in the OMCC throughout an outage, all the details of what they have done are captured, and are available to review after things have calmed down, rather than being lost to groggy memories or the next problem that comes up. Outage logs are archived permanently for examination, whether it's the next day or six months down the road when a similar problem arises.
