CounterMeasures

Reduce Mean Time to Resolution (MTTR) and Mean Time to Detect (MTTD) using Panopta’s CounterMeasures:

An exclusive automated diagnostics & remediation service.

Reduce alert fatigue

Bolster system stability

Countless alerting integrations

Reclaim your Time

CounterMeasures automates the basic diagnostics and remediation tasks engineers spend their time on, freeing your team up to achieve more impactful goals and manage incidents proactively.

How Does it Work?

By enabling CounterMeasures on your infrastructure, you have access to an extensible platform that provides an extra layer of automation in your operations workflow. At its core, CounterMeasures allows you to create automated responses to incidents.

Admins spend a lot of valuable time resolving incidents that require logging into infrastructure just to do a restart or run basic commands. Add the time needed to diagnose the problem, and even simple incidents can interfere with goals and workflow. CounterMeasures can remove some of the time-consuming steps, leading to a lower MTTR and more time for your team to innovate.

Fully Extensible

Use either the out-of-the-box diagnostic actions and remediation scripts, or create your own custom CounterMeasures to fit the exact needs of your infrastructure and team.

To create a custom CounterMeasure, all you need to do is create a script, add the CounterMeasure to a threshold, and Panopta will run it.

Here’s a Plugin Example

Adding CounterMeasures to a threshold
  • Enable CounterMeasures in your agent config or manifest file
  • Add a CounterMeasure to a threshold, such as Disk % Used
  • The CounterMeasure will run when the threshold is crossed

Additional Features

Optional Approval Workflow

For more intrusive actions, the optional CounterMeasures approval workflow lets you require admin approval before a CounterMeasure runs, so you can confirm it takes the correct action to resolve the issue.
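
The approval gate can be pictured with a small sketch. This is an illustration only, not Panopta's implementation: the `Action` type, its `intrusive` flag, and the `execute` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    """A hypothetical automated response; not part of Panopta's API."""
    name: str
    intrusive: bool          # assumed flag: does this action need sign-off?
    run: Callable[[], str]   # performs the action and returns a short result


def execute(action: Action, approved: bool) -> str:
    """Run the action unless it is intrusive and has not been approved."""
    if action.intrusive and not approved:
        return f"{action.name}: waiting for admin approval"
    return f"{action.name}: {action.run()}"


# Two illustrative actions: one runs freely, one is gated behind approval.
restart = Action("restart-service", intrusive=False, run=lambda: "done")
reboot = Action("reboot-server", intrusive=True, run=lambda: "done")
```

In this sketch a service restart runs immediately, while a reboot is held until an admin approves it.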

Flexible Controls

Built-in support for limiting the number of times a CounterMeasure can run, as well as how long it can run before it times out.
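
As a rough sketch of what a run limit and a timeout mean in practice, the class below wraps a command with both controls. `LimitedRunner` and its parameters are hypothetical names for this example; the product's own settings are configured in Panopta, not in code like this.

```python
import subprocess


class LimitedRunner:
    """Runs a command at most max_runs times, each bounded by timeout_s seconds."""

    def __init__(self, command, max_runs, timeout_s):
        self.command = command      # argument list, e.g. ["systemctl", "restart", "nginx"]
        self.max_runs = max_runs
        self.timeout_s = timeout_s
        self.runs = 0

    def run(self):
        # Refuse to run again once the limit has been reached.
        if self.runs >= self.max_runs:
            return "skipped: run limit reached"
        self.runs += 1
        try:
            result = subprocess.run(
                self.command,
                capture_output=True,
                text=True,
                timeout=self.timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child process before raising.
            return "timed out"
        return f"exit code {result.returncode}"
```

Capping the run count keeps a failing remediation from looping forever, and the timeout keeps a hung command from blocking later responses.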

Agile Alerting

For busy teams that might not get to incidents as soon as they come in, system admins receive an alert about an incident, and the CounterMeasure runs only if no team member has resolved the issue within a set timeframe.
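
The delayed-run rule comes down to a simple check, sketched here under assumed names (`should_run`, `grace`); it illustrates the logic and is not Panopta's code.

```python
from datetime import datetime, timedelta


def should_run(opened_at: datetime, now: datetime,
               grace: timedelta, resolved: bool) -> bool:
    """Return True only if the incident is still open after the grace period."""
    if resolved:
        return False
    return now - opened_at >= grace
```

With a 15-minute grace period, for example, an incident opened at 02:00 would trigger the automated response at 02:15 only if nobody had resolved it by then.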

Diagnostics & Remediation

Let CounterMeasures gather error logs, virtual statistics, or service statuses, giving your admins the information they need to resolve incidents as soon as they’re detected.

Small Teams

Small IT teams often have a regular on-call schedule for nights and weekends, so when they experience issues, small teams tend to feel more stress. For example, if a database server is consistently overloading every night for a week straight, the team might struggle to find a permanent solution because they are too busy putting out fires. In addition, a lot of the tasks they’re doing to resolve the incidents temporarily are repetitive and distract from more important work.

When CounterMeasures is implemented on the affected instance, it can automate the diagnostic commands an admin would run when alerted to an incident. The reports from those diagnostics are attached to the incident report, giving the team members everything they need to resolve incidents before logging into the system. Using the reports, an admin can know exactly what they need to do and implement the solution as soon as they’ve logged in.
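
A diagnostic script of this kind can be quite small. The sketch below is a standalone illustration, not Panopta's plugin API: `collect_diagnostics` and `EXAMPLE_COMMANDS` are hypothetical names, and the example commands assume a Linux host.

```python
import subprocess


def collect_diagnostics(commands, timeout_s=10):
    """Run each named command and return {name: captured output or failure note}."""
    report = {}
    for name, argv in commands.items():
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout_s
            )
            # Prefer stdout; fall back to stderr when the command printed nothing.
            report[name] = result.stdout or result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing binary or a hung command should not abort the whole report.
            report[name] = f"failed: {exc}"
    return report


# Commands an admin might run by hand on a Linux host (an assumption for this example).
EXAMPLE_COMMANDS = {
    "disk_usage": ["df", "-h"],
    "uptime": ["uptime"],
}
```

Calling `collect_diagnostics(EXAMPLE_COMMANDS)` returns a dictionary of outputs that could be attached to an incident report, so the admin sees the results before logging in.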

Large Teams

In large IT teams, it’s not uncommon for varying levels of access to affect incident management. For example, members of the NOC team often don’t have visibility into the production servers. When a new incident comes in, the NOC has to assess, with minimal context, which team to engage to resolve the issue. Sometimes this results in an incident being sent to the wrong team and then transferred by the NOC to the correct one, which makes the time to resolution much longer.

CounterMeasures can run diagnostics that are included in the incident report and accessible to the NOC, even if they don’t have access to the relevant production servers. This gives the NOC the context they need to engage the correct team for each incident, which shortens the time to resolution and keeps teams from being distracted by incidents unrelated to their work.

Sometimes the same problems keep coming up. CounterMeasures can take on the simple restart actions so your admins don’t have to drop everything to resolve repetitive incidents.

Small Teams

When a server is nearly out of disk space, it’s not uncommon that an engineer on a small team will need to drop everything to handle it. Deleting or archiving old log files is a quick fix that buys time until a more permanent solution can be put in place. This interruption disturbs the workflow and often slows down the entire team.

CounterMeasures can step in to do the work for an engineer. Set up a CounterMeasure to respond when disk usage reaches a particular percentage, and it will automatically archive or delete some log files and alert the team that the incident occurred. The team can keep working, and once they’ve reached a good stopping point, they can go and put a more permanent solution in place.
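
A remediation script for this scenario might look like the following sketch. It is a standalone illustration rather than a Panopta-supplied script: the function names, the `*.log` file pattern, and the seven-day default are assumptions made for the example.

```python
import gzip
import shutil
import time
from pathlib import Path


def disk_percent_used(path):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def archive_old_logs(log_dir, max_age_days, now=None):
    """Gzip *.log files older than max_age_days and delete the originals."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    archived = []
    for log in sorted(Path(log_dir).glob("*.log")):
        if log.stat().st_mtime < cutoff:
            target = log.with_name(log.name + ".gz")
            with open(log, "rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            log.unlink()  # remove the original only after the archive is written
            archived.append(target.name)
    return archived


def remediate(log_dir, threshold_percent, max_age_days=7):
    """Archive old logs only when disk usage has reached the threshold."""
    if disk_percent_used(log_dir) < threshold_percent:
        return []
    return archive_old_logs(log_dir, max_age_days)
```

Compressing instead of deleting keeps the old logs available for the later, more permanent fix while still freeing space.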

Large Teams

While large IT teams usually have a lot of bandwidth on their own, stopping to resolve incidents can disrupt workflow and distract teams. Having a NOC in place can help triage incidents, but a team member will still need to pause other initiatives to fix the problem when the NOC notifies the team.

CounterMeasures can automate simple resolutions like restarting or killing a service, or bigger actions like rebooting a server. Using CounterMeasures to resolve simple issues reduces alert fatigue and disrupts workflow less often. In addition, CounterMeasures can alert a member of the NOC team before taking larger actions like rebooting. Even if the NOC still needs to confirm with the team in question that it’s okay for CounterMeasures to reboot the server, that is less disruptive than needing to perform the action themselves.