4 Ways to Improve SaaS Operations with Monitoring

According to Gartner research, worldwide SaaS revenue grew 17.9% in 2012 to $14.5 billion and will top $22.1 billion in 2015. As budgets for big IT projects shrink and mobile devices provide ubiquitous connectivity, many businesses are getting more comfortable having their business-critical applications delivered as a service rather than on-site installed software. As a result, more and more new tools are being provided as SaaS only.

For entrepreneurs, the model is attractive as well – it provides an opportunity to launch a new idea with significantly less startup capital than in the past. However, there are a number of challenges to the SaaS model. The centralized nature makes upgrades and deployment much easier than traditional software, allows you to leverage shared infrastructure across customers and enables rapid iteration of ideas. But it also means that you have all your eggs in one basket, with the risk of a service problem having much wider impact.

The Challenges of SaaS

While attractive, SaaS is definitely not an easy model to execute well. In the 5+ years we’ve been offering monitoring as a service, we’ve encountered a number of challenges related to the fundamental setup of SaaS applications, and have still managed to build a thriving business. Based on our experience, here are the key challenges that you’ll face.

Every Minute Counts

Unlike a content site or e-commerce site, your customers likely use your tool day in and day out, accessing it continually throughout their work day. Disruptions to their regular work cycle can quickly degrade the trust they place in your service. Once they have lost trust, it is very difficult to keep them as customers, because you’re operating in a competitive market. For us, this has meant an ongoing focus on making our service as accurate as possible, as false positives and missed alerts are the quickest way to lose confidence in the monitoring world.

Regardless of the benefits you may provide, without trust your customers will have a hard time placing their critical data and operations in your hands.

Think Global

The global nature of the SaaS marketplace is definitely enticing from a sales perspective, but it also adds to the challenges. A successful service often has customers distributed around the world. This makes any sort of scheduled maintenance or downtime difficult to accommodate, as it is always business hours somewhere. It also means that you have to be ready to react to problems whenever they arise.

Additionally, it means that your application needs to be designed to handle things like time zones and the varying ways of handling daylight saving time around the world (this is a full post of its own, which we’ll publish soon!), along with other localization aspects.

Tread Carefully During the Initial Dance

Finally, with the free trial or freemium model, disruptions in service are particularly dangerous because these are the customers you need to convert or up-sell. During this crucial initial phase of your relationship, potential customers are even more wary of problems. Not that outages are ever a good thing, but service disruptions during a trial period can effectively torpedo your conversion chances.

How can monitoring help?

Fortunately, there are ways to deal with these challenges. One of them is monitoring, which serves as 24×7 eyes and ears for your infrastructure. However, monitoring is only as effective as the thought you put into it. Based on our experience working with thousands of SaaS providers, as well as running our own global operations, we have the following suggestions to improve the impact of your monitoring:

1. Monitor everything

Most SaaS tools rely on a number of third-party services to function, any one of which can break and bring the service down. Just tracking the status of your core servers is not enough to ensure that everything your customers depend on is working correctly.

Send and receive email? Make sure you monitor your inbound and outbound mail servers. Use an external service for authentication, logging or hosting of JavaScript/CSS/font resources? Make sure you know when they have problems, as these can bubble up and cause problems with your service. Use a queuing system like RabbitMQ for asynchronous processing? Make sure it is accepting connections and processing messages.
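As a rough illustration of the simplest form of such a check, here is a minimal Python sketch that tests whether each dependency accepts a TCP connection. The host names and ports are placeholders we made up for the example, and the helper names are ours. A successful connection only shows that the port is open; confirming that mail is actually delivered or that queue messages are actually processed needs a deeper, protocol-level check.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Hypothetical dependencies; replace with your own hosts and ports.
DEPENDENCIES = {
    "outbound mail (SMTP)": ("smtp.example.com", 25),
    "message queue (AMQP)": ("queue.example.com", 5672),
}


def check_dependencies(deps=DEPENDENCIES, probe=port_is_open):
    """Map each dependency name to True (reachable) or False (unreachable)."""
    return {name: probe(host, port) for name, (host, port) in deps.items()}
```

Calling check_dependencies() returns a dictionary such as {"outbound mail (SMTP)": True, ...}, which can be fed into whatever alerting you already use. The probe parameter exists so that a different check can be swapped in per environment.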

Ultimately, all of these pieces are interconnected and combine to produce your application, and any one of them can disrupt your customers’ experience. Make sure that you know as soon as there are problems with any of them so you can minimize the impact.

2. Track system resources

In addition to making sure that your application is responding to network requests, also monitor all of the critical system resources it relies on, such as CPU load, memory capacity and disk usage. Beyond these basics, make sure to track any special-purpose metrics, such as critical database performance indicators. Ideally, you should also instrument your application to track domain-specific metrics that indicate how it is performing.
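To make this concrete, here is a small Python sketch that reads two of these basics using only the standard library and compares them against limits. The metric names and threshold values are assumptions chosen for illustration, and os.getloadavg is only available on Unix-like systems. Memory and database metrics are left out because they need platform-specific or third-party tooling.

```python
import os
import shutil


def disk_percent_used(path="/"):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def load_per_cpu():
    """One-minute load average divided by the CPU count (Unix-only)."""
    one_minute, _, _ = os.getloadavg()
    return one_minute / (os.cpu_count() or 1)


def breached(metrics, limits):
    """Return the sorted names of metrics whose value exceeds its limit."""
    return sorted(
        name for name, value in metrics.items()
        if name in limits and value > limits[name]
    )


# Illustrative limits only; tune them to your own systems.
LIMITS = {"disk_pct": 90.0, "load_per_cpu": 1.5}
```

A periodic job could build a metrics dictionary from disk_percent_used() and load_per_cpu(), pass it to breached(metrics, LIMITS), and alert on any names that come back.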

Having this information lets you avoid catastrophes, ranging from something as simple as running out of disk space on an application server to hard-to-predict problems that arise as corner cases. Beyond the immediate service disruptions these can cause, they often require a painful, time-consuming maintenance session to clean up.

This level of monitoring also gives you trend data on your resource usage for forecasting and budgeting purposes. This is critical as your business expands and you need to grow your infrastructure. Being able to plan growth based on hard resource data lets you predict your needs more accurately, which means you can time your growth properly and ultimately save money on servers and other infrastructure.
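As one simple example of turning trend data into a forecast, the Python sketch below fits a straight line (ordinary least squares) through daily usage samples and estimates how long it will take to reach capacity. The function name and the sample figures are ours, and a straight line is an assumption: real usage is often seasonal or bursty, so treat the result as a rough planning number.

```python
def days_until_full(samples, capacity):
    """Estimate days after the last sample until usage reaches capacity.

    samples  -- list of (day, used) pairs, e.g. (0, 100) for 100 GB on day 0
    capacity -- total capacity in the same unit as `used`
    Returns None when the fitted trend is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")
    slope = sxy / sxx                      # growth per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_full = (capacity - intercept) / slope
    last_day = max(x for x, _ in samples)
    return max(0.0, day_full - last_day)
```

For instance, a disk that grows by 10 GB a day from 100 GB on day 0 to 130 GB on day 3 reaches a 200 GB capacity about 7 days after the last sample.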

3. Get a single-pane view of your infrastructure

When running a quickly growing SaaS service, you’re actively managing all aspects of your business, with technical, sales, marketing and operations issues all jockeying for your attention. For your day-to-day operations, you don’t have time to do mental interpolation between numerous systems in order to get a view of how your infrastructure is doing.

This is even more true when problems arise – you need to be able to quickly figure out what the current status is, how it got there, and decide what needs to be done to fix it. Your monitoring tools should support this and enable your team to respond rather than getting in the way of decisions.

Make sure that whatever tool you use for monitoring provides a single-pane view of all of your infrastructure, regardless of where it sits and who has primary responsibility for it. By bringing the status of all of your components together into one tool, you can quickly detect systemic problems even if they span different hosting companies, datacenters or third-party service providers.
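The idea behind a single-pane view can be sketched in a few lines of Python: collect check results from every source into one list, then roll them up. The Check structure, the three status levels and the source names below are assumptions made for the example, not the data model of any particular monitoring tool.

```python
from dataclasses import dataclass

# Assumed status levels, ordered from least to most severe.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


@dataclass(frozen=True)
class Check:
    source: str   # e.g. a hosting company, datacenter or third-party provider
    name: str     # what is being checked
    status: str   # one of the keys in SEVERITY


def overall_status(checks):
    """Worst status across all checks; "ok" when there are none."""
    return max((c.status for c in checks), key=SEVERITY.__getitem__, default="ok")


def failing_by_source(checks):
    """Group the names of non-ok checks by the source they came from."""
    grouped = {}
    for check in checks:
        if check.status != "ok":
            grouped.setdefault(check.source, []).append(check.name)
    return grouped
```

When failing_by_source shows failures under several sources at once, that is a hint the problem is shared (DNS, a network path, a common provider) and not isolated to one server.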

4. Be prepared for things to break

Despite our best efforts, Murphy’s Law continues to apply in the SaaS age. No matter how well you prepare, things can and will break. Your best approach is to think through the most likely failure scenarios in advance, anticipating the signs of each problem and how you will react to and correct it. Doing this ahead of time helps you and your team build “muscle memory”, which allows you to react faster when real disaster strikes. Doing a test drive of failure responses also verifies that your backups and disaster recovery systems are properly configured and functioning correctly.

As part of this, you should do dry runs of detecting and reacting to alerts from your monitoring system. When things really do break, you don’t want to find out that you have a typo in someone’s phone number, your alert emails are getting flagged as spam, or that you don’t have the critical login credentials for the systems that you need to reach in order to put out the fires.
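Part of such a dry run can be automated. The Python sketch below scans an on-call contact list for entries that are obviously malformed. The contact structure is hypothetical, the phone check assumes numbers are stored in E.164 form (a plus sign followed by up to 15 digits), and the email check is deliberately crude. It catches formatting slips only; a well-formed but wrong number, or an email that lands in a spam folder, will only show up when you send a real test alert.

```python
import re

# E.164: "+", a non-zero leading digit, then up to 14 more digits.
E164 = re.compile(r"\+[1-9]\d{1,14}")


def contact_problems(contacts):
    """Return human-readable problems found in a list of contact dicts.

    Each contact is assumed to look like:
    {"name": "...", "phone": "+14155550123", "email": "a@example.com"}
    """
    problems = []
    for contact in contacts:
        name = contact.get("name", "<unnamed>")
        phone = contact.get("phone", "")
        email = contact.get("email", "")
        if not E164.fullmatch(phone):
            problems.append(f"{name}: phone {phone!r} is not in E.164 format")
        local, at, domain = email.partition("@")
        if not (local and at and "." in domain):
            problems.append(f"{name}: email {email!r} looks malformed")
    return problems
```

Running this against your alert contacts before a drill gives you a short list of entries to fix, leaving the drill itself to exercise delivery and response.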

Be ready to respond

While nothing will keep problems from arising, a complete monitoring setup gives you an advantage in responding to them efficiently, which will help you deliver a more reliable SaaS service and accelerate the growth of your business.

If your current monitoring system is lacking in any of these areas, get in touch and we’d be happy to do a complimentary assessment of your monitoring setup and recommend ways to improve your visibility.