Some tech services – connectivity, communications, security, among others – fall firmly into the ‘high availability’ bracket. They need to be protected, preserved and, ideally, kept running uninterrupted. Yet, as illustrated by the recent Microsoft Outlook service outage, even the biggest global providers experience problems.

So, is high availability a big enough priority? Are organisations taking the right steps to ensure key services remain up and running under extreme circumstances? High availability can mean different things to different organisations. Every business running high availability services will have its own definition of what that means, and of how difficult things become if those services fail.

What’s The Problem? 

Whether the technology in question is in-house or delivered via a service provider, every organisation should calculate how much downtime is too much. The tech industry often relies on a percentage ‘score’ – the proportion of a year for which the technology is planned to be available, with each extra decimal place towards 100% sharply reducing the downtime allowed.
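The arithmetic behind that percentage ‘score’ is straightforward, and worth making concrete. The sketch below shows how much annual downtime each common availability figure actually permits; it is plain arithmetic, not tied to any particular provider’s service-level agreement.

```python
# How much downtime per year does a given availability percentage allow?
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

def allowed_downtime_minutes(availability_pct: float) -> float:
    """Maximum downtime (minutes per year) implied by an availability
    percentage, e.g. 99.9 allows roughly 526 minutes a year."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

for pct in (99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% availability allows "
          f"{allowed_downtime_minutes(pct):.1f} minutes of downtime a year")
```

The gap between 99% (over three and a half days a year) and 99.999% (around five minutes) illustrates why the cost of each extra ‘nine’ rises so steeply.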

Ultimately, if the downside of downtime is serious enough, organisations can adopt one approach above all to improve performance: eliminate any single point of failure. High availability systems need high levels of redundancy for the occasions when something does go wrong, or something unforeseen throws a spanner in the works.

Some service providers offer ‘failover’ in the event of sudden downtime, using multiple data centres. Data network connectivity can also be protected by routing traffic via more than one provider. Broadband provision can be delivered via fixed line or mobile networks – if the usual connection goes down, mobile routers can very effectively fill any gaps in service and maintain availability.
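The fixed-line-to-mobile fallback described above can be sketched in a few lines: try each connection in order and fail over to the next when one is unavailable. The provider names and the health-check callable here are hypothetical placeholders for illustration, not any real provider’s API.

```python
from typing import Callable

def connect_with_failover(providers: list[str],
                          is_healthy: Callable[[str], bool]) -> str:
    """Return the first healthy provider in priority order, mirroring a
    fixed-line to mobile-router fallback. Raises if all are down."""
    for provider in providers:
        if is_healthy(provider):
            return provider
    raise ConnectionError("all providers unavailable - no redundancy left")

# Example: the fixed line is down, so traffic fails over to mobile.
status = {"fixed-line": False, "mobile-4g": True}
print(connect_with_failover(["fixed-line", "mobile-4g"], status.get))
# -> mobile-4g
```

The design point is the one the article makes: the value is not in any single link, but in having a second, independent path ready before the first one fails.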

But is a bit of downtime really much of a problem? One recent study from ITIC found that 98% of large enterprises with more than 1,000 employees put the cost of a single hour of downtime at over $100,000. At the extremes, a third of enterprises in the study put the hourly cost at $1 million or more. After Amazon Web Services, a public cloud provider, suffered a service outage in February, an A-Z of web-based services and thousands of their users were affected. Various reports put the total cost of that outage to those involved in the region of $150 million. 

A Service & Communication Blind Spot 

So, for most businesses, failure of a high availability external service is likely to be costly. In the process, it could be the biggest customer service challenge they ever face. That’s why an inadequate public reaction to downtime of a key service can seriously compound the problem. Poor communication, slow response times, a lack of remedial action and an unwillingness to ‘make good’ turn a bad problem into a terrible one. Social media often gives detailed insight into how bad the reaction can be – what it doesn’t tell us is the effect of a high availability failure on the people affected internally.

In a competitive service-led tech environment, no-one will win any plaudits for arguing that service levels and availability can’t get better. All the pressure is focused on improving reliability. Ultimately, organisations that can balance the value of their high availability services against the impact of losing them, and then make the investment required to mitigate the risk, will be on a firm footing.