Data Center Downtime: Why Better Awareness Offers More than Redundancy
Manish Sharma, Vice President, Chief Technology and Chief Product Officer, Honeywell Building Technologies
February 15, 2022
Most people don’t even think about data centers until they go down. When they do, airlines are forced to cancel flights, hospitals can’t access insurance records and employees can’t pass through access-controlled doors at work. If you have trouble relating to any of these crises, how about suddenly finding your favorite social media channel offline? At the macro level, data centers play an ever more critical role in keeping the global economy productive. Closer to home, they support the conveniences of our daily life, from working remotely to shopping online.
It’s not just the cost of data center downtime — about $9,000 a minute on average — that weighs on facility managers. The ancillary outcomes of that downtime — from reputational damage to even potentially endangering lives — can often be more worrisome. A recent Honeywell survey of data center managers across the United States, China, Germany and Saudi Arabia found that 93% of respondents consider lockdown monitoring — their ability to stop building management systems in the event of a problem — a top concern. Nearly as many (91%) had experienced at least one downtime incident in the past 12 months, most of them caused by human error, a cyber or physical security breach, or an unplanned outage.
Running a modern data center is no simple task. As worldwide demand for capacity continues to climb, scaling becomes a major challenge across design, construction and operations. Not surprisingly, 73% of surveyed data center managers worry about keeping up with growing capacity needs, and 72% cite their ability to predict or quickly identify problems as a chief concern.
There is a way to help solve these problems: better integration. Many data centers aim for 99.999% uptime and need solutions that minimize the risk of downtime while meeting cost and energy-efficiency targets, without adding redundant systems or processes. Today, most data center teams must manually collect information from disparate systems, making holistic monitoring difficult if not impossible.
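To put the 99.999% target in perspective, the short sketch below converts an availability percentage into an annual downtime allowance and prices it at the roughly $9,000-per-minute average cited earlier. The function names are illustrative, and the calculation assumes a 365-day year and a flat per-minute cost, which real incidents rarely follow.

```python
# Minimal sketch: what an uptime target allows in downtime, and what that costs.
# Assumptions (not from the article): 365-day year, flat cost per minute.

MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes
COST_PER_MINUTE_USD = 9_000        # average downtime cost cited in the article


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


def downtime_cost_usd(availability_pct: float,
                      cost_per_minute: float = COST_PER_MINUTE_USD) -> float:
    """Annual cost of using up the full downtime allowance."""
    return allowed_downtime_minutes(availability_pct) * cost_per_minute


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(pct)
        cost = downtime_cost_usd(pct)
        print(f"{pct}% uptime: {minutes:.1f} min/year, about ${cost:,.0f}")
```

Under these assumptions, "five nines" leaves a little over five minutes of downtime a year, or roughly $47,000 at the average rate, which is why a single slow diagnosis can consume the entire annual allowance.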
Improving critical asset integration — across both information technology (IT) and operational technology (OT) — can give data center managers better situational awareness and deliver actionable intelligence. This enables them to quickly identify the root cause of an incident and spend more time improving their operations, ultimately maximizing uptime.
Case in point: Honeywell is piloting a site-level supervisor solution in one of its own data centers that provides this holistic visibility of data center operations. The solution is helping to improve overall situational awareness by aggregating mission-critical information in a single, centralized dashboard. It also helps operators reduce response times because they can quickly diagnose alarms, trace root causes and execute workflows to triage a situation. Overall, preliminary data show a 10% gain in productivity since the solution was installed.
Given the exponentially increasing demand for computing, data center managers are often forced to do more with less. Data centers help automate our daily lives. Shouldn’t data center facility managers have tools that help them automate as many of their own processes as possible?
- Uptime Institute, Annual outage survey 2021. Andy Lawrence, last updated April 16, 2021. [Accessed October 18, 2021]
- Honeywell Building Technologies report, Rethinking data centers as resilient, sustainable facilities. 2021 Building Trends Series, October 5, 2021. [Accessed October 18, 2021]