We don’t like to say the ‘D’ word out loud, but power outages, usually caused by overheating or equipment failure, are the main cause of dreaded ‘downtime’ in data centres.
Wherever the outage originates, the end result is the same: data loss, damaged files and destroyed equipment, which mean significant, sometimes catastrophic, financial losses and thousands of unhappy users.
Last summer, you may have noticed reports of data centres struggling during heatwaves and suffering painfully disruptive downtime. The Uptime Institute found that downtime was regularly costing data centres over £1 million. (https://dcnnmagazine.com/data-centres/data-centre-outages-crestchic-loadbanks/)
So, how can increasingly expensive and damaging downtime be prevented?
Data centres now build redundancy into their infrastructure, allowing critical systems to continue running in the event of an outage.
What is Data Centre Redundancy?
In short, data centre redundancy involves duplicating critical components of a system in order to improve reliability. A bit like a backup. Data centre redundancy tends to focus on how much spare power capacity is available as a backup during a power outage.
How can data centres plan a sufficient amount of redundancy in the event of unforeseen power outages?
Large businesses tend to keep their servers in Tier 3 and Tier 4 data centres, which offer higher performance and uptime guarantees than Tiers 1 and 2. However, each tier offers a different level of redundancy. Tier 3 usually offers N+1, while Tier 4 will provide 2N or 2N+1.
How Many N’s Do Data Centres Require?
In simple terms, N is the baseline amount of equipment needed to keep a data centre running at full load, with no spare capacity. Anything beyond N is redundancy.
N+1 is also called parallel redundancy and ensures that an uninterruptible power supply (UPS) system is always available. It’s like having one extra backup server for every ten, so that if a primary element fails or has to be removed for maintenance, an additional component starts running. N+1 backup solutions operate for a minimum of 72 hours in the event of local or region-wide outages. Yet N+1 is not a complete fail-safe, as it runs on one common circuit rather than its own separate feed.
How about 2N? Also known as N+N, this offers a fully redundant, wholly mirrored system with two independent systems so that in the event a primary component fails, an identical standby replica can stand in to continue operations.
And to really cover all bases, 2N+1 offers double cover plus one extra piece of equipment, so in the event of an extended outage, there is an extra backup component to cover a failure when the secondary system is running.
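The arithmetic behind these schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: N is taken to be the number of equally sized units needed to carry the full load, and the function names, parameters and figures are hypothetical rather than drawn from any standard or product.

```python
import math


def baseline_units(load_kw: float, unit_kw: float) -> int:
    """N: the number of units needed to carry the full load with no spare."""
    if load_kw <= 0 or unit_kw <= 0:
        raise ValueError("load_kw and unit_kw must be positive")
    return math.ceil(load_kw / unit_kw)


def installed_units(load_kw: float, unit_kw: float, scheme: str) -> int:
    """Total units installed under a given redundancy scheme (illustrative)."""
    n = baseline_units(load_kw, unit_kw)
    totals = {
        "N": n,             # no redundancy
        "N+1": n + 1,       # one spare unit
        "2N": 2 * n,        # a full mirror of the baseline
        "2N+1": 2 * n + 1,  # a full mirror plus one spare unit
    }
    if scheme not in totals:
        raise ValueError(f"unknown scheme: {scheme}")
    return totals[scheme]


# Hypothetical example: a 900 kW load served by 250 kW UPS modules.
for scheme in ("N", "N+1", "2N", "2N+1"):
    print(scheme, installed_units(900, 250, scheme))
```

Counting units is only part of the picture. As noted above, what separates 2N from N+1 is that the mirrored system runs on its own independent feed, which a simple count like this does not capture.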
Is Data Centre Redundancy the Only Solution?
Having a redundancy system in place is crucial for data centres, but there are other ways to prevent outages.
As we mentioned earlier, outages can be triggered by warmer weather. Many data centres look into alternative cooling methods, whether that be air cooling, liquid cooling or even migrating to cooler countries.
Having a handle on cooling can not only prevent downtime but also save huge amounts of money, considering the vast amounts of heat data centres produce.
Improper data centre infrastructure management (DCIM) and unreliable processes can lead to overheating, which is a leading cause of outages and wastes energy and money. When data centres don’t have accurate data about their systems and where obsolete equipment is located, they can even start overcooling.
Drops or surges in power can shut servers down or damage them.
UPS systems should be in place, but cooling systems can still be disrupted, which could lead to overheating.
Electrical usage sensors can track power consumption by the rack, enabling replacement of obsolete servers with more efficient equipment.
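As a rough sketch of how per-rack readings might be put to use, the Python below averages each rack's power samples and lists the racks whose average draw exceeds a chosen budget. The rack names, sample values and the 5 kW budget are made up for illustration, and the functions are hypothetical rather than part of any DCIM product.

```python
from statistics import mean


def rank_racks_by_draw(readings: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Return (rack, average kW) pairs, highest average draw first."""
    averages = {
        rack: mean(samples)
        for rack, samples in readings.items()
        if samples  # skip racks with no samples yet
    }
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


def racks_over_budget(readings: dict[str, list[float]], budget_kw: float) -> list[str]:
    """Racks whose average draw exceeds the budget: candidates for review."""
    return [rack for rack, avg in rank_racks_by_draw(readings) if avg > budget_kw]


# Hypothetical sensor samples in kW.
readings = {
    "rack-a": [4.0, 4.2, 4.4],
    "rack-b": [7.9, 8.1, 8.0],
    "rack-c": [2.0, 2.0],
}
print(racks_over_budget(readings, budget_kw=5.0))
```

A high average draw does not by itself mean the equipment is obsolete. It simply points to where a closer look at the servers in that rack may be worthwhile.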
Outages can also result from human error and from poor maintenance and management of servers. Data centres can help prevent such occurrences with DCIM that streamlines processes and can even predict when outages are about to happen, allowing faster reactions and prevention.
What else can DCIM do to prevent downtime? And can it work alongside data centre redundancy?
The Ultimate Team – DCIM and Data Centre Redundancy
On its own, redundancy can be a data centre saver. Likewise, used in isolation, DCIM offers innovative methods of measuring, monitoring and managing data centres, producing optimised environments and increased uptime.
Put the two together, and you have the benefits of a backup bolstered by the advantages of smart prevention software.
Other than cooling, UPS and streamlined processes, how does DCIM help ensure uptime? It offers complete visibility of consumption data, while trends in historical data make it more straightforward to plan for future energy consumption.
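To give a feel for what trending historical data can mean in practice, here is a minimal Python sketch that fits a straight line to evenly spaced consumption figures and estimates how many periods remain before a capacity limit is reached. It assumes a simple linear trend, which real consumption may not follow, and the function names and figures are hypothetical.

```python
import math


def linear_trend(values: list[float]) -> tuple[float, float]:
    """Least-squares slope and intercept for evenly spaced samples."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def periods_until_capacity(values: list[float], capacity: float):
    """Periods after the latest sample until the trend line reaches capacity.

    Returns None when the trend is flat or falling.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    crossing = (capacity - intercept) / slope  # index where the line hits capacity
    return max(0, math.ceil(crossing - (len(values) - 1)))


# Hypothetical monthly consumption in kW against a 160 kW limit.
history = [100, 110, 120, 130]
print(periods_until_capacity(history, capacity=160))
```

A straight-line fit is the simplest possible choice here. Seasonal patterns such as summer heatwaves would call for a richer model.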
It’s time to stay ahead of the risk of downtime with a fully efficient and optimised data centre. Implement Assetspire’s next-gen DCIM to monitor your assets and detect any potential issues before they become big problems, minimising disruptive downtime and data loss and saving money.
Gain real-time insight into capacity, current and past utilisation, and the management of energy sources. Assetspire’s smart DCIM software provides a full and accurate overview of all assets, so you can see exactly where energy-wasting, obsolete and outdated equipment is located, giving you the insight to repurpose or replace older assets and prevent downtime.