What is Failover?

Failover allows a standby system to take over automatically when a primary system stops working because of a disaster or major disruption. Failover is an essential component of all disaster recovery plans.

Why is Failover Important?

With a proper failover, businesses can keep their downtime and data loss to a tolerable minimum and stay within their Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Every hour of downtime puts business continuity at risk: you lose revenue, productivity, and brand trust that you may not be able to recover. The same goes for data loss. If systems go down at an inopportune moment, the consequences of losing that data could be irreversible.

Failover minimizes the impact of disruptive events. At a moment’s notice, your processes can automatically switch over to a redundant system, allowing you to return to business as usual.

Organizations have a failover plan mainly to keep their systems continuously available. Many also look for failover options that meet compliance standards set by their industry or company.

How Does Failover Work?

Most of the time, a failover operates automatically; however, it can be done manually. The biggest downside of manual failover is that any system that depends on human intervention will be inherently less reliable.

When performed as an automated process, failover relies on a signal to know when to switch over to the backup or redundant system. This is commonly done with what’s called a heartbeat monitor, which listens for signals that the primary and secondary systems send at regular intervals. If the heartbeat monitor stops receiving a signal from one of the systems, it is programmed to switch to the other system.
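
The heartbeat check described above can be sketched in a few lines of Python. This is a minimal illustration, not any product's implementation: the class name `HeartbeatMonitor`, the `timeout_seconds` threshold, and the "primary"/"secondary" labels are all assumptions made for the example, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
import time
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical monitor: fails over when the primary goes silent."""
    timeout_seconds: float  # assumed threshold; silence longer than this triggers failover
    active: str = "primary"
    last_beat: float = field(default_factory=time.monotonic)

    def record_heartbeat(self, now=None):
        """Call whenever a heartbeat arrives from the primary system."""
        self.last_beat = time.monotonic() if now is None else now

    def check(self, now=None):
        """Return the name of the system that should be serving traffic."""
        now = time.monotonic() if now is None else now
        silent_for = now - self.last_beat
        if self.active == "primary" and silent_for > self.timeout_seconds:
            # No heartbeat within the timeout: switch to the secondary system.
            self.active = "secondary"
        return self.active


monitor = HeartbeatMonitor(timeout_seconds=3.0, last_beat=0.0)
monitor.record_heartbeat(now=1.0)
print(monitor.check(now=2.0))  # primary (last heartbeat was 1 second ago)
print(monitor.check(now=5.0))  # secondary (4 seconds of silence exceeds the timeout)
```

In this sketch the monitor never switches back on its own; returning to the primary system is left as a separate, deliberate step.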

What is a Failover Cluster?

A failover cluster provides strength in numbers. A group of computers works together in a cluster to offer high application and service availability to the business. With multiple servers, if one fails, another is there to take its place, often with little or no downtime. Some systems may not require a cluster, but for more critical workloads, adding more reliability with a cluster can be an attractive option.
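
As a rough illustration of how a cluster keeps a service available, the sketch below picks the first healthy node from a priority-ordered list. The function name `choose_active_node`, the node names, and the health map are hypothetical; real cluster software also handles quorum, shared storage, and other concerns that are left out here.

```python
from typing import Optional


def choose_active_node(nodes: list[str], healthy: dict[str, bool]) -> Optional[str]:
    """Return the highest-priority healthy node, or None if every node is down.

    `nodes` is ordered by priority; `healthy` maps node name to health status.
    Nodes missing from `healthy` are treated as unhealthy.
    """
    for name in nodes:
        if healthy.get(name, False):
            return name
    return None


cluster = ["node-a", "node-b", "node-c"]  # hypothetical node names
print(choose_active_node(cluster, {"node-a": True, "node-b": True, "node-c": True}))   # node-a
print(choose_active_node(cluster, {"node-a": False, "node-b": True, "node-c": True}))  # node-b
```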

What are Different Types of Failovers?

Failovers can be conducted by a human, a machine, or a combination, and can include hardware, software, and networks.
In each type of failover, the process looks the same, but the pieces involved are different. When a primary system fails, a hardware failover involves an automatic switch to redundant hardware, a software failover is concerned with switching to a redundant software component, and a network failover will route to a backup network connection.
As previously mentioned, failovers can be completed manually, where a human switches over to a redundant system on demand, or automatically, which can be more reliable. To guard against the automated method itself failing, businesses can also choose a hybrid failover system, where both manual and automated options are available.

What is Failover Testing?

Businesses want to be sure the failover works in times of crisis, and that peace of mind is best accomplished through testing. By simulating a failure during testing, organizations can observe how the system responds and confirm it will work as intended in an actual emergency.

Testing regularly allows businesses to identify problems in the failover process before they surface when it matters most. A test can be run by a software tool, or it can be triggered manually. Failover testing can include:

  • Cold failover: The secondary system is not running until the primary system fails in simulation. The test observes whether the backup starts up and recovers from the failure.
  • Warm failover: Unlike cold failover testing, the secondary system is already running and ready to jump in when the primary system fails in simulation.
  • Hot failover: With hot failover testing, the secondary system is already running and is also completely synchronized with the primary system.
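
To make the difference between these test types concrete, the following sketch estimates recovery time for each standby mode and compares it with an RTO. Every number here is invented for illustration: the startup delays in `STANDBY_STARTUP_SECONDS`, the detection time, and the RTO are assumptions, not measurements, and the function `simulate_failover_test` is hypothetical. A real test would time an actual simulated failure instead.

```python
# Assumed time (in seconds) for the secondary system to become ready in each mode.
# These values are made up for the example.
STANDBY_STARTUP_SECONDS = {
    "cold": 600.0,  # secondary must be started from scratch
    "warm": 60.0,   # secondary is running but still has to catch up
    "hot": 0.0,     # secondary is running and fully synchronized
}


def simulate_failover_test(mode: str, detection_seconds: float, rto_seconds: float) -> dict:
    """Estimate recovery time for a simulated primary failure and compare it to the RTO."""
    recovery_seconds = detection_seconds + STANDBY_STARTUP_SECONDS[mode]
    return {
        "mode": mode,
        "recovery_seconds": recovery_seconds,
        "met_rto": recovery_seconds <= rto_seconds,
    }


for mode in ("cold", "warm", "hot"):
    result = simulate_failover_test(mode, detection_seconds=5.0, rto_seconds=300.0)
    print(result["mode"], result["recovery_seconds"], result["met_rto"])
```

With these assumed figures, only the cold configuration misses the five-minute RTO, which is the kind of gap a regular failover test is meant to expose.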

Does TierPoint Ensure Failover?

TierPoint’s disaster recovery services keep failover in the foreground. Our Disaster Recovery as a Service (DRaaS) provides failover to private, multicloud, and public cloud services when primary systems fail.

Learn more in our Strategic Guide to Disaster Recovery and DRaaS.
