RPO and RTO impact on sizing

This article explains how to apply the RPO and RTO concepts in business continuity planning.

The recovery point objective (RPO) and the recovery time objective (RTO) are crucial factors when selecting a strategy, choosing technical solutions, and sizing your system to ensure business continuity.

Recovery point objective (RPO)

How much data can my business afford to lose?

The RPO is the foundation of any Exasol business continuity plan, because it defines the point in time to which you will be able to restore your system after an outage. Any data generated after the recovery point will be lost.

An important factor for determining the RPO is how your system handles input data, specifically whether or not the data can be stored for a period of time. There are essentially two scenarios to consider based on how input data is handled:

Input data is not stored

All data that was generated between the start of the last successful backup and the disaster event is lost. The database will be restored to the state it was in when the last successful backup started.

The RPO determines how much data can be lost, and therefore determines the maximum amount of time that can pass between the start of the last successful backup and a disaster event. This in turn determines how frequently backups should run.
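The link between the RPO and the backup schedule can be sketched in a few lines of Python. This is an illustration only, not an Exasol tool: the function names are hypothetical, and the sketch assumes that backups start at a fixed interval and always take the same time to complete. Under those assumptions, the worst case is a disaster that strikes just before a running backup finishes, so the last successful backup started one full interval plus one backup duration earlier.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta, backup_duration: timedelta) -> timedelta:
    """Longest gap between the start of the last successful backup and a disaster.

    Assumption: backups start every `backup_interval` and each one takes
    `backup_duration` to complete.
    """
    return backup_interval + backup_duration


def max_backup_interval(rpo: timedelta, backup_duration: timedelta) -> timedelta:
    """Largest interval between backup starts that still meets the RPO."""
    interval = rpo - backup_duration
    if interval <= timedelta(0):
        raise ValueError("RPO cannot be met: one backup takes at least as long as the RPO")
    return interval


# Hypothetical figures: a 4-hour RPO and backups that take 30 minutes to complete.
interval = max_backup_interval(timedelta(hours=4), timedelta(minutes=30))
print(interval)  # 3:30:00
```

If the computed interval is impractically short, that is a signal to consider a different strategy, such as more frequent incremental backups.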

Input data can be stored

In this scenario, input data can be stored for a certain amount of time, for example in external systems. This adds a step to the recovery: once the last backup has been restored, all input transactions that were not yet loaded must also be loaded into the database.

In this scenario, the maximum backup age can be calculated using two factors:

The amount of time that input transactions can be stored.
How long backups take to be restored.

The following equation can be used to calculate the maximum backup age:

Max Backup Age = Input Data Storage Time - Backup Restore Time

For example, if input data can be stored for a maximum of 8 hours and a backup restore takes 2 hours to complete, then the maximum backup age is 6 hours.

The total backup restore time includes the time needed to access the backups and to start the backup recovery process.
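The calculation can also be expressed as a small Python function. This is a minimal sketch: the function and parameter names are hypothetical, and all values are given in hours.

```python
def max_backup_age(input_storage_hours: float, restore_hours: float) -> float:
    """Max Backup Age = Input Data Storage Time - Backup Restore Time.

    `restore_hours` is the total restore time, including the time needed
    to access the backups and to start the recovery process.
    """
    age = input_storage_hours - restore_hours
    if age <= 0:
        raise ValueError("Input data cannot be stored long enough to cover the restore")
    return age


# The example from the text: 8 hours of input storage, 2 hours to restore.
print(max_backup_age(8, 2))  # 6
```

A result of zero or less is treated as an error here, because it would mean that stored input data expires before the restore can complete.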

For both scenarios, the shorter your RPO is, the less time can pass between backups. This could mean more frequent backups, or even instant data mirroring. The strategy you choose must be taken into account when dimensioning the hardware and network for your Exasol system.

Recovery time objective (RTO)

What is the maximum amount of time that the system can be down or running with limited capacity?

For an Exasol database, the RTO is the time from the point of disaster until the last successful database backup is restored. This period includes the time it takes to access the backups as well as the time for the restore itself. If you are restoring several backups, such as a full backup and several incremental backups, the total time needed to restore all of them must be taken into account.
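As a rough illustration, the total restore time can be estimated by adding the access time to the restore time of each backup in the chain and comparing the result with the RTO. The function names and the figures below are hypothetical, and the sketch assumes that the backups are restored one after another.

```python
from datetime import timedelta


def total_restore_time(access_time: timedelta, restore_times: list[timedelta]) -> timedelta:
    """Time to access the backups plus the time to restore each one in sequence."""
    return access_time + sum(restore_times, timedelta(0))


def meets_rto(rto: timedelta, access_time: timedelta, restore_times: list[timedelta]) -> bool:
    """True if the whole restore chain fits within the RTO."""
    return total_restore_time(access_time, restore_times) <= rto


# Hypothetical figures: 20 minutes to access the backups, a full backup that
# takes 90 minutes to restore, and two incrementals of 15 and 10 minutes.
restores = [timedelta(minutes=90), timedelta(minutes=15), timedelta(minutes=10)]
print(total_restore_time(timedelta(minutes=20), restores))  # 2:15:00
print(meets_rto(timedelta(hours=2), timedelta(minutes=20), restores))  # False
```

In this example the restore chain exceeds a 2-hour RTO by 15 minutes, so the plan would need shorter access times, fewer incremental backups to replay, or a longer RTO.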

Depending on your backup strategy, it might be possible to do a non-blocking restore of an Exasol database. Although this is slower than a blocking restore, a non-blocking restore makes it possible to use the system at a reduced capacity while the data restore is taking place.

The time taken for decision-making after a disaster (such as deciding whether or not to do a restore) should not be part of the RTO.

How RPO and RTO interrelate

When you devise a business continuity plan, it is important to understand how the variables involved are interrelated. The following examples show how the factors described above can affect each other.

RPO and RTO: These two concepts are closely linked. The RTO determines the maximum amount of time that you have to restore the last successful backup. This influences the RPO, which determines the maximum backup age.
RPO, RTO, and backup strategy: The RTO takes into account the amount of time required for a backup restore. The restore time depends on the size of the backups, which in turn is determined by the backup frequency, and on how many backups (full and incremental) there are to restore. A shorter RTO means that backups have to be more frequent, which also means that the recovery point is more recent.