Blog Backups

Are Backups Still Useful?

 

All of us ICT professionals have been brought up with the notion that we should regularly take backups of all our data collections. This should be done so that we can restore a data collection in case something goes wrong. But is this still true?

 

From an architectural point of view, I distinguish several situations in which we take backups and, in this blog, I try to find out how useful the backup still is in each of them.

 

But first, let's take a look at the current application and data landscape as it is unfolding in many organisations. Every organisation has different applications in place to support different parts of its end-to-end processes. Bigger organisations even have more than one application for a specific task, as there are independent stakeholders with their own specific requirements. The latest development we see is cloud computing (SaaS-driven solutions) with its own off-premise data collections. In the architecture we have been busy linking these applications via batch interfaces, ESBs, messages, API calls, triggers and so on, to support the user processes. And, from a user perspective but also from a data management perspective, we have achieved big successes by reducing the number of manual input transactions, creating continuity and uniformity at data element level and (working towards) achieving the "one time only" and "one truth" principles. "One time only" is, for me, the principle that the user organisation only has to enter a data element once; thereafter, propagation through applications and processes is automated. If done correctly, this also means "one truth" is achieved: when reporting on a data element, every application talks the same language. (I admit, I am taking a shortcut here.)

 

So, from an architectural point of view, we are always talking about a chain of data collections, which should remain in sync at all times.

 

Now, back to the question at hand: Are backups still useful?

 

So first we have to answer the question: for what purpose are we taking the backup?

1. To restore the data in case a conversion goes wrong (technical reason, short-term restore)

2. To prove that, at a certain point in time, our data collection was in a certain state (especially for the accountants)

3. To restore the data in case a mechanical error occurs (disk failure)

4. To restore the data in case a logical error occurs (application or SQL related)

 

Ad 1:

In these cases we usually freeze the environment, take the backup, run the conversion, perform tests and find out whether the conversion was successful; if not, we restore the data collection (and application) before unfreezing. So far so good: in these cases there is no problem restoring, as no external activity has changed the data collection. After the tests have been run, the backup can be discarded, as (from a processing point of view) can all previous backups, since they no longer match the data schema.
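To make the steps concrete, here is a minimal Python sketch of this freeze, backup, convert, test and restore cycle. It is only an illustration: the data collection is modelled as a plain dictionary, and the names `convert_with_fallback`, `split_name` and `no_empty_last_names` are hypothetical, not part of any real tool.

```python
import copy


def convert_with_fallback(collection, conversion, checks):
    """Run a conversion on a frozen data collection; fall back to the backup on failure.

    Returns (resulting collection, True if the conversion was kept).
    """
    backup = copy.deepcopy(collection)   # taken after the freeze, so nothing else changes it
    try:
        conversion(collection)           # may change both data and schema
        if all(check(collection) for check in checks):
            return collection, True      # success: the backup can now be discarded
    except Exception:
        pass                             # a crashed conversion is treated like a failed test
    return backup, False                 # failure: hand back the untouched backup


# Hypothetical conversion: split "name" into "first_name" and "last_name".
def split_name(collection):
    for row in collection["rows"]:
        first, last = row.pop("name").split(" ", 1)
        row["first_name"], row["last_name"] = first, last
    collection["schema_version"] += 1


# Hypothetical post-conversion test.
def no_empty_last_names(collection):
    return all(row["last_name"] for row in collection["rows"])


customers = {"schema_version": 1, "rows": [{"name": "Ada Lovelace"}, {"name": "Plato"}]}
result, kept = convert_with_fallback(customers, split_name, [no_empty_last_names])
print(kept, result["schema_version"])    # the single-word name breaks the conversion
```

The restore is safe here only because the collection is frozen: nothing outside the conversion has touched it between the backup and the restore.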

 

Ad 2:

This is a valid reason to create a backup and keep it. However, when we need to provide proof, we do not restore the data collection in our production environment but in a dedicated temporary environment.

 

Ad 3:

This is one of the traditional reasons to take backups and keep them at remote locations, in fireproof vaults and so on. When designing the backup schedules we usually also incorporate some KPIs, such as a maximum of one day's work lost for 6 FTE (old-school approach). Nowadays we are even able to keep the transaction logs and replay all transactions from those logs. However, if these transactions activate triggers that send data to, or perform actions in, another environment, the replay introduces duplicate actions and makes the data inconsistent.
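The duplicate-action problem can be shown with a small Python sketch. Everything in it is a simplifying assumption: the `Downstream` class stands in for another application in the chain that is not restored, and `apply_transactions` plays the role of a table with an outbound trigger. The `deduplicate` switch shows one commonly used mitigation (the receiver remembers transaction ids it has already processed); it is included for contrast and is not something the backup itself provides.

```python
class Downstream:
    """Hypothetical receiving application that is NOT restored with the source."""

    def __init__(self, deduplicate=False):
        self.deduplicate = deduplicate
        self.seen_ids = set()
        self.received = []

    def receive(self, tx_id, payload):
        if self.deduplicate and tx_id in self.seen_ids:
            return                       # already processed before the outage
        self.seen_ids.add(tx_id)
        self.received.append(payload)


def apply_transactions(log, table, downstream):
    """Apply each logged transaction; the 'trigger' forwards it downstream."""
    for tx_id, key, value in log:
        table[key] = value
        downstream.receive(tx_id, (key, value))   # fires on every apply, including replays


log = [(1, "order-17", "paid"), (2, "order-18", "shipped")]

for dedup in (False, True):
    downstream = Downstream(deduplicate=dedup)
    table = {}
    apply_transactions(log, table, downstream)    # normal processing
    table = {}                                    # disk failure: restore an (empty) backup
    apply_transactions(log, table, downstream)    # replay the transaction log
    print(dedup, len(downstream.received))        # False 4, then True 2
```

Without deduplication the downstream application receives every message twice, although the restored source looks perfectly correct on its own.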

Other and better solutions for these issues are the high availability and disaster recovery solutions provided by the main DBMS platforms; however, due to their cost, they are not used for all data collections.

I am no longer convinced that backup and restore processes are suitable for solving these kinds of issues and outages.

 

Ad 4:

These are the hardest problems to solve. Usually we start by (partially) restoring a backup from just before the logical problem started and then try to roll forward all transactions. This approach takes a lot of resources and often breaks the chain of data collections we have created. I do not have an answer on how to solve this, but (partially) restoring a backup does not seem the right way to go.
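As a rough illustration of this restore and roll-forward approach, consider the Python sketch below. It assumes a very simple model: the backup is a dictionary, the log is a list of `(transaction id, key, new value)` tuples, and a hypothetical `is_faulty` function tells us which transactions caused the logical error. In practice, identifying those transactions is usually the hard part.

```python
def restore_and_roll_forward(backup, log, is_faulty):
    """Rebuild a data collection from a backup, replaying every logged
    transaction except the ones identified as faulty."""
    table = dict(backup)                 # restore: start from the pre-error state
    skipped = []
    for tx_id, key, value in log:        # roll forward in the original order
        if is_faulty(tx_id):
            skipped.append(tx_id)        # leave the faulty change out
            continue
        table[key] = value
    return table, skipped


backup = {"A": 100, "B": 200}            # state just before the logical error
log = [
    (1, "A", 110),                       # valid change
    (2, "B", 0),                         # logical error: faulty update wiped B
    (3, "C", 50),                        # valid change made after the error
]

table, skipped = restore_and_roll_forward(backup, log, lambda tx_id: tx_id == 2)
print(table, skipped)                    # {'A': 110, 'B': 200, 'C': 50} [2]
```

Even in this toy version the weak spots are visible: the whole log has to be replayed, any later transaction that was based on the faulty value is replayed unchanged, and other applications that already received transaction 2 are not corrected.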

 

So my question remains: are backups still useful, and if so, how should we use them in the last two scenarios?