Veeam: Active Full Backup vs Synthetic Full Backup

Hello everyone! I’m back to talk about Veeam, this time with a comparison of Active Full Backup and Synthetic Full Backup. We will look at the differences between the two backup methods in terms of capacity and of performance impact on Production Storage and on the Backup Repository.
First, let’s look at how each backup method works.

Active Full Backup

Every time you configure a Backup Job with Veeam, regardless of the backup method selected, the first execution of the Job always generates a full backup of the VMs associated with the Job. More specifically, this first backup is an Active Full Backup.
Similarly, when we set up a Backup Job with the Forward Incremental method, we can schedule a periodic full backup to prevent the backup chain from becoming too long, which also makes it easier to enforce the configured retention policy. (See the previous post, where we discuss these policies in detail.)
An Active Full Backup produces a complete backup of one or more VMs. Veeam reads the data for this backup entirely from the Production Storage that hosts the VMs we want to back up. This data is then compressed, deduplicated and stored in the selected repository, in a file with the .VBK extension.

NOTE: An Active Full Backup generates only sequential writes on the Backup Repository, so it is a good method when the Backup Repository performance is constrained.

Let’s see the process of creating an Active Full Backup on a Job with daily incremental backups, where the Active Full Backup is set to run on Saturday:
  • The first run of the Backup Job will generate an Active Full Backup, getting all the data from the Production Storage.

  • During the following days, the Job will generate incremental backups as configured in the Job.

  • On Saturday, as scheduled, a new Active Full Backup is generated, obtaining again all the data from the Production Storage.

One point you must also take into consideration is that every time you trigger a new Active Full Backup in a Backup Job, the current incremental backup chain is reset. That is, all incremental backups generated after the Active Full Backup will use it as the new starting point of the backup chain. The previous backup chain (Full Backup + Incremental Backups) remains in the Repository until it is automatically deleted according to the retention policy configured in the Backup Job.
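The schedule and the chain reset described above can be sketched in a few lines of Python. This is only a toy model, not Veeam code: the function names and the start date are hypothetical, and retention is not modelled at all.

```python
from datetime import date, timedelta
from typing import List, Tuple

SATURDAY = 5  # date.weekday(): Monday is 0, Saturday is 5

RestorePoint = Tuple[date, str]


def plan_restore_points(first_run: date, days: int) -> List[RestorePoint]:
    """Return (date, file type) for a daily job with an Active Full on Saturdays."""
    points = []
    for offset in range(days):
        day = first_run + timedelta(days=offset)
        # The first run is always a full; every Saturday is a scheduled full.
        is_full = offset == 0 or day.weekday() == SATURDAY
        points.append((day, "VBK" if is_full else "VIB"))
    return points


def split_into_chains(points: List[RestorePoint]) -> List[List[RestorePoint]]:
    """Group restore points into chains; every full (VBK) starts a new chain."""
    chains: List[List[RestorePoint]] = []
    for day, kind in points:
        if kind == "VBK":
            chains.append([])
        chains[-1].append((day, kind))
    return chains


# Hypothetical start date: 2024-01-01 is a Monday.
for chain in split_into_chains(plan_restore_points(date(2024, 1, 1), 10)):
    print(" -> ".join(f"{day.isoformat()} {kind}" for day, kind in chain))
```

Each printed line is one backup chain: a .VBK followed by the .VIB files that depend on it.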
You can configure a Backup Job to run an Active Full Backup periodically.  Similarly, you can request the creation of a new Active Full Backup in a Backup Job at any time, running it manually from the Veeam Backup Server console.

Synthetic Full Backup

We saw that in an Active Full Backup, all the data of the VM we want to back up is obtained from the Production Storage (Datastore). Sometimes, running an Active Full Backup periodically is not an option, mainly because it puts a heavy load on the resources of the production infrastructure and consumes a significant amount of bandwidth in the SAN or LAN (depending on the transport mode used).
On the other hand, a Synthetic Full Backup uses the data that is already in a Backup Repository to “synthesize” a Full Backup without having to connect to Production Storage (Datastores) in order to get all the data required to create the backup. In this case, Veeam uses the existing backup chain on the Repository (Full Backup + Incremental Backups), consolidates data of VMs from this backup chain, and writes the consolidated data into a new Backup file (VBK).
From the point of view of the backed-up data, a Synthetic Full Backup is identical to an Active Full Backup: in both cases we get a .vbk file containing all the data of the backed-up VMs. The difference between a Synthetic Full Backup and an Active Full Backup basically comes down to how the data to be backed up is obtained.
NOTE: The first backup of any Backup Job is ALWAYS an Active Full Backup, regardless of the configuration of periodic Synthetic Full Backups.
Using Synthetic Full Backup comes with some considerations:
  • Because most of the data is obtained from existing backups in the repository, the use of network resources (SAN or LAN) is reduced. Remember that an Active Full Backup obtains all the VM data from the Datastore using the SAN or LAN, depending on the transport mode used.
  • Similarly, because most of the data does not have to be read from the Production Storage, the impact on this Storage during backup operations is reduced.
  • On the other hand, a Synthetic Full Backup puts a greater load on the Backup Repository. If the performance (IOPS / latency) of the storage device used to store backups is limited (which happens very often), creating a Synthetic Full Backup can take a considerable amount of time (see the post about the impact of backup methods on performance) and can also affect other Backup Jobs running at the same time as the Synthetic Full.
  • Creating Synthetic Full Backups on deduplication appliances used as a Repository is also not recommended, because this kind of appliance is not optimized for read operations, and a Synthetic Full Backup requires reading the previous backup files to “synthesize” a new full backup.
    • In a deduplication appliance, a data block must be re-hydrated before it can be read. This means the appliance must undo the deduplication process applied to that data block.
    • This adds latency to read operations and, as a result, the time required to create a Synthetic Full Backup rises dramatically.
  • The exception to the previous point is if you use a Deduplication Appliance with native integration with Veeam:
    • EMC Data Domain with DDBoost
    • HPE StoreOnce with Catalyst
    • ExaGrid.
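To make the trade-off in the list above more concrete, here is a rough back-of-the-envelope model in Python. It is a simplification under stated assumptions, not a sizing tool: it ignores compression and deduplication, assumes the synthetic full reads roughly one full's worth of data from the repository, and all names and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IoEstimate:
    read_from_production_gb: float
    read_from_repository_gb: float
    written_to_repository_gb: float


def active_full_io(full_gb: float) -> IoEstimate:
    # All VM data is read from Production Storage and written once,
    # sequentially, to the repository.
    return IoEstimate(full_gb, 0.0, full_gb)


def synthetic_full_io(full_gb: float, daily_change_gb: float) -> IoEstimate:
    # Only that day's incremental is read from Production Storage.
    # The full is then rebuilt from data already in the repository, so the
    # repository has to serve the reads as well as absorb the writes.
    return IoEstimate(
        read_from_production_gb=daily_change_gb,
        read_from_repository_gb=full_gb,
        written_to_repository_gb=daily_change_gb + full_gb,
    )


# Hypothetical job: 1000 GB of VM data, 50 GB of daily changes.
print(active_full_io(1000.0))
print(synthetic_full_io(1000.0, 50.0))
```

The model shows the shift: the synthetic full moves almost all of the read load from Production Storage (and the SAN or LAN) to the Backup Repository.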
Let’s see how the creation of a Synthetic Full Backup takes place on a Job with daily incremental backups, where the Synthetic Full Backup is set to run on Saturday.
  • On the day the Synthetic Full Backup is scheduled (Saturday), Veeam runs the Backup Job as usual to generate the incremental backup scheduled for that day.
    • During the execution of the Job, Veeam creates this incremental backup in the regular way, that is, it obtains the data from the Production Storage (Datastore) and adds a new incremental backup (VIB) to the existing backup chain.
    • This incremental backup allows Veeam to make sure the Synthetic Full backup includes the latest changes in the VM that we are backing up.

  • After the Job session ends, Veeam will build a new Synthetic Full Backup using the backup files that are already available in the repository, also adding the data obtained as part of the incremental backup indicated above.

  • Additionally, in this step the incremental backup file (VIB) created at the start of the session is deleted, and only the Synthetic Full Backup (VBK) remains in its place, starting a new backup chain.

  • The new incremental backups created later by the Backup Job are associated with this new Synthetic Full Backup in the same backup chain, until a new full backup (Active Full or Synthetic Full) is created.
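The consolidation step can be illustrated with a minimal Python sketch. It assumes a toy model in which each backup file is a mapping from block number to block contents; real .VBK and .VIB files are far more complex, and the function name is hypothetical.

```python
from typing import Dict, List

Blocks = Dict[int, bytes]  # block number -> block contents (toy model)


def synthesize_full(full: Blocks, incrementals: List[Blocks]) -> Blocks:
    """Merge a full backup with its incrementals, oldest first.

    For every block, the most recent version wins. Nothing is read from
    Production Storage: all inputs already sit in the repository.
    """
    synthetic = dict(full)  # copy, so the original full is left untouched
    for incremental in incrementals:
        synthetic.update(incremental)
    return synthetic


# Hypothetical chain: one full, then three daily incrementals.
full = {0: b"A", 1: b"B", 2: b"C"}
incrementals = [{1: b"B1"}, {2: b"C2", 3: b"D"}, {1: b"B3"}]
print(synthesize_full(full, incrementals))
```

The result holds the latest version of every block, which is the same content an Active Full Backup taken at that moment would contain.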

 

Veeam treats a Synthetic Full Backup the same way as an Active Full Backup. Thus, when a Synthetic Full Backup is created in a Job, the backup chain is reset, which means that all subsequent incremental backups are associated with this new Full Backup created “synthetically”. The previous backup chain (Full Backup + Incremental Backups) remains in the Repository until it is automatically deleted according to the retention policy configured in the Backup Job.

To create Synthetic Full Backups, simply enable the “Create synthetic full backups Periodically” option on a Backup Job and set the schedule on which this backup should be created.

Wrapping up

As we have seen, both types of backup allow us to generate a Full Backup, containing exactly the same information from the Backed Up VM, and resetting the backup chain. The difference lies in how the data is obtained, and the impact that this generates on storage, either on Production Storage or on the Backup Repository.
Active Full Backups are a good fit when:
  • The backup repository performance is constrained, so the additional load generated by creating a Synthetic Full Backup would have a negative impact on the repository and on the time required to complete backup operations.
  • The backup repository is a deduplication appliance that does not have native integration with Veeam (the integrated ones being EMC Data Domain with DDBoost, HPE StoreOnce with Catalyst, and ExaGrid).
By contrast, a Synthetic Full Backup has advantages in situations where:
  • The performance of Production Storage is an important concern, so you need to minimize the impact of running backups in order to avoid affecting the company’s business services.
  • The available bandwidth in the SAN or LAN (depending on the transport mode used) is limited, which requires reducing the amount of data transmitted during backup operations.
    • This point is also important when you need to send a backup to a remote location where bandwidth is limited.
To summarise, both are valid alternatives for creating a full backup, but you should choose the method carefully, depending on the characteristics of the platform you are protecting and of the backup infrastructure you have implemented.
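The guidance above can be condensed into a small decision helper. This is only a sketch: the field names, the returned labels and especially the order in which the rules are checked are assumptions made for illustration, not a Veeam recommendation engine.

```python
from dataclasses import dataclass
from typing import Optional

# Appliances named in this post as having native Veeam integration.
INTEGRATED_APPLIANCES = {
    "EMC Data Domain with DDBoost",
    "HPE StoreOnce with Catalyst",
    "ExaGrid",
}


@dataclass
class Environment:
    repository_is_slow: bool
    dedup_appliance: Optional[str]  # None if the repository is not a dedup appliance
    production_storage_is_sensitive: bool
    bandwidth_is_limited: bool


def recommend_full_method(env: Environment) -> str:
    """Return "active", "synthetic" or "either" (hypothetical labels)."""
    # Assumption: repository-side constraints are checked first, because a
    # synthetic full that cannot finish in time helps nobody.
    if env.dedup_appliance is not None and env.dedup_appliance not in INTEGRATED_APPLIANCES:
        return "active"
    if env.repository_is_slow:
        return "active"
    if env.production_storage_is_sensitive or env.bandwidth_is_limited:
        return "synthetic"
    return "either"


print(recommend_full_method(Environment(False, "ExaGrid", True, False)))
```

When both sides are constrained (for example, a slow repository and limited bandwidth), this sketch favours the Active Full; in a real environment that conflict needs to be measured and weighed rather than decided by a fixed rule.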

