In light of ongoing bugs with VMware’s vStorage APIs for Data Protection (VADP), I figured it was worth revisiting the topic of VADP versus agent based backups.
VADP gives backup products the ability to kick off snapshots and use Changed Block Tracking (CBT) to perform incremental style backups. This improves the efficiency of backup solutions by reducing both the impact (performance: think storage, network and compute overheads) and the duration (backup window) of backup jobs.
But the problem is, there have now been several instances of VADP bugs in recent years which have meant incremental backups lacked integrity because the changed blocks were not correctly reported.
Here is a list of some of the VADP related issues/bugs:
- Backups with Changed Block Tracking can return incorrect changed sectors in ESXi 6.0 (2136854)
- Backing up a virtual machine with Changed Block Tracking (CBT) enabled fails after upgrading to or installing VMware ESXi 6.0 (2114076)
- Changed Block Tracking (CBT) on virtual machines (1020128)
- Enabling or disabling Changed Block Tracking (CBT) on virtual machines (1031873)
- Changed Block Tracking is reset after a storage vMotion operation in vSphere 5.x (2048201)
- When Changed Block Tracking is enabled in VMware vSphere 5.x, vMotion migration fails with error: The source detected that the destination failed to resume (2086670)
- QueryChangedDiskAreas API returns incorrect sectors after extending virtual machine VMDK file with Changed Block Tracking (CBT) enabled (2090639)
From the above (albeit limited) list of VADP related issues, we can see that there are issues related to the integrity of VADP CBT as well as operational considerations (limitations) when using CBT, such as CBT being reset after a Storage vMotion and vMotion operations failing.
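To make the integrity risk concrete, here is a minimal Python sketch of why an incremental backup is only as good as the changed block list it is given. It is a toy model and not VADP itself: the block size, the function names (`incremental_backup`, `restore`) and the sample data are all assumptions made up for illustration.

```python
from typing import Dict, List, Set

BLOCK_SIZE = 4  # bytes per block; unrealistically small, for illustration only


def split_blocks(data: bytes) -> List[bytes]:
    """Split a disk image into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def incremental_backup(disk: bytes, reported_changed: Set[int]) -> Dict[int, bytes]:
    """Copy only the blocks the (hypothetical) change tracker reports as changed."""
    blocks = split_blocks(disk)
    return {index: blocks[index] for index in sorted(reported_changed)}


def restore(full_backup: bytes, increments: List[Dict[int, bytes]]) -> bytes:
    """Rebuild the disk by replaying each incremental over the full backup."""
    blocks = split_blocks(full_backup)
    for increment in increments:
        for index, block in increment.items():
            blocks[index] = block
    return b"".join(blocks)


full = b"AAAABBBBCCCCDDDD"     # state at the time of the full backup
current = b"AAAAXXXXCCCCYYYY"  # blocks 1 and 3 have changed since then

# Correct change list: the restore matches the live disk.
good = restore(full, [incremental_backup(current, {1, 3})])

# Under-reported change list (block 3 missed): the backup job still succeeds,
# but the restored disk silently contains stale data.
bad = restore(full, [incremental_backup(current, {1})])

print(good == current)  # True
print(bad == current)   # False
```

The nasty part is that nothing fails at backup time. The missing block only shows up when you restore, which is exactly when you can least afford it.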
So while VADP in theory has its advantages, should it be used in production environments?
At this stage I am highlighting to customers the risks associated with using VADP and, where required and possible, mitigating the issues.
But what about good ol’ agent based backups?
Agent based backups have a bad rap, in my opinion, mainly because of 3-Tier solutions and the fact that backup windows take a long time due to contention in the storage network, controllers and back end disk.
Now people ask me all the time, how can we do backups on Nutanix? The answer is, you have numerous (very good) options without using VADP (or for non-vSphere customers).
Using a product like Commvault, in-guest agents can be deployed and managed centrally, removing much of the administrative overhead (downside) of agent based backups.
Then by configuring incremental forever backups, Commvault manages the changed block tracking (regardless of hypervisor) and can even do source side deduplication and compression before sending the deltas over the network to the Commvault Media Agent (i.e. the backup server).
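As a rough illustration of the idea (and definitely not how Commvault implements it), here is a minimal Python sketch of agent-side change detection with source side deduplication and compression. The fixed block size, the `build_delta` function and the `known_to_server` set are hypothetical names invented for this example; it only relies on the standard `hashlib` and `zlib` modules.

```python
import hashlib
import zlib
from typing import List, Optional, Set, Tuple

BLOCK_SIZE = 4096  # assumed fixed block size for this sketch

# One delta entry: (block index, block hash, compressed payload or None if deduplicated)
DeltaEntry = Tuple[int, str, Optional[bytes]]


def build_delta(data: bytes, previous_hashes: List[str],
                known_to_server: Set[str]) -> Tuple[List[DeltaEntry], List[str]]:
    """Return what must cross the network, plus the hashes to keep for the next run."""
    delta: List[DeltaEntry] = []
    new_hashes: List[str] = []
    for offset in range(0, len(data), BLOCK_SIZE):
        index = offset // BLOCK_SIZE
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        new_hashes.append(digest)
        # Change tracking done by the agent: skip blocks unchanged since the last backup.
        if index < len(previous_hashes) and previous_hashes[index] == digest:
            continue
        if digest in known_to_server:
            # Source side deduplication: send a reference only, no payload.
            delta.append((index, digest, None))
        else:
            # New data: compress it before it leaves the guest.
            delta.append((index, digest, zlib.compress(block)))
            known_to_server.add(digest)
    return delta, new_hashes


known: Set[str] = set()
first = b"a" * BLOCK_SIZE + b"b" * BLOCK_SIZE + b"c" * BLOCK_SIZE
delta1, hashes1 = build_delta(first, [], known)  # first run: every block is sent

second = b"a" * BLOCK_SIZE + b"c" * BLOCK_SIZE + b"d" * BLOCK_SIZE
delta2, hashes2 = build_delta(second, hashes1, known)

for index, digest, payload in delta2:
    kind = "reference only" if payload is None else f"{len(payload)} compressed bytes"
    print(index, kind)
```

In the second run, block 0 is skipped entirely, block 1 goes out as a reference because the server already holds identical data, and only block 2 carries a (compressed) payload. The point is that the change tracking lives in the agent, so it does not depend on the hypervisor reporting changed blocks correctly.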
Now since all new write I/O is written to the Nutanix SSD tier, it is very likely that all changes will still be in the SSD tier when a daily incremental backup is started, meaning the deltas will be quickly read and sent over the network. Why does this solve the problems of 3-Tier I discussed earlier? Well, it’s thanks to data locality and the fact Nutanix XCP is a highly distributed platform.
Because each Nutanix node has a local storage controller with local SSD, AND critically, Data Locality writes new data to the node where the VM is running, most data (under normal situations) will be read locally (without traversing a NIC/HBA or the storage network). This means there is no impact on other nodes from the backup of VMs on each node.
Due to these factors, the only traffic traversing the IP network to the backup server (Commvault Media Agent in this example) is the delta changes in a compressed and deduplicated format.
So a Commvault agent based backup solution on Nutanix XCP, on any hypervisor, avoids the dependency on hypervisor APIs (which have proven in several cases not to be reliable) and ensures backup windows and the impact of backup jobs are minimal, thanks to intelligent incremental forever style backups running on an intelligent distributed storage fabric.
In-Guest agent based backups may just be making a comeback!
Note: In my experience, agent based backups typically provide more granularity/flexibility compared to VADP backups. For specifics, speak with your preferred backup vendor.
Oh BTW, did I mention Nutanix XCP supports Commvault IntelliSnap for storage level snapshots on the Distributed Storage Fabric… again just another option for Nutanix customers wanting to avoid further pain with VADP.