Recently, there have been discussions on social media and in blogs about using hypervisor-level snapshots on virtual machines hosting business critical applications like Microsoft Exchange. Some of the confusion stems from a statement from Microsoft documented here. The emphasis is mine. http://technet.microsoft.com/en-us/library/aa996719.aspx
“Some hypervisors include features for taking snapshots of virtual machines. Virtual machine snapshots capture the state of a virtual machine while it’s running. This feature enables you to take multiple snapshots of a virtual machine and then revert the virtual machine to any of the previous states by applying a snapshot to the virtual machine. However, virtual machine snapshots aren’t application aware, and using them can have unintended and unexpected consequences for a server application that maintains state data, such as Exchange. As a result, making virtual machine snapshots of an Exchange guest virtual machine isn’t supported.”
I also got a few questions on this during the VMware User Group (VMUG) conference in Minneapolis while I was talking about strategies to bring business critical applications into vSphere environments. I wanted to use this blog to clarify what the statement above really means for a data protection strategy for business critical workloads on vSphere.
First, let us define a few snapshot operations to avoid confusion. As we are talking about data protection, let us focus on how virtual machine disk files are impacted by a VM snapshot operation, using the VMware vSphere VM snapshot as an example.
When a snapshot exists and an application on the virtual machine writes data to disk, that data is written to a set of redo-log files. Newly saved data continues to accumulate in the redo-log files until you take an action that affects the snapshot. The possible actions that we need to discuss are:
Delete the snapshot - When you delete the snapshot, the changes accumulated in the redo-log files are written permanently to the base disks, i.e. VMDK files. Thus the VMDK files become ‘current’.
Revert to the snapshot - When you revert to the snapshot, the contents of the redo-log files are discarded. Now the virtual machine ‘rolls back’ to the point in time when the snapshot was created.
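The difference between the two operations can be sketched with a toy model. This is only an illustration of the behavior described above, not VMware's on-disk format or API; the class name SnapshottedDisk and its methods are hypothetical, and a "disk" here is just a dictionary of block numbers.

```python
class SnapshottedDisk:
    """Toy model of a base disk (VMDK) plus a redo log. Hypothetical, for illustration."""

    def __init__(self, blocks):
        self.base = dict(blocks)  # block number -> data on the base disk
        self.redo = None          # None means no snapshot exists

    def take_snapshot(self):
        # From now on, writes accumulate in the redo log, not the base disk.
        self.redo = {}

    def write(self, block, data):
        target = self.base if self.redo is None else self.redo
        target[block] = data

    def read(self, block):
        # The redo log holds the newest data, so it is consulted first.
        if self.redo is not None and block in self.redo:
            return self.redo[block]
        return self.base.get(block)

    def delete_snapshot(self):
        # Delete: accumulated changes are written permanently to the base disk.
        if self.redo is None:
            raise RuntimeError("no snapshot exists")
        self.base.update(self.redo)
        self.redo = None

    def revert_to_snapshot(self):
        # Revert: the contents of the redo log are discarded.
        if self.redo is None:
            raise RuntimeError("no snapshot exists")
        self.redo = {}


disk = SnapshottedDisk({0: "mailbox-v1"})
disk.take_snapshot()
disk.write(0, "mailbox-v2")
disk.revert_to_snapshot()
print(disk.read(0))  # the write made after the snapshot is gone
```

In this model, deleting the snapshot keeps every write the application made, while reverting silently throws those writes away. The second case is what can hurt an application that maintains state data.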
First of all, if your backup application uses the Revert to the snapshot operation, then that solution is unsupported by Microsoft, as I have shown with the emphasis in the statement quoted above.
Backup Exec 2012, NetBackup 7.5 and the NetBackup 5220 appliance are examples of data protection solutions that do not use the Revert to the snapshot operation and hence are not impacted by that part of the statement.
Now we need to address the fact that VM snapshots are not application aware, which Microsoft frowns upon on account of genuine concerns.
VM snapshots (aka hypervisor snapshots) are indeed application unaware. Some level of awareness can be achieved by using a VSS provider that works with the VSS writers of the application, but this can be quite cumbersome in large environments. As we are talking about business critical applications (the lifeblood of the organization), such sub-par solutions carry risks, and hence Microsoft released such a statement.
Symantec solves this problem by providing agent-assisted backups in Backup Exec 2012, NetBackup 7.5 and NetBackup 5220 appliances. An agent sitting in the VM discovers and quiesces the application as if it were an agent-based backup. This brings the full-fledged application awareness and consistency needed for business critical workloads. Then a VM snapshot is created using VMware APIs for Data Protection (VADP). After that, the application is released from its state of quiescence. The VM data is copied using VADP transports. Thus, Symantec provides the best of both worlds when it comes to protecting mission critical applications on VMware vSphere: an agent is used for application discovery and quiescence, thereby meeting Microsoft’s requirements, and then an agentless data movement (backup) is performed through VADP!
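The ordering of those steps can be sketched as follows. This is a minimal sketch, not Symantec's or VMware's actual API: the function agent_assisted_backup, the Recorder class and every method name are hypothetical stand-ins that only log which step ran. The final snapshot deletion is an assumption inferred from the point above that these products do not use the Revert to the snapshot operation.

```python
class Recorder:
    """Hypothetical stand-in for an agent or hypervisor; it only logs step names."""

    def __init__(self, log):
        self.log = log

    def __getattr__(self, name):
        # Any method called on the stand-in is recorded by name.
        def step(*args):
            self.log.append(name)
            return name
        return step


def agent_assisted_backup(agent, hypervisor, vm):
    agent.quiesce(vm)                          # in-guest agent quiesces the application
    try:
        snap = hypervisor.create_snapshot(vm)  # VM snapshot taken while quiesced
    finally:
        agent.release(vm)                      # application resumes even if the snapshot fails
    try:
        hypervisor.copy_data(vm, snap)         # agentless data movement from the snapshot
    finally:
        hypervisor.delete_snapshot(vm, snap)   # assumption: delete (commit), never revert


log = []
agent_assisted_backup(Recorder(log), Recorder(log), "exchange-vm")
print(log)
```

The point of the ordering is that the application is held in a quiesced state only for as long as it takes to create the snapshot; the long-running copy happens afterwards, without the agent moving any data.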
This agent-assisted backup is currently available from Symantec for Microsoft Exchange, Microsoft SQL Server and Microsoft SharePoint. In addition to providing support for these business critical workloads, you also get any-level recovery for these applications from a single backup. For example, if you are using Backup Exec 2012, NetBackup 7.5 or a NetBackup 5220 appliance to protect a virtualized Microsoft Exchange environment on vSphere, you get the following from a single backup:
1. Recover entire virtual machine
2. Recover individual files
3. Recover specific database availability groups or information stores
4. Recover specific mailboxes or mailbox items