Virtualization is considered by Gartner to be a "megatrend" for 2008, and VMware software is making its way into data centers of every size. Increasingly, organizations are tapping this virtual infrastructure software to increase the efficiency and cost-effectiveness of their IT operations. Gartner also sees virtual technologies improving IT resource utilization and increasing the flexibility needed to adapt to changing requirements and workloads.
But the use of virtual technologies also raises new data protection issues. For example, what are the best solutions for protecting virtual machines? Is "on-host" backup to be preferred to "off-host"?
This article looks at different backup configurations in some detail. It also explores the importance of data deduplication in meeting today’s evolving backup and recovery requirements.
Data protection is an area where virtualization has radically changed the backup paradigm. Standard backup technologies that have been used for years do not translate well into the virtual world.
Basically, there are three backup types associated with VMware: 1) backing up via a client inside each virtual machine; 2) backing up via a client inside the VMware Service Console; and 3) an off-host technology based on VMware Consolidated Backup. Let’s consider each configuration.
- Backup client inside each virtual machine: In spite of the virtualization technologies involved, virtual machines (VMs) are complete operating system installations hosted on virtualized hardware. These installations can be backed up using the same basic techniques as their physical counterparts – with a backup client inside the guest OS. Running a client inside the VM is supported, and standard OS support rules apply. Backing up a VM in this way is essentially like backing up a physical machine.
- Backup client inside the VMware Service Console: This could be considered an off-host backup technology in the sense that no backup software is installed inside the virtual machine. Installing the client inside the Service Console gives direct access to the files that make up the VMs – the vmdk files. This method is easiest to implement if the virtual machines are powered off. In this state, the virtual machines are static and unchanging. If the VMs are powered on, additional pre-backup processing or scripting is recommended to ensure that the VMs are in a consistent state during backup operations. Implementing this involves using the host ESX Server's built-in snapshot functionality.
- Off-host using snapshot client: With the introduction of VMware Virtual Infrastructure 3 (VI3), off-host backups of virtual machines are now possible. Frequent backups with minimal impact on the host ESX Server are the Holy Grail of ESX Server backups. The enabling technology for this is VMware Consolidated Backup (VCB). Part of VI3, VCB orchestrates the virtual machine snapshot and its transfer to the off-host client. The advantage of off-host backups is that the impact of backup processing on the ESX Server and hosted virtual machines is significantly reduced, which allows for more frequent backups. VCB provides the ability to back up the underlying virtual machine vmdk files.
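The snapshot step that both the Service Console and off-host configurations rely on for powered-on VMs follows a simple order: take a snapshot, copy the disk files, then remove the snapshot. The Python sketch below illustrates only that ordering. The names `backup_running_vm`, `vm_snapshot`, and the three callables passed in are hypothetical stand-ins, not VMware or VCB interfaces; a real implementation would call whatever snapshot and copy tools the environment provides.

```python
from contextlib import contextmanager


@contextmanager
def vm_snapshot(create_snapshot, remove_snapshot, vm_name):
    """Hold a snapshot of vm_name for the duration of the with-block."""
    # Hypothetical callable: freezes a consistent point-in-time image.
    snapshot_id = create_snapshot(vm_name)
    try:
        yield snapshot_id
    finally:
        # Always remove the snapshot, even if the copy step fails.
        remove_snapshot(vm_name, snapshot_id)


def backup_running_vm(vm_name, create_snapshot, copy_disk_files, remove_snapshot):
    """Back up a powered-on VM: snapshot, copy the disk files, remove the snapshot."""
    with vm_snapshot(create_snapshot, remove_snapshot, vm_name) as snapshot_id:
        return copy_disk_files(vm_name, snapshot_id)


# Usage with stand-in functions that only record the order of operations.
events = []


def fake_create(vm_name):
    events.append("create")
    return "snap-1"


def fake_copy(vm_name, snapshot_id):
    events.append("copy")
    return [vm_name + ".vmdk"]


def fake_remove(vm_name, snapshot_id):
    events.append("remove")


copied = backup_running_vm("vm01", fake_create, fake_copy, fake_remove)
print(events)  # ['create', 'copy', 'remove']
```

Wrapping the snapshot in a context manager is one way to guarantee that a failed copy does not leave a stale snapshot behind on the host.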
Some backup solutions either back up only at the vmdk (entire virtual machine) level or require two backup passes to support both single-file restores and vmdk restores. Generally speaking, most organizations will want to keep their options open when something goes wrong. For example, if a virtual machine is infected with a virus or inadvertently damaged due to user error, a single-file restore is of little use. The entire virtual machine needs to be restored. But if a user needs to recover a single deleted file (the most common type of restore operation), restoring the entire virtual machine is overkill and requires downtime.
The solution, of course, is to make either type of restore possible – single file or vmdk (entire virtual machine) – while retaining the performance advantages of an off-host backup and a single backup pass.
For many enterprises today, continuous data growth rates of 30% to 50% per year are placing a terrific strain on the backup process. In most cases, available network bandwidth just can't keep pace with data growth. Meeting backup windows continues to be a challenge, not only in the data center but beyond it as well.
Given such an environment, it's no surprise that enterprises are increasingly exploring disk-based backup solutions that reduce the size of backups (and the network bandwidth required to perform them) by using data deduplication technology. Data deduplication technology allows customers to cost-effectively substitute disk for tape-based backups.
In general, data deduplication involves looking for redundant instances of backup data at a sub-file or block level across all backup data and all locations, thereby allowing companies to reduce the amount of storage needed for backups. Data deduplication technology can also enhance disaster recovery by reducing the bandwidth needed to transmit large volumes of data between different sites.
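To make the block-level idea concrete, here is a minimal Python sketch of a deduplicating store. It is not any vendor's implementation: the fixed 4 KB block size, the use of SHA-256 fingerprints, and the `DedupStore` class are all assumptions chosen for illustration. Each unique block is stored once, and every backup is recorded as an ordered list of block fingerprints.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size in bytes


class DedupStore:
    """Illustrative block-level deduplicating backup store (hypothetical)."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}   # fingerprint -> block bytes, each unique block kept once
        self.recipes = {}  # backup name -> ordered list of fingerprints

    def backup(self, name, data):
        """Store data and return the number of bytes newly written."""
        new_bytes = 0
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            if fingerprint not in self.blocks:
                # Only blocks not seen before consume storage.
                self.blocks[fingerprint] = block
                new_bytes += len(block)
            recipe.append(fingerprint)
        self.recipes[name] = recipe
        return new_bytes

    def restore(self, name):
        """Rebuild the original data from its list of fingerprints."""
        return b"".join(self.blocks[fp] for fp in self.recipes[name])


# Two VM images that share the same eight "operating system" blocks
# and differ only in one block of application data.
shared_os = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
vm1_image = shared_os + b"x" * BLOCK_SIZE
vm2_image = shared_os + b"y" * BLOCK_SIZE

store = DedupStore()
first = store.backup("vm1", vm1_image)   # all nine blocks are new
second = store.backup("vm2", vm2_image)  # only the differing block is new
print(first, second)  # 36864 4096
```

Virtual machines built from the same guest OS image tend to share many identical blocks, which is why the second backup in this sketch writes only a fraction of the data.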
Symantec estimates that data deduplication deployed across the data center, virtual environment, and remote offices can reduce network bandwidth required for daily full backups by up to 500 times, and reduce total storage consumed from backups by 10 to 50 times.
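As a rough illustration of what those ratios would mean in practice, the Python sketch below applies them to made-up figures. The 1,000 GB daily full backup and the 30,000 GB of retained backups are hypothetical inputs, and "up to 500 times" is a best-case bound, so actual results depend on the data.

```python
def reduced_size(size_gb, reduction_factor):
    """Return the size remaining after an N-times reduction."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return size_gb / reduction_factor


daily_full_gb = 1000.0         # hypothetical daily full backup
retained_backups_gb = 30000.0  # hypothetical total of retained backups

# Best-case network transfer for the daily full backup (500x reduction).
best_case_transfer_gb = reduced_size(daily_full_gb, 500)

# Range of storage consumed at the 10x and 50x ends of the estimate.
storage_at_10x_gb = reduced_size(retained_backups_gb, 10)
storage_at_50x_gb = reduced_size(retained_backups_gb, 50)

print(best_case_transfer_gb)                  # 2.0
print(storage_at_10x_gb, storage_at_50x_gb)   # 3000.0 600.0
```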
VMware virtual infrastructure software is being used by enterprises large and small to increase the efficiency and cost-effectiveness of their IT operations. But as innovative as virtual machine technology is, it also introduces new data protection issues. Organizations embracing VMware virtualization technologies are strongly encouraged to consider solutions designed to provide enhanced backup and restore functionality specifically for VMware environments. At the same time, they should explore data deduplication technology for use in the virtual environment, data center, and remote offices.