Confident Insights Newsletter

Data Deduplication: the Smart Choice in Virtual Environments

March 12, 2009


Now more than ever, IT organizations need to look for ways to control costs, improve productivity, and limit exposure. In this tough economic climate, optimizing backup and recovery operations is a crucial element for survival.
It may be hard to believe, but there was a time not that long ago when administrators could routinely shut down critical systems to perform backup operations.

How the world has changed.
Today, explosive data growth rates of 50% or more per year are forcing all companies, regardless of size, to rethink their backup processes. In addition, critical business data is expected to be available 24x7. And evolving regulatory mandates are driving ever more stringent requirements about how businesses back up and restore their data.
At the same time, companies are increasingly turning to virtualization. Virtualizing the data center brings many benefits, including higher server utilization, administrative efficiencies, increased server mobility, and easier disaster recovery.
But virtualization also introduces new backup challenges. The proliferation of virtual servers can make backup configuration more time-consuming, virtual machine “sprawl” increases storage requirements, and someone has to administer and manage all those virtual machines.
This article looks at some of the key challenges posed by server virtualization implementations, and how disk-based data deduplication offers a superior virtual server backup solution.

Virtualization sweeping through data centers

There’s no denying that virtualization has become a pervasive technology that is sweeping through data centers of all sizes. According to a December 2007 Enterprise Strategy Group (ESG) survey of current and planned server virtualization users, 81% of current server virtualization users were running production workloads and 46% were running mission-critical applications.
But as ESG recently observed, “protecting virtual machines and all of their contents needs special consideration as the mobility of virtual machines across physical resource pools makes them difficult to track. As it is not bound to physical resources, the virtual machine has the ability to traverse the virtualized infrastructure.”
In addition, using the traditional approach to backing up a physical machine in a virtualized environment places significant strain on I/O and the local host CPU. Backup is one of the more bandwidth- and resource-intensive processes in the data center. Having backup jobs running simultaneously on multiple virtual machines occupying the same physical host could have a disastrous effect on performance.

Protecting the virtualized enterprise

When considering which backup approach is best for the virtual environment, it’s important to understand that there is not a one-size-fits-all solution. Often a hybrid approach, leveraging multiple techniques, makes sense. As ESG noted, “the decision-making criteria boil down to the workload, the criticality of data, the recoverable state of the environment, and application consistency.”
A growing number of organizations are embracing disk-based backup solutions to improve backup performance, eliminate tape media management issues, and improve the speed and reliability of recovery operations. Because the same data is backed up over and over again, data deduplication can drastically reduce the volume of data transferred and stored.
Data deduplication is a key disk-based technology that enables companies to eliminate duplicate backup data and significantly decrease their storage (and in some cases their bandwidth) consumption. According to a recent study by the Taneja Group, data deduplication can lead to data reduction ratios of 20:1 or more over time with no data loss.
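To make the idea concrete, here is a minimal sketch of chunk-level deduplication in Python. It assumes fixed-size chunks and SHA-256 fingerprints; the `DedupStore` class and its methods are hypothetical names chosen for illustration and do not describe how any particular product works.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: each unique chunk is kept only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size  # assumed fixed chunk size, in bytes
        self.chunks = {}              # fingerprint -> chunk bytes
        self.logical_bytes = 0        # total bytes submitted for backup

    def backup(self, data):
        """Store data and return its recipe (ordered list of fingerprints)."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            # Only a chunk that has never been seen consumes new space.
            self.chunks.setdefault(fp, chunk)
            recipe.append(fp)
        self.logical_bytes += len(data)
        return recipe

    def restore(self, recipe):
        """Rebuild the original bytes from a recipe; nothing is lost."""
        return b"".join(self.chunks[fp] for fp in recipe)

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())

    def ratio(self):
        """Logical bytes backed up divided by bytes actually stored."""
        stored = self.stored_bytes()
        return self.logical_bytes / stored if stored else 1.0


store = DedupStore(chunk_size=4)
monday = store.backup(b"AAAABBBBCCCC")
tuesday = store.backup(b"AAAABBBBDDDD")  # only the last chunk is new
print(store.stored_bytes(), store.ratio())
```

In this toy run, two 12-byte backups occupy 16 bytes of storage, and both can still be restored in full from their recipes. The ratio grows as more backups of largely unchanged data accumulate.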
Veritas NetBackup PureDisk uses data deduplication and integration with Veritas NetBackup to enable storage-optimized data protection for data center, remote office, and virtual environments. PureDisk deduplication technology results in only new unique data being backed up. This significantly decreases the I/O associated with backups and the amount of backup network traffic generated. The deduplication engine can be deployed within NetBackup or independently using a PureDisk client.
Other benefits include:
  • Increased ROI of disk. Organizations can leverage PureDisk deduplication to dramatically reduce the space consumed on standard disk, allowing months of backups to be kept online.
  • Reduced disaster recovery cost. Organizations can cost-efficiently replicate data to another data center, saving on tape shipping and storage.
  • Enhanced high availability. In PureDisk environments with multiple servers, online failover can be executed in minutes if the server that controls PureDisk experiences an outage.
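The reduction in backup network traffic described above comes from sending only data the backup server does not already hold. The following Python sketch illustrates that general idea under stated assumptions: the `BackupServer` class, the `client_backup` function, and the fingerprint exchange are hypothetical and are not a description of the PureDisk protocol.

```python
import hashlib


def fingerprint(chunk):
    """SHA-256 digest used as the chunk's identity (an assumption)."""
    return hashlib.sha256(chunk).hexdigest()


def split(data, chunk_size):
    """Cut data into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class BackupServer:
    """Hypothetical server that remembers every chunk it has received."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def missing(self, fingerprints):
        """Return the fingerprints the server does not hold yet."""
        return {fp for fp in fingerprints if fp not in self.chunks}

    def upload(self, chunk):
        self.chunks[fingerprint(chunk)] = chunk


def client_backup(server, data, chunk_size=4096):
    """Back up data and return the number of payload bytes sent."""
    chunks = split(data, chunk_size)
    fps = [fingerprint(chunk) for chunk in chunks]
    needed = server.missing(fps)  # small fingerprint exchange comes first
    sent = 0
    for fp, chunk in zip(fps, chunks):
        if fp in needed:
            server.upload(chunk)
            needed.discard(fp)    # a chunk repeated in this backup is sent once
            sent += len(chunk)
    return sent


server = BackupServer()
first_run = client_backup(server, b"AAAABBBBCCCC", chunk_size=4)
second_run = client_backup(server, b"AAAABBBBDDDD", chunk_size=4)
print(first_run, second_run)
```

The first backup sends all 12 bytes, while the second sends only the 4 bytes that changed. A real deployment would also account for the cost of exchanging fingerprints, which this sketch ignores.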
According to the Enterprise Strategy Group, deduplication in a virtualized environment has “a dramatic effect” on the amount of data that is maintained for backup purposes.
“Virtual machines contain an amazing amount of duplicate data, starting at the operating system. For example, if ten virtual machines are running Windows Server 2003, there is a tremendous amount of duplicate data between them. Why store it multiple times? As environments scale, the amount of duplicate data increases rapidly—anything that can be done to help save on storage costs and improve IT efficiencies deserves special consideration.”
According to recent Symantec research, data deduplication can result in a 10x to 50x reduction in backup storage. (Actual storage and bandwidth reductions may vary based on data type, backup type, whether full, incremental, or differential, and the daily change rate of the data.)
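A rough model shows why the reduction depends so heavily on change rate and retention. The Python sketch below is a simplification under loud assumptions: it considers only repeated full backups of one dataset, treats each backup after the first as adding only the changed fraction as new unique data, and ignores compression and metadata. The function name and the sample inputs are illustrative and are not taken from the research cited above.

```python
def dedup_ratio(full_backups, change_rate_between_fulls):
    """Estimate the logical-to-stored ratio for retained full backups.

    full_backups: number of full backups kept online (at least 1).
    change_rate_between_fulls: fraction of data that is new in each
        backup compared with everything already stored (0.0 to 1.0).
    """
    if full_backups < 1:
        raise ValueError("full_backups must be at least 1")
    if not 0.0 <= change_rate_between_fulls <= 1.0:
        raise ValueError("change rate must be between 0.0 and 1.0")
    logical = full_backups  # measured in units of one full backup
    stored = 1 + (full_backups - 1) * change_rate_between_fulls
    return logical / stored


# Illustrative inputs: one year of weekly full backups.
print(round(dedup_ratio(52, 0.02), 1))  # low change rate
print(round(dedup_ratio(52, 0.10), 1))  # higher change rate
```

Under these assumptions, a year of weekly full backups with 2% change between backups works out to roughly 26:1, while 10% change gives roughly 8.5:1. Keeping fewer backups or backing up faster-changing data lowers the ratio further.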


IT managers and executives are in a tough spot. Cost reduction is a non-negotiable objective this year, while user expectations remain high and demand continues to rise. It’s no surprise, then, that Symantec’s recently released 2008 State of the Data Center report found that a flurry of activity is aimed at server virtualization. Companies want to decrease server “spend,” but they don’t want to disrupt the computing environment.
With more and more mission-critical applications being deployed in virtual environments, companies report they are realizing solid benefits. At the same time, server virtualization adds complexity to the data center, and that’s necessitating a re-evaluation of the way companies back up and restore their data.
Symantec offers enterprises an extensive portfolio of solutions that work in both physical and virtual environments. By leveraging NetBackup PureDisk’s data deduplication as part of their overall backup strategy, companies can significantly reduce the storage and bandwidth consumed by disk-based backups.
