Do you have your hands full dealing with the increasing amounts of data generated by your company? You’re not alone. Both IDC and the Enterprise Strategy Group predict data growth will be in the 50% to 60% range for years to come.
But that’s not the end of the story. As enterprises continue to grow via mergers and acquisitions, data is no longer confined to a single data center. It’s now spread among multiple data centers, remote offices, and even virtual environments.
Given this state of affairs, your organization can’t afford to back up and store unnecessary, redundant information. You need to consider deduplication.
Although data deduplication technology has existed for more than five years, many organizations have yet to take advantage of the operational and storage efficiencies to be gained through deduplication.
This article looks at the solid benefits that accrue when you deduplicate data in remote offices, virtual environments, and the data center.
By necessity, companies operating from different local offices must distribute parts of their IT infrastructure over these remote sites. Local documents, emails, presentations, and so forth are kept on local file servers primarily to improve network performance and to allow rapid recovery in the event of data loss.
But this practice raises several important questions:
- Can administrators guarantee that backups are being performed according to policy or even performed at all?
- Are the backups successful?
- Are skilled resources available to troubleshoot errors?
- Are tapes stored securely, protected from potential harm or loss?
As many studies and onsite audits have observed, the backup of data at a remote site is often executed by non-IT personnel. Sometimes the wrong tape is inserted and there is no one else present to check it. The response to system errors is often incorrect and is not reported to a central IT authority. Worse, when the tape loading process fails, no backup is executed. As the remote employee is often not qualified to verify whether the backup has been successful or not, no one really knows if the tapes contain the correct data.
Symantec NetBackup PureDisk helps eliminate these bandwidth and tape-related issues by combining disk-based backup with data deduplication. PureDisk works at the source to eliminate data redundancy before it traverses the network and enters the data center.
Using a fingerprinting technology called global unique file identification, NetBackup PureDisk distinguishes unique files from redundant copies across the enterprise. Because redundant data is neither transmitted nor stored, significant savings in storage capacity and network traffic can be achieved.
For example, backing up the same 2 MB Word file from file servers in three remote offices would use 6 MB of capacity with conventional approaches. NetBackup PureDisk, however, stores only a single copy and consequently needs no more than 2 MB of storage capacity, a savings of about 67%. Comparable results are seen for throughput: with roughly two-thirds less data to transmit, the backup across the three remote sites would complete correspondingly faster.
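The arithmetic in this example can be reproduced with a minimal sketch of file-level deduplication. This is not PureDisk's implementation: the `DedupStore` class, its method names, and the choice of SHA-256 as the fingerprint are all assumptions made for illustration only.

```python
import hashlib


class DedupStore:
    """Hypothetical content-addressed store: one stored copy per unique fingerprint."""

    def __init__(self):
        self.blobs = {}    # fingerprint -> file content (stored once)
        self.catalog = []  # (site, path, fingerprint) for every file backed up

    def backup(self, site, path, content):
        """Record the file; store its bytes only if the fingerprint is new."""
        fingerprint = hashlib.sha256(content).hexdigest()  # assumed fingerprint function
        is_new = fingerprint not in self.blobs
        if is_new:
            self.blobs[fingerprint] = content
        self.catalog.append((site, path, fingerprint))
        return is_new

    def stored_bytes(self):
        """Capacity actually consumed after deduplication."""
        return sum(len(blob) for blob in self.blobs.values())

    def logical_bytes(self):
        """Capacity a conventional backup would consume (every copy kept)."""
        return sum(len(self.blobs[fp]) for _, _, fp in self.catalog)


store = DedupStore()
document = b"x" * (2 * 1024 * 1024)  # stand-in for the 2 MB Word file
for site in ("office-a", "office-b", "office-c"):  # hypothetical site names
    store.backup(site, "report.doc", document)

savings = 1 - store.stored_bytes() / store.logical_bytes()
print(f"logical: {store.logical_bytes()} bytes, stored: {store.stored_bytes()} bytes")
print(f"savings: {savings:.1%}")
```

The catalog still lists all three files, so each office can restore its own copy even though the bytes are held only once.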
While many IT organizations have benefited from virtualization technology, particularly by simplifying server management and reducing operating costs, they have also encountered some new challenges.
For example, when running servers in a virtual environment, it’s extremely easy to create new virtual machines. In fact, it’s so easy that you may run the risk of “VM sprawl,” which increases management costs. Growth in the number of virtual machines also means that more storage gets used. And if many backups run at once on the VMware server, there’s a risk of overloading it, since it’s already running at high utilization.
NetBackup speeds the backup and recovery of information stored on virtual machines. Symantec’s patent-pending Granular Recovery Technology enables NetBackup to use a single-pass image backup of the entire guest virtual machine to store backup information once and recover anything, including the entire virtual machine, individual virtual disk files, and individual files and folders inside virtual disk files.
The challenges posed by rapidly growing VMware environments can be overcome even further by integrating technology within the backup process that performs data deduplication. PureDisk protects virtual machines by reducing the size of the backup data across virtual machines. A PureDisk agent can also be put inside an individual virtual machine, reducing the data at the source before it is sent over the network.
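To make the idea of reducing data at the source concrete, here is a small sketch of an agent that sends a fingerprint first and transfers the file only if the server has not seen it. The `BackupServer` class, the `agent_backup` function, and the SHA-256 fingerprint are hypothetical names and choices for illustration; they do not describe the actual PureDisk agent protocol.

```python
import hashlib


class BackupServer:
    """Hypothetical backup target that tracks which fingerprints it already holds."""

    def __init__(self):
        self.known = set()
        self.bytes_received = 0

    def has(self, fingerprint):
        return fingerprint in self.known

    def upload(self, fingerprint, content):
        self.known.add(fingerprint)
        self.bytes_received += len(content)


def agent_backup(server, files):
    """Back up a dict of path -> bytes; return the payload bytes actually sent."""
    sent = 0
    for content in files.values():
        fingerprint = hashlib.sha256(content).hexdigest()  # assumed fingerprint function
        if not server.has(fingerprint):  # only unknown data crosses the network
            server.upload(fingerprint, content)
            sent += len(content)
    return sent


# Two hypothetical virtual machines that share the same operating system file.
os_file = b"A" * 1000
vm1 = {"/os/kernel": os_file, "/data/db": b"B" * 500}
vm2 = {"/os/kernel": os_file, "/data/logs": b"C" * 300}

server = BackupServer()
sent_vm1 = agent_backup(server, vm1)
sent_vm2 = agent_backup(server, vm2)
print(f"vm1 sent {sent_vm1} bytes, vm2 sent {sent_vm2} bytes")
```

In this sketch the second virtual machine skips the shared operating system file entirely, which is where the savings across similar virtual machines would come from.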
In the data center, the biggest challenge that backup admins encounter is the enormous amount of data that needs to be backed up daily within a backup window that stays constant or keeps shrinking. In addition, increasingly aggressive RTOs (Recovery Time Objectives) and SLAs (Service Level Agreements) mean that data needs to be recovered with minimal downtime.
Given these realities, tape backups are unlikely to meet the needs of most enterprises. That’s why a vast majority of companies are exploring some form of disk backups in their environment.
When data centers use NetBackup with PureDisk deduplication, they use their disk capacity more efficiently by eliminating redundant data and allowing more versions of the data to be retained on disk for longer periods of time. That’s essential for supporting stringent RTOs and RPOs (Recovery Point Objectives).
Let’s take a closer look at the ways in which deduplication improves the backup process:
- First, it reduces storage consumption for backup and archiving applications. With deduplication, Symantec has seen storage space cut by a factor of 10 or more. Symantec has also seen customers able to cost-effectively store 30 days of data on disk in the data center for quick recovery.
- Second, it helps to reduce bandwidth and replication costs, which makes site consolidation and disaster recovery more practical. Deduplication can cut backup time by a factor of 10, and so help ease backup window pain. Deduplication is considered “next-generation incremental” because after the first full backup, it captures only incremental changes. It can also help to reduce LAN/WAN traffic because it sends only the changed data, at preconfigured, scheduled intervals.
- And third, as a disk-based technology, deduplication helps to reduce reliance on tapes, making backups more reliable. By leveraging backup image replication between major data centers, you can also replace tape shipping costs.
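A rough back-of-the-envelope model shows how a reduction of this order can arise when one full backup is followed by changed data only. The figures below (a 1,000 GB full backup, a 7% daily change rate, and a conventional scheme that keeps a full copy every day) are illustrative assumptions, not Symantec measurements, and the function name is hypothetical.

```python
def retained_capacity(full_size_gb, daily_change_rate, days):
    """Compare disk needed to retain `days` daily backups.

    Conventional: assumes a full copy is kept for every day.
    Deduplicated: one full copy, then only each day's changed data.
    """
    conventional = full_size_gb * days
    deduplicated = full_size_gb + full_size_gb * daily_change_rate * (days - 1)
    return conventional, deduplicated


# Illustrative inputs only: 1,000 GB full backup, 7% daily change, 30 days retained.
conventional, deduplicated = retained_capacity(1000, 0.07, 30)
ratio = conventional / deduplicated
print(f"conventional: {conventional:.0f} GB, deduplicated: {deduplicated:.0f} GB")
print(f"reduction factor: {ratio:.1f}x")
```

Under these assumptions the reduction comes out close to a factor of 10. Real ratios depend heavily on how much data changes each day and on how the existing backup schedule is designed.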
Continual data growth. Shrinking recovery times. Flat or declining IT staff counts and budgets. Regardless of the size of your organization, these issues have moved to the forefront of corporate concerns. Perhaps it’s no surprise, then, that data deduplication has become such a hot topic. By eliminating duplicate backup data and significantly decreasing storage and bandwidth consumption, data deduplication enables enterprises to effectively gain control of rapid data growth.