In the previous blog in this series (see links below) on Power of NetBackup Deduplication, we talked about two special powers of NetBackup deduplication, viz. how dedupe processing can be distributed and how backups are securely streamed. Now let us talk about two more exciting differentiators.
Application-aware deduplication
The technology in NetBackup Appliances for data reduction is NetBackup Deduplication. Unlike third-party solutions, which treat all backup streams the same way and spend excessive processing effort identifying duplicate data, NetBackup Deduplication understands the backup streams. It uses the normalized stream to identify the data type, detect file boundaries, and deduplicate with less resource overhead. For example, a backup stream from a NetApp filer arriving in ufsdump (NDMP backup) format is recognized and handled by a deduplication stream handler that processes the file objects in the stream individually rather than treating the stream as one giant blob of data. This increases the efficiency and scalability of the deduplication process.
Let us explain this with an analogy offered by one of the distinguished engineers at Symantec. The way NetBackup Deduplication handles various data streams can be compared to how maps are used for navigation from point A to point B. Because the stream is generated by an application-aware NetBackup component (e.g. NetBackup for NDMP for NDMP data streams, NetBackup for VMware for VMware backups), the map is already available to identify the objects and segment them to look for unique pieces of data. Third-party end-point deduplication solutions use the bulldozer method. All they see is a blob of data that needs to be broken down and massaged multiple times at different segment sizes to improve deduplication performance. Think of it as building new roads each time rather than following existing roads with a map. The bulldozer method requires a considerable amount of resources, which are normally supplied by increasing the processing power and memory of the target appliance.
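To make the map versus bulldozer contrast concrete, here is a minimal Python sketch. It is not NetBackup code. The archive layout, the function names and the tiny segment size are all invented for illustration, and the "blind" approach is reduced to plain fixed-size blocks, which is simpler than what real products do. The point is only that a handler which knows where files begin and end keeps finding the same segments after a small new file shifts everything else in the stream.

```python
import hashlib

SEGMENT_SIZE = 8  # unrealistically small, chosen only to keep the demo readable


def fingerprint(data: bytes) -> str:
    """Identify a segment by its SHA-256 digest."""
    return hashlib.sha256(data).hexdigest()


def build_stream(files):
    """Serialize files into a toy (hypothetical) archive format:
    1-byte name length, name, 4-byte content length, content."""
    out = bytearray()
    for name, content in files:
        encoded = name.encode()
        out += len(encoded).to_bytes(1, "big") + encoded
        out += len(content).to_bytes(4, "big") + content
    return bytes(out)


def blind_segments(stream: bytes, size: int = SEGMENT_SIZE):
    """Bulldozer: cut the whole stream into fixed-size blocks, headers and all."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def aware_segments(stream: bytes, size: int = SEGMENT_SIZE):
    """Map: parse the format, then segment each file's content on its own."""
    segments = []
    pos = 0
    while pos < len(stream):
        name_len = stream[pos]
        pos += 1 + name_len  # skip the name-length byte and the name
        content_len = int.from_bytes(stream[pos:pos + 4], "big")
        pos += 4
        content = stream[pos:pos + content_len]
        pos += content_len
        segments.extend(content[i:i + size] for i in range(0, content_len, size))
    return segments


def new_segment_count(first: bytes, second: bytes, segmenter) -> int:
    """Count segments of the second backup not already seen in the first."""
    known = {fingerprint(s) for s in segmenter(first)}
    return sum(1 for s in segmenter(second) if fingerprint(s) not in known)


files = [("a.txt", bytes(range(0, 64))), ("b.txt", bytes(range(64, 128)))]
first_backup = build_stream(files)
# The second backup adds one small file ahead of the unchanged ones.
second_backup = build_stream([("n", b"xyz")] + files)

aware_new = new_segment_count(first_backup, second_backup, aware_segments)
blind_new = new_segment_count(first_backup, second_backup, blind_segments)
print("new segments, format-aware:", aware_new)
print("new segments, blind blocks:", blind_new)
```

In this toy run the format-aware segmenter stores only the one segment belonging to the new file, while the blind fixed-size segmenter sees every block as changed because all offsets have shifted.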
Deduplicate globally across physical and virtual machines
As stated earlier, NetBackup Deduplication is application-aware. Whether duplicate pieces of data are sitting on multiple standalone physical hosts, on NAS filers, in virtual machine disk files in VMware, or in virtual hard drives in Hyper-V, NetBackup uses the appropriate stream handlers to identify file objects and segments during deduplication. The result is data reduction across physical and virtual machines.
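The idea of one deduplication pool shared by all sources can be sketched in a few lines of Python. This is an illustrative toy, not how NetBackup stores data. The `DedupPool` class, its methods and the sample segments are assumptions made up for the example, and it presumes a stream handler has already split each backup into segments.

```python
import hashlib


class DedupPool:
    """Toy global pool: one fingerprint index shared by every backup source."""

    def __init__(self):
        self.segments = {}  # fingerprint -> segment bytes, stored once
        self.catalog = {}   # backup id -> ordered list of fingerprints

    def backup(self, backup_id, segments):
        """Record a backup and return how many segments were new to the pool."""
        new = 0
        refs = []
        for seg in segments:
            fp = hashlib.sha256(seg).hexdigest()
            if fp not in self.segments:
                self.segments[fp] = seg
                new += 1
            refs.append(fp)
        self.catalog[backup_id] = refs
        return new

    def restore(self, backup_id):
        """Rebuild a backup's segments from the shared pool."""
        return [self.segments[fp] for fp in self.catalog[backup_id]]


pool = DedupPool()
# Hypothetical segments that a physical host and a virtual machine have in common.
os_files = [b"kernel image", b"libc", b"shell"]

physical_new = pool.backup("physical-host", os_files + [b"db data"])
vm_new = pool.backup("vmware-guest", os_files + [b"web logs"])

print("stored for physical host:", physical_new)  # 4
print("stored for VMware guest:", vm_new)         # 1
```

Because both backups consult the same index, the virtual machine's copy of the shared segments costs no additional storage, and each backup can still be restored in full from its catalog entry.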
Blogs in this series
The power of NetBackup Deduplication: Application awareness and global deduplication (this blog)