Netting Out NetBackup

Nuts and bolts in NetBackup for VMware: Avoiding CBT penalty with NetBackup Accelerator

Better Backup for a Virtual World is here!
Created: 20 Jan 2014 • Updated: 22 Jan 2014 • 17 comments
By AbdulRasheed

Let us start our technical deep dive with NetBackup Accelerator for VMware. Most of you already know that NetBackup Accelerator is designed to provide full backups for the cost of an incremental backup. Here, cost stands for the backup window, backup storage, client CPU, client memory, client disk I/O, network bandwidth and so on required for running backups. NetBackup Accelerator was first introduced in NetBackup 7.5 for file system workloads and became an instant hit. Answers to frequently asked questions on NetBackup Accelerator for file systems can be found here.

In NetBackup 7.6, backup acceleration comes to both VMware vSphere and VMware vCloud Director environments (more on VMware vCloud Director support in a later blog in this series). NetBackup Accelerator for VMware is a combination of three powerful technologies.

NetBackup-Accelerator-for-VMware.png

  1. Changed Block Tracking in VMware vStorage APIs for Data Protection: The changed block tracking (CBT) capability in vSphere lets data protection applications like NetBackup retrieve the disk blocks in a virtual machine disk file (VMDK file) that have changed since a previous pre-assigned point in time. NetBackup has supported CBT for incremental backups ever since VMware introduced that capability in vSphere 4.0. NetBackup Accelerator makes use of CBT to detect changed blocks.
  2. Symantec V-Ray: V-Ray is to a virtual machine what an X-ray is to a physical object. It is a patented Symantec technology that enables NetBackup to see into virtual machine disk files without mounting them or secretly pushing obscure agent binaries into the virtual machine. NetBackup Accelerator for VMware uses this capability to enumerate the underlying files and application objects in the changed blocks. It addresses one of the weaknesses of changed block tracking: detecting deleted files.
  3. NetBackup Optimized Synthetics: NetBackup Optimized Synthetics creates full backup images without reading and shipping the entire data set from source to backup storage. The changed blocks detected through VMware CBT are passed through the Symantec V-Ray lens for optimization and then injected into Symantec’s Optimized Synthetics capable deduplication engine to create a full recovery point.

 

Just to demonstrate the power of NetBackup Accelerator, let me walk you through the process with an example. Imagine that you have a production Microsoft SQL Server database in a Microsoft Windows virtual machine. Let us say that you have three VMDKs in this virtual machine.

  • VMDK 1: This is C:\ with operating system and application binaries installed.  100GB capacity. 60GB is occupied.
  • VMDK 2: This is where the Microsoft SQL Server data files are present. 500GB capacity. 250GB occupied.
  • VMDK 3: Microsoft SQL Server logs are located here. 100GB capacity. 20GB occupied.

During the first backup, VMware CBT has nothing to offer. All blocks from the VMDK files must be processed. Thanks to Symantec V-Ray, NetBackup sees into the data stream, so whitespace is neither read nor moved. The total capacity protected is 700GB (100+500+100), but NetBackup eliminates the whitespace at the source and hence just 330GB (60+250+20) needs to be processed. This processing efficiency is achieved even before the deduplication engine kicks in! The 330GB of data is deduplicated, and data blocks with unique fingerprints are shipped to deduplication storage. The deduplication savings depend on whether data with the same fingerprints is already in the deduplication pool.
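The arithmetic for the first backup can be checked with a short sketch. The dictionary layout and the function name are made up for illustration; only the capacities and occupied sizes come from the example above.

```python
# Hypothetical model of the three VMDKs from the example (all sizes in GB).
vmdks = {
    "VMDK 1 (OS and binaries)": {"capacity": 100, "used": 60},
    "VMDK 2 (SQL data files)": {"capacity": 500, "used": 250},
    "VMDK 3 (SQL logs)": {"capacity": 100, "used": 20},
}


def first_backup_totals(disks):
    """Return (protected capacity, data processed, whitespace skipped) in GB."""
    capacity = sum(d["capacity"] for d in disks.values())
    used = sum(d["used"] for d in disks.values())
    return capacity, used, capacity - used


protected_gb, processed_gb, skipped_gb = first_backup_totals(vmdks)
print(f"protected={protected_gb}GB processed={processed_gb}GB skipped={skipped_gb}GB")
```

In this toy model, more than half of the protected capacity is whitespace that never has to be read, and that is before any deduplication savings.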

Now let us induce changes to the virtual machine.

Imagine that the MS SQL Server is hosting both OLTP and data mining workloads. Because of the OLTP workloads, the data files changed by 10%. The SQL DBA also got a request to create a new 40GB data file to be used as scratch space for data mining operations. The data mining job finished and the DBA was asked to delete the scratch space.

Now is the time to run backup again. 

VMware CBT has been logging changes to all VMDK files since the previous backup. All the changes since the previous backup are returned to the backup application. For VMDK 2, the changes will be 10% of 250GB plus the 40GB scratch file from the data mining load. That totals 65GB of changes.

Because of Symantec V-Ray, something interesting happens here that is unique in the industry. VMware CBT reports the 40GB worth of changes from the scratch file, and all backup vendors would religiously back up these blocks. However, those blocks won’t be recoverable because they belong to a deleted file with no inode associated with the data blocks. Reading and moving that unnecessary data is a wasted cost. I call this the ‘CBT penalty’. If CBT were not in the picture, you would not be wasting energy moving unnecessary data. With Symantec V-Ray (from an earlier blog in this series, you may already know that Veritas Mapping Services, VxMS, is the secret sauce behind NetBackup’s implementation of Symantec V-Ray), NetBackup Accelerator for VMware can detect deleted blocks by evaluating the file system inode map during this phase, and hence those 40GB of data are neither read nor shipped! Only the 25GB of useful changed blocks are read and injected into the deduplication engine.
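To make the CBT penalty concrete, here is a toy sketch of the filtering idea. It is not how VxMS is implemented: blocks are modeled as 1GB units in Python sets, the function name is hypothetical, and the "allocation map" is a stand-in for whatever file system metadata tells you which blocks still belong to a live file.

```python
def useful_changed_blocks(cbt_changed, allocated):
    """Keep only the changed blocks that still belong to a live file."""
    return cbt_changed & allocated


# Toy model of VMDK 2, where one block stands for 1GB.
oltp_changes = set(range(0, 25))     # 10% of the 250GB of data files changed
scratch_file = set(range(250, 290))  # 40GB scratch file, created and then deleted
cbt_changed = oltp_changes | scratch_file

# After the scratch file is deleted, only the original data files are allocated.
allocated = set(range(0, 250))

to_read = useful_changed_blocks(cbt_changed, allocated)
cbt_penalty = cbt_changed - allocated

print(f"CBT reports {len(cbt_changed)}GB changed")
print(f"read and ship {len(to_read)}GB, skip {len(cbt_penalty)}GB of deleted data")
```

A backup that trusts CBT alone would move all 65GB; intersecting the CBT result with the live allocation map leaves the 25GB that can actually be recovered.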

The deduplication engine deduplicates the changed blocks, and if their fingerprints are unique, the blocks are shipped to storage. The full backup image is created inline with references to the fingerprints of both the changed blocks and the previously stored blocks.
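The idea of building a full image out of fingerprint references can be sketched in a few lines. This is only an illustration under stated assumptions: the `DedupPool` class, the block-offset dictionaries and the use of SHA-256 are all hypothetical and say nothing about NetBackup's actual deduplication engine or fingerprint algorithm.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    # SHA-256 is an assumption for this sketch, not the real engine's algorithm.
    return hashlib.sha256(data).hexdigest()


class DedupPool:
    """Toy deduplication store: one copy per unique fingerprint."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> block data
        self.shipped = 0   # number of blocks actually sent to storage

    def store(self, data: bytes) -> str:
        fp = fingerprint(data)
        if fp not in self.blocks:  # only unique blocks are shipped
            self.blocks[fp] = data
            self.shipped += 1
        return fp


def synthesize_full(previous_image, changed_blocks, pool):
    """Build a new full image: old references plus references to changed blocks."""
    new_image = dict(previous_image)
    for offset, data in changed_blocks.items():
        new_image[offset] = pool.store(data)
    return new_image


def restore(image, pool):
    """Reassemble the disk contents by following the fingerprint references."""
    return b"".join(pool.blocks[image[offset]] for offset in sorted(image))


pool = DedupPool()
first = {i: pool.store(b) for i, b in enumerate([b"boot", b"data-v1", b"logs"])}
second = synthesize_full(first, {1: b"data-v2"}, pool)

print(pool.shipped)            # 3 blocks for the first backup, 1 for the second
print(restore(second, pool))
```

Both `first` and `second` are complete recovery points, yet the second backup shipped only the one block that changed.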

Now imagine a highly virtualized vSphere data center or vCloud environment where these types of operations are constantly taking place. VMware CBT helps significantly in detecting the changed blocks, but that is just the first step in optimizing the resources needed to perform backups. In combination with Symantec V-Ray and NetBackup Optimized Synthetics, you are now able to serve those workloads with demanding RPOs and RTOs without incurring the CBT penalty.

Related blogs: Nuts and bolts in NetBackup for VMware: What is new in NetBackup 7.6?

Earlier blogs in Nuts and bolts series

Comments (17)

AbdulRasheed:

Updated to include the missing link, frequently asked questions on NetBackup Accelerator (for file systems). Thanks Nicolai for pointing out the missing link!

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

BTLOMS:

Can 7.6 take snapshot of VMs and back them up using an exclude list? This would enable us to exclude large databases, temp files, page files, etc.

AbdulRasheed:

Hi BTLOMS, 

NetBackup for VMware does not use a client inside the VM, so a traditional exclude list will not work. But your requirements can be met in a better way!

Just enable the Exclude Swapping and Paging Files option in the policy. That automatically excludes data from swap areas (Linux) and page files (Windows). When you attempt restores from a parent folder where these files sit, these files will be restored as empty files. In other words, you don't lose the location of these files when you recover!

In the Advanced Options for the policy, you can fine-tune virtual disk selection. That will take care of your other requirements for large data files and temp files.

Typically, as a best practice, you are likely to keep boot and data disks separate. Just select the Exclude data disks option and the backup will protect just the VM and operating system files.

Note: Enterprise applications like Microsoft Exchange and Microsoft SQL Server can be protected with a single-pass image-level backup with NetBackup. In those cases, you don't need to run separate database agent based backups inside the VM.

  

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

AGray:

Great article.  Question:  Would one have to enable CBT on each VM first via vSphere or does NBU handle that on the backend in order to take advantage of CBT?

Thanks.

 

AbdulRasheed:

Note that we have supported CBT since it was made available in vStorage APIs for Data Protection (VADP) with the release of vSphere 4.0, starting with NetBackup 7.0. NetBackup will enable CBT automatically if you have enabled block level incremental backups (since 7.0) or NetBackup Accelerator for VMware (since NetBackup 7.6).

 

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

thevmwareguy:

NetBackup will do that for you automatically.

NBmartena:

 

We only use Hyper-V Server 2012 R2. Do I still need to use FlashBackup as the fastest way to back up my guests?

Or is acceleration for Hyper-V around the corner as well?

I believe acceleration is NOT a solution if you need to duplicate incrementals to tape, correct?

 

 

AbdulRasheed:

Your policy type can be Hyper-V (recommended) or FlashBackup. In either case you are getting the fastest image based backup for Hyper-V virtual machines.

I am not allowed to share specifics on future releases, but I can say that next major release is packed with features for Hyper-V :) 

It is possible to create incremental backups even when Accelerator is used. Any backup image can be duplicated to tape, including Accelerator based backups. Once created, there is no functional difference between an Accelerator backup and a non-Accelerator backup.

Unlike Accelerator for file systems, there is one difference in the way incrementals are created for virtual machines using NetBackup Accelerator for VMware. In order to make all sorts of recovery possible (full, file, application and application item level recoveries) from any backup type, NetBackup creates a virtual full image on the deduplication storage even during an incremental backup. This is not a problem, as no additional storage is consumed while the image sits on deduplication storage. However, when you copy these images to tape they get rehydrated and hence may consume more tapes.

 

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

Morten Seeberg:

Great post Abdul, few questions:

1. Just to make sure I really understand this. When running NBD backups, the deleted blocks, swap space and white-space, are never transferred to the backup host (i.e. you are somehow able to identify these blocks before transmitting them, and then tell the ESX to never ship those blocks), or do you transfer the entire VMDK to the backup host and then the backup host filters them out?

With SAN backup, same question, do you read all blocks or just the needed ones?

2. The same question applies to the "exclude data disk" and "exclude system disk" features: at which phase is data excluded, at the backup host level, or is it never sent from the ESX/read from SAN?

Did you restore something today?

AbdulRasheed:

Irrespective of the transport type, this is what NetBackup does for a given virtual machine.

  • Open all VMDK files for the given VM, resolve the volume layout
  • Read file system metadata using V-Ray/VxMS to figure out what can be excluded (whitespace, swap space, deleted files etc.) 
  • Now read the blocks really needed for the backup

In other words, NetBackup NEVER sends anything that is not needed to be part of the optimized backup image. This optimization is applicable for all transport methods. This is true whether you are backing up the entire virtual machine or a part of it using advanced options like "exclude system disk". 
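The three steps above can be sketched as a small pipeline. Everything here is a hypothetical model for illustration: the `Block` type, the `kind` labels and the `skip_disks` parameter are not NetBackup or VxMS interfaces, they only mirror the order of operations described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    disk: str
    offset: int
    kind: str  # toy labels: "file", "free", "swap" or "deleted"


EXCLUDABLE = {"free", "swap", "deleted"}


def resolve_layout(vm):
    """Step 1: open every VMDK of the VM and flatten it into one block list."""
    return [block for disk in vm.values() for block in disk]


def blocks_to_exclude(blocks):
    """Step 2: use file system metadata to find blocks the backup does not need."""
    return {b for b in blocks if b.kind in EXCLUDABLE}


def blocks_to_read(vm, skip_disks=()):
    """Step 3: read only the blocks that survive the exclusions."""
    layout = [b for b in resolve_layout(vm) if b.disk not in skip_disks]
    excluded = blocks_to_exclude(layout)
    return [b for b in layout if b not in excluded]


vm = {
    "boot": [Block("boot", 0, "file"), Block("boot", 1, "swap"), Block("boot", 2, "free")],
    "data": [Block("data", 0, "file"), Block("data", 1, "deleted"), Block("data", 2, "file")],
}

print(blocks_to_read(vm))                        # whole VM
print(blocks_to_read(vm, skip_disks=("boot",)))  # something like "exclude system disk"
```

Because the exclusion set is computed before any data block is read, the result does not depend on which transport later moves the blocks.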

 

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

Morten Seeberg:

AHA, great. That also explains why VM backups of large file servers often seem to "hang a bit" during the first 5% of the backup, because it's processing the files.

I see now also why it's quite difficult to implement a more "specific" exclude list than boot or data disk, because that would mean NBU would have to map every single file to a block for the exclude to work.

Quick comment: I think it's a bit confusing that the activity monitor counts KB up to the full size of the VM whenever an incremental backup is run. It should only show the actual amount of data transferred (which it knows of course, it's in the details under the optimization info line).

Did you restore something today?

AbdulRasheed:

You are correct on all three observations. Regarding the Accelerated incremental backup, the reality is that NetBackup is creating a full virtual image on the backend. It is designed this way so that you can do all kinds of recoveries (full, file, application and application item level recoveries) from ANY backup. In fact, you can also do an instant recovery from an incremental backup! That is the reason we show the actual 'protected bytes' and then show the optimization to convey that we didn't move all the data.

Let me know if you still feel that we should only show the 'incremental bytes'. From SLA perspective, do you prefer to see how much NetBackup protected or transferred? We have been getting mixed responses from customer interviews. 

I am on PTO. My responses may be delayed. 

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

Morten Seeberg:

Well, the "Kilobytes" column has always shown "KB transferred", so suddenly changing the meaning of the column for just one type of backup doesn't make sense to me. I consider the "Kilobytes" column to be the actual amount of data transferred. There are so many other ways to calculate protected data (nbdeployutil, OpsCenter, Activity Monitor and looking at full backups, Report-Client backups).
That being said, a "protected" column could be useful.....I think.

Maybe you know something I don't :-) but even though NBU is piecing together a full image every time on a block level, VMware Application restores (SQL, Exchange GRT) are still not possible from incremental backups :-)

I also saw your other comment about incrementals actually being full when duplicated. That's important new knowledge, as many customers are still duplicating to tape, but luckily very rarely incrementals.

Did you restore something today?

AbdulRasheed:

Let me get back to you on this, Morten. Just catching up on comments. Sorry about the delay.

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

Morten Seeberg:

So the official statement is there:
http://www.symantec.com/business/support/index?pag...
I checked with bpimage and the incremental image size is registered as the "full backup size".

So I guess this is a choice between "2 evils" :-)

1.
You want to know how much data will be transferred if you duplicate your Accelerator image using non-optimized duplication.
NetBackup reports as the full image size, i.e. you know how much space it will take up on tape when duplicated.

2.
You are interested in knowing how much data was actually transferred (because this is sometimes useful for troubleshooting).
Example: a client's incremental backups suddenly start backing up A LOT of data every time they run an incremental. How do you identify that when all backups are "sized = full"? I guess the "Accelerator Optimization" could be used for this as well, but that only works as long as the job is in the Job DB.
Another example, I just ran into an issue at a client site where they billed customers based on the amount of transferred data, and suddenly incremental backups were full sized => incorrect numbers.

 

Personally I think the confusion happens because (some) people always considered the "Kilobytes" column in Activity Monitor to be Kilobytes transferred, and not the "image" size (two things which, until Accelerator came along, were basically the same thing). I tried finding documentation/definitions for the different columns in the Activity Monitor, but I think some of these columns have never been documented, because before Accelerator, Kilobytes was just.... Kilobytes :-)
(Closest I got was the "COLDEFS entries" definition in the Admin Guide 1).

I guess the only solution to both problems is to introduce another field to the image DB for either "Optimized/Virtual Image size" or "Transferred KB".... or maybe do nothing and I am sure it all works out in the end :-)

Did you restore something today?

AbdulRasheed:

Thank you for digging this up. Sorry about not being able to stay on top of comments. You are absolutely right. NetBackup Accelerator changed the meaning (to be precise, it happened a bit earlier, with virtual synthetics in 6.5.4, but that didn't affect the original backups from the client and hence caused no confusion).

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

Morten Seeberg:

No worries, I am aware your day job is not responding to Connect posts :-) We value your insights and posts here.

Did you restore something today?
