Nuts and Bolts in NetBackup for VMware: Handling Orphaned Snapshots

Created: 27 Oct 2011 • Updated: 22 Jan 2013 • 20 comments
AbdulRasheed's picture

Recently, one of our customers asked me if NetBackup for VMware supports the use of a dedicated data store for snapshots. That triggered this blog.

  Snapshots are great. Among their many uses, NetBackup employs a snapshot to create a consistent point-in-time image of the virtual machine for the purpose of backup. While a snapshot is active, writes to the VMDK files are directed to redo logs. At the end of the backup, the snapshot is released and the redo log is played back into the VMDK.

  The world is less than ideal. What happens if a backup ends prematurely and the snapshot is left behind? Now the redo log grows. What if such situations arise frequently? Now you have multiple redo logs growing in the data store. There are two major issues here.

  1. The storage space on the data store gets used up quickly. If the data store fills up, all the VMs using that data store are affected.
  2. The more snapshots you have for the same virtual machine, the worse the VM performance.

There is a tendency to mitigate issue 1 by dedicating a separate data store just for storing snapshots. However, that solution has shortcomings:

  • That solution does not address issue 2 listed above: VM performance degrades as more and more orphaned snapshots stay in the data store.
  • You have just created a bottleneck in the infrastructure. The number of VMs you can back up concurrently is now limited by the I/O performance of the snapshot data store.
  • There is some cost involved in creating and managing this snapshot LUN. You would need to monitor this LUN for orphaned snapshots and clean it up manually.

  How does NetBackup for VMware mitigate this nightmare? It uses a simple yet powerful strategy: it gives you the flexibility to control those orphaned snapshots. It is documented in the NetBackup for VMware System Administrator’s guide under the title "Existing snapshot handling".

  NetBackup for VMware empowers you to decide what needs to be done if an existing snapshot (orphaned snapshot) is found for a VM. You can let NetBackup abort the backup job for that VM so that the problem is not escalated further. Or you can use the Remove NBU option, which automatically removes the orphaned snapshot if it was originally created by a NetBackup backup job.
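
  The two choices described above can be sketched as a small decision function. This is only an illustration of the logic, not NetBackup code: the names ExistingSnapshotPolicy, Snapshot and plan_backup are hypothetical, the "NBU" name prefix used to recognize NetBackup-created snapshots is an assumption, and so is the choice to leave non-NetBackup snapshots untouched under Remove NBU.

```python
from dataclasses import dataclass
from enum import Enum


class ExistingSnapshotPolicy(Enum):
    """Hypothetical stand-in for the 'Existing snapshot handling' setting."""
    ABORT = "abort"            # stop the job if the VM already has a snapshot
    REMOVE_NBU = "remove_nbu"  # clean up leftover NetBackup snapshots first


@dataclass
class Snapshot:
    name: str


def is_netbackup_snapshot(snap):
    # Assumption: snapshots created by a NetBackup job carry an "NBU" name prefix.
    return snap.name.startswith("NBU")


def plan_backup(snapshots, policy):
    """Return (proceed, snapshots_to_remove) for a single VM."""
    if not snapshots:
        # Nothing left behind: back up as usual.
        return True, []
    if policy is ExistingSnapshotPolicy.ABORT:
        # Do not stack a new snapshot on top of an existing one.
        return False, []
    # REMOVE_NBU: only snapshots that look like NetBackup's own are removed.
    # Assumption: snapshots from other sources are left alone.
    to_remove = [s for s in snapshots if is_netbackup_snapshot(s)]
    return True, to_remove
```

  For example, a VM carrying one leftover "NBU..." snapshot and one administrator-created snapshot would, under REMOVE_NBU in this sketch, have only the first one scheduled for removal before the new backup starts.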

  The strength of this approach is that you are resolving issues 1 and 2 listed above proactively. Furthermore, you eliminate all the shortcomings of the use case where a dedicated snapshot data store is used.

  Do you still have a business use case where you want to use a dedicated snapshot data store? If yes, please post it as an idea with the business use case. Let the community members vote. We are listening!


Comments (20)

captain jack sparrow's picture

Hi Abdul,

 

It seems reading your series of NetBackup blogs has become an addiction for NetBackup lovers. May I request that you shed some light on the SharePoint/Exchange backup process flow? I searched around but couldn't find precise information like yours (for VMware).

Thanx in advance.

 Cheers !!!

CJS

 

AbdulRasheed's picture

Hi Vaibhav,

   I appreciate your feedback. Thanks much.

   An Exchange/SharePoint deep dive is coming soon. I spoke to the product manager, and he is happy to provide this insight. Stay tuned.

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

Matt_Rademacher's picture

The problem here is that I have not seen the remove snapshot tab actually resolve most situations I run into. I have found many ghost snapshots left behind, where the VM is left running on the snapshot but vCenter and NetBackup are unable to see an active snapshot. In this case, most VM backups fail on the next run, and we have to remediate the individual VMs from the VMware side. I will agree that things are a lot better than they used to be, but I would not say the issues are resolved by these settings in the policy.

I do not see how a dedicated snapshot LUN would help with any issue other than space. As long as space utilization is controlled across the clusters, this should not be needed.

AbdulRasheed's picture

Hi Matt,

  In situations where the vSphere client itself cannot see an active snapshot when logged into vCenter, NetBackup will have the same problem. Note that VADP uses a similar client framework to obtain snapshot information, just like a traditional vSphere client. I also consulted our developers on your situation. They haven't seen anything like this recently in the labs and are wondering if there are any environment-specific issues we have not accounted for. If you continue to see this problem, could you please open a Support case?

  Thank you for the feedback.

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

Scott G's picture

As Matt pointed out, space is a key issue. And he qualified it with "as long as space utilization is controlled". That's a very big "if"; it assumes a well-maintained and monitored environment. I work in an environment where they cannot stand to see any space not consumed, and run the datastores very close to full. We have several servers that are many TB in size, which even with the best performance still take many hours to back up. They are heavily used servers (file servers), and the snapshots can grow very large in a short amount of time. I have warned about leaving room for snapshot growth, but it is only thought about when datastores fill up and something breaks.

Having a separate place to grow a snapshot would alleviate THE major reason I cannot yet deploy VMware backups.  We started to switch over, and the moment we started running the backups, everyone flipped out because of the snapshots (despite the repeated explanations of how it would work).  So, at this point I cannot deploy without major changes to our datastores.  And we are dealing with 100s of datastores and 1000s of servers, so it's not a simple undertaking.

AbdulRasheed's picture

Hi Scott,

   You bring up a valid point. We cannot always expect all storage administrators to do the 'right thing'. Do you mind posting this idea at https://www-secure.symantec.com/connect/backup-and...

   This will help us gauge how early such a feature is required in NetBackup for VMware. We are actively monitoring the ideas portal for input. I will vote for your suggestion.

  Thanks for the feedback.

 

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

Scott G's picture

Absolutely.  I am reading through there now.  In the past, I've been sending our ideas in through our Rep.  I hadn't found that section of the forum before. I'll start using this forum now.  We have some other ideas to send up the chain as well.  :)

Scott G's picture

Someone else already has the request; I voted for it, and I'll put my comments in as well:  https://www-secure.symantec.com/connect/ideas/way-redirect-netbackup-vmdk-snapshots-different-datastore

Warrren Hulley's picture

 

@Abdul

Hi Abdul, we constantly have snapshots left behind in a couple of customer environments. Most of the time we are able to delete these manually; however, quite often we are unable to delete them and have to rely on the VMware team to either shut the machine down or log in to the ESX console to identify the task and kill it manually.

I currently have two cases open with Symantec support around snapshots being left behind. Does Symantec recognize that there are serious issues with the integration between VMware and Netbackup around snapshots?

 

Thanks

AbdulRasheed's picture

Hi Warrren,

   You bring up a very valid point. If the snapshot deletion requires shutting down the VM itself, most likely the problem is more than just the API (VADP). Any snapshot that requires a 'stun' cycle from the VMware administrator is an indication that the VMkernel itself has issues. I would also recommend involving VMware support if this happens often. Do you mind sending me an e-mail with the case numbers you have with Symantec?

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

victor.mds's picture

Hi Abdul,

We're experiencing the same issue in many environments I work with: the snapshots are not correctly deleted, and the VMDK file still resides in the datastore even though the snapshot manager shows no snapshots. Although the Activity Monitor shows that the backups end successfully, the bpfis logs show that the snapshot deletion fails. I agree it could be a problem with the VMware API, but as NetBackup is supposed to be integrated with it, what troubleshooting steps does Symantec recommend?

Anyhow, has anyone resolved this issue?

Thanks and regards

AbdulRasheed's picture

Hi Victor, 

   We definitely owe you an explanation as to why snapshot deletion is failing after the backup. The bpfis log at a high verbosity level could provide clues. Do you have a support case on this?

   Assuming that you have the 'Remove NBU' snapshot attribute enabled (I am making this assumption as this blog was about it; correct me if that is not the case), do the snapshots get cleaned up during the next backup run? This operation will also be seen in the bpfis log associated with the next backup.

   Also, do you see any snapshot delete activity reported in vSphere Client for the VM in question?

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

victor.mds's picture

Hi Abdul,

We work with the Abort option in Existing Snapshot Handling. When we implemented this infrastructure, the option to remove the snapshots didn't work correctly in the tests we ran, because sometimes the undeleted snapshot appears as "consolidated_helper" instead of with the expected name (starting with "NBU").

Moreover, since in some cases the snapshots are not shown in the snapshot manager, the API doesn't report any snapshot either.

The option remove snapshot works fine when the "remove snapshot" command doesn't actually reach the virtual center, but when the command fails (I agree it's due to VMware problems), the Activity Monitor gives you no indication that there has been a problem and that the VMware snapshot was not correctly deleted.

We don't have any open case right now.

From my point of view, Symantec would have to work harder with VMware to handle these circumstances.

Thanks

AbdulRasheed's picture

Sorry, I didn't see this until now. I shall send your feedback to engineering team to see if this can be recreated. 

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

qak's picture

Hi Abdul,

I have been working on a case for the same issue discussed in this forum. Snapshots left behind by the backup grow to the point that they fill up the data store. We have the Remove NBU option set, but it is only tried on the next run (the next day in our case), and by that time the damage has already been done. I am aware that the issue is VMware not being able to delete the snapshot. But it would be a good idea to get some sort of error code in the Activity Monitor indicating that an orphaned snapshot exists even though the backup has completed. Is there any way for NBU to try deleting the snapshot multiple times instead of only once? Is there any overall development in addressing this shortcoming from the Symantec or VMware side?

Regards,

Karim

AbdulRasheed's picture

Hi Karim, 

  I am checking on this. If I remember right, we do try 5 times before giving up. If we cannot delete on the fifth time, we attempt to clean up during the next backup. I do agree that we should propagate an error in snapshot handling. Let me get back to you tomorrow. 

 

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

AbdulRasheed's picture

 

Hi Karim,

   I am confirming that NetBackup does try 5 times before giving up. As you would agree, if we cannot delete it even after 5 attempts, the chances are quite high that something is wrong at the data store or ESX level and the vSphere administrator may need to intervene.

   I have some good news for you. You are correct in stating that NetBackup will report a status 0 if the backup is successful. This is the right approach, as the backup did indeed succeed and is valid. For the snapshot deletion problem, NetBackup 7.5 will post an event in vSphere for the VM in question so that the vSphere administrator will see this from the vSphere client. We want to do it this way so that corrective action can be taken immediately by the vSphere administrator. The vSphere administrators can define alarms based on these events.
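
   As a rough illustration of the behavior described above (a bounded number of delete attempts, then an event for the administrator, with the backup's own status left untouched), here is a minimal Python sketch. It is not NetBackup code: release_snapshot, delete_snapshot and post_event are hypothetical names, the two callables stand in for whatever actually talks to vSphere, and any delay between attempts is unknown and therefore omitted.

```python
MAX_DELETE_ATTEMPTS = 5  # number of attempts stated above


def release_snapshot(delete_snapshot, post_event, max_attempts=MAX_DELETE_ATTEMPTS):
    """Try to delete the backup snapshot a bounded number of times.

    delete_snapshot: hypothetical callable that raises on failure.
    post_event: hypothetical callable that records an event for the VM.
    Returns True if the snapshot was deleted, False if every attempt failed.
    The backup job's own status is not changed here.
    """
    last_error = None
    for _attempt in range(max_attempts):
        try:
            delete_snapshot()
            return True
        except Exception as exc:  # broad on purpose: this is only a sketch
            last_error = exc
    # All attempts failed: surface the problem without failing the backup.
    post_event(
        f"snapshot removal failed after {max_attempts} attempts: {last_error}"
    )
    return False
```

   In this sketch a False return only means that cleanup is deferred to the next backup run (for example through the Remove NBU handling) or to manual intervention; the completed backup is still reported as successful.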

   I cannot reveal the future roadmap, but I can tell you that there is something exciting coming up on this front in connection with VMware vSphere specific events related to NetBackup backups!

 

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127

ChAmp35's picture

Hi
We are experiencing a problem with VM backups. In the policy, Existing snapshot handling is set to Remove NBU, but on a few machines it is still not deleting the snapshot, so the number of snapshots keeps increasing. Every time, we have to delete these snapshots manually. It is degrading the performance of the VMs and also creating a risk of VM crashes. Please advise.

Beagless's picture

Hi Abdul

Is there any documentation on the improvements in 7.5 for VMware snapshots?

 

thanks

P

AbdulRasheed's picture

Hi P, 

    In addition to looking at NetBackup for VMware guide for 7.5, I would also recommend looking at NetBackup for Exchange/SQL Server/SharePoint guides if you would like to use the application protection and recovery from VM backups.

   Have you looked at this webcast? http://www.symantec.com/offer?a_id=137770 I did this a couple of weeks ago. The second half of this webcast is a live demo. Although the demo is for NetBackup Appliance, the features I discuss are also applicable to NetBackup 7.5.

  I do plan to post a blog or two on 7.5 based features when time permits. 

Warm regards,

Abdul "Rasheed" Rasheed

Tweet me @AbdulRasheed127
