Video Screencast Help

VMware backups failing on Netbackup 7.5 (status code 6)

Created: 09 Aug 2012 • Updated: 10 Aug 2012 | 4 comments
This issue has been solved. See solution.

We are running Netbackup 7.5.0.3.  The backups to the vmware environment were running fine up until a couple days ago.

Now they are getting a status code 6 failure.  We're able to run a manual snapshot in vCenter.  Changing the transfter type from SAN to NBD works, but is obviously slow.  Leaving the transfer type on SAN gives us a failure.  DNS is working normal, we even added the vcenter, ESXi and media servers into all of the host files just to be sure.  No changes have been made to the zoning or vmware environment.  We've rebooted all the servers and did the normal troubleshooting, just a little stumped.

Here is the log from the most recent failure.

 

8/9/2012 11:21:24 AM - Info bpbrm(pid=2208) server1.nciwin.local is the host to backup data from     
8/9/2012 11:21:24 AM - Info bpbrm(pid=2208) reading file list from client        
8/9/2012 11:21:24 AM - Info bpbrm(pid=2208) starting bpbkar32 on client         
8/9/2012 11:21:24 AM - Info nbjm(pid=6620) starting backup job (jobid=108033) for client server1.nciwin.local, policy vmware_02, schedule VMware_Diff  
8/9/2012 11:21:24 AM - estimated 1515922 Kbytes needed
8/9/2012 11:21:24 AM - Info nbjm(pid=6620) started backup (backupid=server1.nciwin.local_1344529284) job for client server1.nciwin.local, policy vmware_02, schedule VMware_Diff on storage unit su_dd890
8/9/2012 11:21:25 AM - Info bpbkar32(pid=4020) Backup started           
8/9/2012 11:21:25 AM - Info bpbkar32(pid=4020) CONTINUE BACKUP received.          
8/9/2012 11:21:25 AM - Info bptm(pid=976) start            
8/9/2012 11:21:25 AM - Info bptm(pid=976) using 262144 data buffer size        
8/9/2012 11:21:25 AM - Info bptm(pid=976) setting receive network buffer to 1049600 bytes      
8/9/2012 11:21:25 AM - Info bptm(pid=976) using 30 data buffers         
8/9/2012 11:21:25 AM - started process bpbrm (2208)
8/9/2012 11:21:26 AM - connecting
8/9/2012 11:21:26 AM - connected; connect time: 00:00:00
8/9/2012 11:21:28 AM - Info bptm(pid=976) start backup           
8/9/2012 11:21:29 AM - Error bpbrm(pid=2208) from client server1.nciwin.local: ERR - Error opening the snapshot disks using given transport mode: Status 23
8/9/2012 11:21:30 AM - Info bpbkar32(pid=4020) bpbkar waited 0 times for empty buffer, delayed 0 times.   
8/9/2012 11:21:31 AM - begin writing
8/9/2012 11:21:46 AM - Error bpbrm(pid=2208) could not send server status message       
8/9/2012 11:21:46 AM - Critical bpbrm(pid=2208) unexpected termination of client server1.nciwin.local        
8/9/2012 11:21:48 AM - end writing; write time: 00:00:17
the backup failed to back up the requested files(6)

Comments 4 CommentsJump to latest comment

CRZ's picture

This probably won't help:

VMware Transport Modes: Best practices and troubleshooting
 http://symantec.com/docs/TECH183072

  • If you are using SAN, please make sure that the datastore LUNs are accessible to VMware Backup Host.

Do you happen to have VxMS logs around this failure?  Can you share them with us?

Enabling VxMS logging for NetBackup for VMware
 http://symantec.com/docs/HOWTO70905

We have A LOT of TechNotes around your issue in the knowledge base, but we'd need to look at your logs to narrow it down.


bit.ly/76LBN | APPLBN | 75LBN

policeman51's picture

Here are the recent logs.

Thanks for your help guys.

AttachmentSize
VxMS-3732-080912_provider.zip 88.39 KB
bpbkar.zip 86.89 KB
bpbrm.zip 58.08 KB
bpfis.zip 428.37 KB
CRZ's picture

From your VxMS log, I feel like this may be the key:

   17:10:12.0057 : g_vixInterfaceLogger:bvix.cpp:1833 <INFO> : diskLibPlugin: 2012-08-09T12:10:12.057-05:00 [05596 verbose 'Default'] Trying to open new LUN LVID:4fe0a334-ac230e9a-fd98-0025b5010a0f/4fe0a334-92374e9d-0c4a-0025b5010a0f/1.
-->

   17:10:12.0057 : g_vixInterfaceLogger:bvix.cpp:1833 <INFO> : diskLibPlugin: 2012-08-09T12:10:12.057-05:00 [05596 trivia 'Default'] Attempting to open LVID:4fe0a334-ac230e9a-fd98-0025b5010a0f/4fe0a334-92374e9d-0c4a-0025b5010a0f/1.

   17:10:12.0057 : g_vixInterfaceLogger:bvix.cpp:1833 <INFO> : diskLibPlugin: 2012-08-09T12:10:12.057-05:00 [05596 error 'Default'] No path to device LVID:4fe0a334-ac230e9a-fd98-0025b5010a0f/4fe0a334-92374e9d-0c4a-0025b5010a0f/1 found.

   17:10:12.0057 : g_vixInterfaceLogger:bvix.cpp:1833 <INFO> : diskLibPlugin: 2012-08-09T12:10:12.057-05:00 [05596 error 'Default'] Failed to open new LUN LVID:4fe0a334-ac230e9a-fd98-0025b5010a0f/4fe0a334-92374e9d-0c4a-0025b5010a0f/1.
-->

   17:10:12.0057 : g_vixInterfaceLogger:bvix.cpp:1833 <INFO> : diskLibPlugin: 2012-08-09T12:10:12.057-05:00 [05596 verbose 'Default'] Cannot access LUN LVID:4fe0a334-ac230e9a-fd98-0025b5010a0f/4fe0a334-92374e9d-0c4a-0025b5010a0f/1, cannot open disk snapshot-45270[CHI1HCR_OS_16A8] NCIHCDBS01/NCIHCDBS01.vmdk. Giving up.

 

...

 

   17:10:12.0057 : g_vixInterfaceLogger:bvix.cpp:1833 <INFO> : diskLibPlugin: 2012-08-09T12:10:12.057-05:00 [05596 error 'Default'] San transport error: Cannot access a SAN/iSCSI LUN backing this virtual disk..

   17:10:12.0057 : g_vixInterfaceLogger:bvix.cpp:1833 <INFO> : DISKLIB-LINK  : "san://snapshot-45270[CHI1HCR_OS_16A8] NCIHCDBS01/NCIHCDBS01.vmdk@chi1vmmgt01:443?nciwin\vbecentr/XXX" : failed to open (You do not have access rights to this file).  

   17:10:12.0057 : g_vixInterfaceLogger:bvix.cpp:1833 <INFO> : DISKLIB-CHAIN : "san://snapshot-45270[CHI1HCR_OS_16A8] NCIHCDBS01/NCIHCDBS01.vmdk@chi1vmmgt01:443?nciwin\vbecentr/XXX" : failed to open (You do not have access rights to this file).

   17:10:12.0057 : g_vixInterfaceLogger:bvix.cpp:1833 <INFO> : DISKLIB-LIB   : Failed to open 'san://snapshot-45270[CHI1HCR_OS_16A8] NCIHCDBS01/NCIHCDBS01.vmdk@chi1vmmgt01:443?nciwin\vbecentr/XXX' with flags 0x1e You do not have access rights to this file (3390).

My uneducated guess is that there must be some kind of path/mapping issue which is preventing the LUNs from being presented to the SAN, but I am an ignorant man speaking in very general terms.

Will this mean something to the SAN admin?  Can you recheck that the SAN is properly mounted on the backup host and that you can get to the LUNs?

(I'll throw this out, too, because it may mean nothing, but it was strange enough that I noticed it - I saw a lot of previous cases with this trouble but they were all 7.1 ones...I didn't see any 7.5 ones.)

You might want to call in a case on this, but your TSE might also diagnose this as SAN trouble and not be any more help than I would - but at least they would actually know what they're talking about when they say it.  ;-)

Good luck!


bit.ly/76LBN | APPLBN | 75LBN

SOLUTION
policeman51's picture

We spoke to our SAN admin and that was it.  The Media servers were not registering correctly with the SAN.  Once we removed the zoning and added it back, they registered correctly and were able to see the Datastores.

thanks.