Multiplexing with VSphere API Policy

Created: 21 Nov 2012 • Updated: 27 Nov 2012 | 17 comments
This issue has been solved. See solution.

I am having issues using multiplexing to MSL drives with my VMWare backups. At present I am backing up approx. 80 VM clients using the NBU VSphere API to my VLS. This means that all allocated drives are used and each individual client backup uses a single virtual cartridge. I want to move the VM backups onto our LTO5 MSL and use multiplexing to stream several clients onto a drive, therefore using fewer tapes.

If I run the backups with media multiplexing set to 4 in the policy (we have 4 LTO5 drives in the MSL) and leave the maximum concurrent write drives at 4 on the storage unit, the backups work fine, with a single VM client backing up through a single drive. However, if I enable multiplexing and set the maximum streams per drive on the storage unit, the backups fail with an error 71. It does not matter how many streams I set on the storage unit; the backups still fail.

Just wondering if this is due to the policy being a VM VSphere API policy?

Thanks in advance

Kev

Mark_Solutions's picture

What version of NetBackup are you using?

There was a bug in 6.5.1 for a similar issue but not with a Status 71

If you are right up to date I would log a call with Symantec as it may be a new issue or they may have an EEB

What buffer numbers and sizes do you use? Just wondering if that may be causing things to go wrong

Could we also see a log: the detailed status of the job and the bpbrm / bptm logs from the media server?

Thanks

Authorised Symantec Consultant

Don't forget to "Mark as Solution" if someones advice has solved your issue - and please bring back the Thumbs Up!!.

Kevin Lamb's picture

Hi Mark,

I am using 7.5.0.4 on all my Master & Media Servers, not sure if this worked in previous 7.x versions as this is the first time I have tried this for the VM clients.

NUMBER_DATA_BUFFERS = 16

SIZE_DATA_BUFFERS = 262144
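For context, here is a rough sketch of how much shared memory these buffer settings imply once multiplexing is enabled. The formula (buffers x buffer size x drives x streams per drive) is the rule of thumb commonly quoted in NetBackup tuning guidance and should be treated as an assumption here, not something confirmed in this thread. The figures of 4 drives and 4 streams per drive come from the original question.

```python
def shared_memory_bytes(number_data_buffers, size_data_buffers, drives, mpx_per_drive):
    """Estimate shared memory used by tape data buffers (rule-of-thumb formula, assumed)."""
    return number_data_buffers * size_data_buffers * drives * mpx_per_drive

# Values posted above: NUMBER_DATA_BUFFERS = 16, SIZE_DATA_BUFFERS = 262144
per_stream = 16 * 262144
total = shared_memory_bytes(16, 262144, drives=4, mpx_per_drive=4)

print(per_stream // 2**20, "MiB per stream")
print(total // 2**20, "MiB across 4 drives with 4 streams each")
```

On a media server with tens of gigabytes of RAM this is a small amount, so under these assumptions the buffer settings alone look unlikely to explain the failure.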

Will reset and capture the data from the bpbrm / bptm logs and post these in a mo.

Kev

Kevin Lamb's picture

Detailed Status:

 

11/22/2012 10:16:24 - Info nbjm (pid=7086) starting backup job (jobid=72357) for client LONBFBCTX06, policy VMWare-System-VMDK, schedule Full-Monthly
11/22/2012 10:16:24 - estimated 17786078 kbytes needed
11/22/2012 10:16:24 - Info nbjm (pid=7086) started backup (backupid=LONBFBCTX06_1353579384) job for client LONBFBCTX06, policy VMWare-System-VMDK, schedule Full-Monthly on storage unit MSL-VMWARE
11/22/2012 10:16:24 - started process bpbrm (pid=7798)
11/22/2012 10:16:25 - Info bpbrm (pid=7798) starting bptm
11/22/2012 10:16:25 - Info bpbrm (pid=7798) Started media manager using bpcd successfully
11/22/2012 10:16:26 - Info bpbrm (pid=7798) LONBFBCTX06 is the host to backup data from
11/22/2012 10:16:26 - Info bpbrm (pid=7798) telling media manager to start backup on client
11/22/2012 10:16:26 - Info bptm (pid=7810) using 262144 data buffer size
11/22/2012 10:16:26 - Info bptm (pid=7810) using 16 data buffers
11/22/2012 10:16:26 - Info bptm (pid=7810) start backup
11/22/2012 10:16:26 - Info bptm (pid=7810) Waiting for mount of media id 2056L5 (copy 1) on server bfbackup.
11/22/2012 10:16:26 - Info bpbrm (pid=7798) spawning a brm child process
11/22/2012 10:16:26 - Info bpbrm (pid=7798) child pid: 7818
11/22/2012 10:16:27 - Info bpbrm (pid=7798) sending bpsched msg: CONNECTING TO CLIENT FOR LONBFBCTX06_1353579384
11/22/2012 10:16:27 - Info bpbrm (pid=7798) start bpbkar on client
11/22/2012 10:16:27 - mounting 2056L5
11/22/2012 10:16:27 - connecting
11/22/2012 10:16:27 - connected; connect time: 0:00:00
11/22/2012 10:16:30 - Info bpbkar (pid=2768) Backup started
11/22/2012 10:16:30 - Info bpbrm (pid=7798) Sending the file list to the client
11/22/2012 10:16:32 - Info bpbrm (pid=7818) from client LONBFBCTX06: TRV - object not found for file system backup: usr:\openv\netbackup\online_util\fi_cntl\bpfis.fim.LONBFBCTX06_1353579346.1.0.NBU_DATA.xml
11/22/2012 10:16:32 - Error bpbrm (pid=7818) could not send server status message
11/22/2012 10:16:32 - Critical bpbrm (pid=7818) unexpected termination of client LONBFBCTX06
11/22/2012 10:16:32 - Info bpbrm (pid=7798) sending message to media manager: STOP BACKUP LONBFBCTX06_1353579384
11/22/2012 10:16:33 - Info bpbrm (pid=7798) media manager for backup id LONBFBCTX06_1353579384 exited with status 150: termination requested by administrator
11/22/2012 10:16:33 - end writing
none of the files in the file list exist  (71)
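When a detailed status runs to dozens of lines, it can help to pull out only the Error and Critical entries. The following is a minimal sketch that assumes the line layout seen in the output above (date, time, severity, process, pid, message); the pattern and the function name `problem_lines` are illustrative and not part of NetBackup.

```python
import re

# Matches lines such as:
# 11/22/2012 10:16:32 - Error bpbrm (pid=7818) could not send server status message
LINE_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2}) - "
    r"(?P<severity>Info|Warning|Error|Critical) (?P<process>\w+) "
    r"\(pid=(?P<pid>\d+)\) (?P<message>.*)$"
)

def problem_lines(text, severities=("Error", "Critical")):
    """Return (time, process, pid, message) for lines with the given severities."""
    hits = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group("severity") in severities:
            hits.append((m.group("time"), m.group("process"),
                         int(m.group("pid")), m.group("message")))
    return hits

sample = """\
11/22/2012 10:16:30 - Info bpbkar (pid=2768) Backup started
11/22/2012 10:16:32 - Error bpbrm (pid=7818) could not send server status message
11/22/2012 10:16:32 - Critical bpbrm (pid=7818) unexpected termination of client LONBFBCTX06
11/22/2012 10:16:33 - end writing
"""

hits = problem_lines(sample)
for hit in hits:
    print(hit)
```

Lines without a severity keyword, such as "end writing", are skipped rather than treated as errors.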
 
I have attached the bpbrm and bptm logs, along with screen captures of the storage unit setup and the policy schedule setup that was run.
 
Kev

Attachments:
VMWare_Problem.docx (133.54 KB)
bpbrm.txt (137.37 KB)
bptm.txt (35.33 KB)

Mark_Solutions's picture

Interesting .. looks like an app crash (anything in the system and application logs?)

But I also see this:

10:22:01.845 [8712] <16> catch_signal: media manager terminated by media mount timeout

and this:

10:22:01.846 [8712] <2> send_MDS_msg: Error from sendIrmMsg, Master bfbackup.ipcmedia.com, type 11, returned error 805

10:22:01.846 [8712] <2> send_MDS_msg: NBJM returned an extended error status: invalid jobid (805)
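Lines like these can be picked out of a long bptm log by filtering on the number in angle brackets. This is a minimal sketch that assumes the layout shown in the excerpts above (time, pid in square brackets, level in angle brackets, function name, message) and assumes that higher numbers such as `<16>` mark errors while `<2>` is debug output. The helper name `at_or_above` is hypothetical.

```python
import re

# Matches lines such as:
# 10:22:01.845 [8712] <16> catch_signal: media manager terminated by media mount timeout
LEGACY_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) \[(?P<pid>\d+)\] <(?P<level>\d+)> "
    r"(?P<function>\w+): (?P<message>.*)$"
)

def at_or_above(lines, min_level=16):
    """Return (time, function, message) for lines whose <level> is >= min_level."""
    hits = []
    for line in lines:
        m = LEGACY_LINE.match(line.strip())
        if m and int(m.group("level")) >= min_level:
            hits.append((m.group("time"), m.group("function"), m.group("message")))
    return hits

bptm_excerpt = [
    "10:22:01.845 [8712] <16> catch_signal: media manager terminated by media mount timeout",
    "10:22:01.846 [8712] <2> send_MDS_msg: NBJM returned an extended error status: invalid jobid (805)",
]

for hit in at_or_above(bptm_excerpt):
    print(hit)
```

Lowering `min_level` to 2 returns both lines, which is useful when the surrounding debug messages are needed for context.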

I am wondering whether your backup host cannot cope with a multiplexed backup of this type (or at least the 4 simultaneous streams), or whether it does crash.

I think the timeout is more of a disconnect than a timeout.

Check the system and application logs on the vmware backup host.

I see it is a Win 2003 32 bit server, how much RAM does it have?

Kevin Lamb's picture

The file that the Detailed Status appears to be failing on does exist in /usr/openv/netbackup/online_util/fi_cntl:

 

bpfis.fim.LONBFBCTX06_1353579346.1.0
bpfis.fim.LONBFBCTX06_1353579346.1.0.changeid.xml
bpfis.fim.LONBFBCTX06_1353579346.1.0.NBU_DATA.xml
bpfis.fim.LONBFBCTX06_1353579346.1.0.VM_ObjInfoXML.xml
 
However, one thing that is different is the slashes: as this is on a Linux master server, all the slashes are / and not \. I don't know if that could be an issue?
 
Kev

Mark_Solutions's picture

The file will have gone after the job failed and most likely the slashes are just a NBU way of writing things

Let me know what you find relating to my earlier post

Kevin Lamb's picture

The vmbackup host (lonbfbnbumedia1) is Win2008 R2 with an Intel Xeon 3.07GHz processor and 24GB RAM; not sure where you got the Win2003 from? The MSL library I am trying to multiplex to is attached to the RHEL master server, which is a DL580 G7 with 120GB RAM.

 

Kev

Mark_Solutions's picture

OK - so vmbackup host is a client passing data to media server (master)

Anything in the system or application event logs for either?

One other thing from the VMWare admin guide is a note about a limit on the number of connections per vCenter being 27 (NFC)

This is on a per-disk basis, so if the four clients being backed up have more than 27 disks between them it could cause a hang, and I guess, as it is supposed to be a multiplexed backup, this could cause it all to fail.

How many disks would you have on the 4 you are backing up?
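The disk-count check described above comes down to simple arithmetic, sketched below. The limit of 27 is the figure quoted in this post and may differ between vSphere versions; the VM names and disk counts are hypothetical.

```python
NFC_CONNECTION_LIMIT = 27  # figure quoted in the post above; verify for your environment

def total_disks(disks_per_vm):
    """Sum the virtual disks across all VMs backed up at the same time."""
    return sum(disks_per_vm.values())

def within_nfc_limit(disks_per_vm, limit=NFC_CONNECTION_LIMIT):
    """True if the concurrent disk count does not exceed the connection limit."""
    return total_disks(disks_per_vm) <= limit

# Hypothetical example: four VMs backed up concurrently, one disk each
concurrent = {"vm-a": 1, "vm-b": 1, "vm-c": 1, "vm-d": 1}

print(total_disks(concurrent), "disks, within limit:", within_nfc_limit(concurrent))
```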

We may also need to see the bpfis log from the vmbackup host. It would usually be bpfis that crashes, although your log suggests it is something else.

Kevin Lamb's picture

Hi Mark,

Just checked the logs from the servers and I cannot see any errors at all. The 4 servers only have a C drive on them and we configure the VM backup to exclude data disks. I have attached the bpfis log from the vmbackup host.

Kev

Attachment:
112212.txt (96.47 KB)

Mark_Solutions's picture

That looks OK. I believe the answer is in the disconnect between the Master and the VMBackup host when you use multiplexing, but if there are no app crashes then I cannot quite put my finger on it.

Check all events at the time of the failure on the backup host and the Master (10:21am) in case anything is cropping up, such as desktop heap, anti-virus, access protection etc., to try and pin this one down.

On the point of anti-virus, is all of NetBackup excluded?

Kevin Lamb's picture

Thanks for taking a look Mark, what I might try doing is to zone in the MSL library directly to the VMBackup Host so it is not hopping through the Linux master server and see if that results in any change, will have to wait for our SAN guy to do this change for me tomorrow.

I have not excluded anything on the VM backups as I am not too sure how you do this as the option is not available on the policy and none of the VM clients have NBU installed as we use the VSphere API.

Going to trawl through the logs and see if anything springs to mind

 

Kev

Mark_Solutions's picture

Sorry, I meant anti-virus exclusions on the VMWare backup host, in case AV was attacking the snapshots etc. or blocking processes. Something along the route kills a process somewhere.

Kevin Lamb's picture

Ah, I see now. We do use Symantec Endpoint Protection, so I will need to disable this and re-test.

 

Kevin Lamb's picture

Hi Mark,

Just disabled the SEP and re-ran the same backups using multiplexing and they fail with the same error, I will continue to try and trace anything in the logs and see if I can get a resolution.

Kevin Lamb's picture

I have checked through all the log files from the VM backup host and there is nothing out of the ordinary. I have also tried multiplexing on drives that are attached to the VM backup host and still have the same issue. Does anyone have multiplexing working on VM backups on 7.5.0.4 using the VSphere API method of Full VM Mapped backups?

Mark_Solutions's picture

Just a thought: do you have any VMWare resource limits set (Master Server Host Properties)? Just wondering if they are conflicting, i.e. the job wants to run 10 jobs, mpx wants to run 10 jobs, but the limit says you can only run 6, so the conflict makes it all fall over.

I do have one customer that does use mpx to tape with VMWare policies - as well as inline copy - and they work fine - they have made some recent changes so cannot guarantee they are VMWare policies and not Flashbackup-Windows policies though

Kevin Lamb's picture

Looks like I may have got it working now. I checked the configuration in the VMWare tab of the policy and changed the VMWare backup host from Backup Media Server to the FQDN of the backup host, and this has allowed me to multiplex to the MSL attached to the Master Server.

SOLUTION