
BMR savecfg failing with Status 26 after upgrading Master server to NetBackup 7.5.0.4

Created: 16 Oct 2012 • Updated: 27 Nov 2012 | 41 comments
This issue has been solved. See solution.

Hi, we have upgraded most of our master servers to NetBackup 7.5.0.4, and since then the parent job fails with status 1 (the internal error is status 26).

We have a ticket open with Symantec but just checking to see if anyone else has seen anything similar.

Thanks!

AK

Here is a snippet of "Detailed Status"

10/15/2012 8:31:25 PM - begin Bare Metal Restore, Start Notify Script
10/15/2012 8:31:25 PM - Info RUNCMD(pid=6884) started            
10/15/2012 8:31:26 PM - Info RUNCMD(pid=6884) exiting with status: 0         
Status 0
10/15/2012 8:31:26 PM - end Bare Metal Restore, Start Notify Script; elapsed time: 00:00:01
10/15/2012 8:31:26 PM - begin Bare Metal Restore, BMR Save
10/15/2012 8:31:26 PM - started process bpbrm (8792)
10/15/2012 8:31:30 PM - collecting BMR information
10/15/2012 8:31:30 PM - connecting
10/15/2012 8:31:31 PM - connected; connect time: 00:00:01
10/15/2012 8:31:31 PM - transferring BMR information to the master server
10/15/2012 8:31:31 PM - connecting
10/15/2012 8:31:31 PM - connected; connect time: 00:00:00
Status 26

10/15/2012 8:31:36 PM - end Bare Metal Restore, BMR Save; elapsed time: 00:00:10
10/15/2012 8:31:36 PM - begin Bare Metal Restore, Policy Execution Manager Preprocessed
Status 0
10/15/2012 8:34:52 PM - end Bare Metal Restore, Policy Execution Manager Preprocessed; elapsed time: 00:03:16
10/15/2012 8:34:52 PM - begin Bare Metal Restore, End Notify Script
10/15/2012 8:34:53 PM - Info RUNCMD(pid=10040) started            
10/15/2012 8:34:53 PM - Info RUNCMD(pid=10040) exiting with status: 0         
Status 0
10/15/2012 8:34:53 PM - end Bare Metal Restore, End Notify Script; elapsed time: 00:00:01
Status 1
10/15/2012 8:34:53 PM - end Parent Job; elapsed time: 00:03:28
the requested operation was partially successful(1)

The job was successfully completed, but some files may have been
busy or unaccessible. See the problems report or the client's logs for more details.


Marianne's picture

Not sure if this has changed in recent versions (the last time I worked closely with a customer using BMR was on 6.5), but my experience has been that the BMR master and clients had to be on the exact same NBU version for BMR backups to work.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

AKopel's picture

That's not the case, but either way, the failure happens regardless of the Client Version (we have tried upgrading all the clients as well...)

mandar_khanolkar's picture

I believe you already have the BMR master set up on your NB master using the "bmrsetupmaster" command. It is also fine if the BMR master was set up before the upgrade.

Can you please enable debug level 6 logging by setting debuglevel=6 in the nblog.conf file on your NB master server, and then restart the BMR master service? Clear the existing logs in the logs/bmrd folder and take a BMR-enabled backup.

Please post the generated bmrd/*.log file here.

thanks.

mandar

Marianne's picture

Easiest way to increase logging level without need for restarting anything:

vxlogcfg -a --prodid 51216 --orgid 119 -s DebugLevel=6 -s DiagnosticLevel=6


trv's picture

There is indeed something fishy with BMR in NBU 7.5.0.4. I have the same issue, and I have found that bpbrm is in fact segfaulting during BMR data collection. Are you by any chance running NBU on RHEL 5.8 too?

Anyway, here is the backtrace:

Core was generated by `bpbrm -backup -collect_bmr_info -c SOMECLIENT -cl win_host_sr'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000003248a78900 in strlen () from /lib64/libc.so.6
(gdb) backtrace
#0  0x0000003248a78900 in strlen () from /lib64/libc.so.6
#1  0x0000003248a46e77 in vfprintf () from /lib64/libc.so.6
#2  0x0000003248a6875a in vsnprintf () from /lib64/libc.so.6
#3  0x00002adcea11d828 in V_vsnprintf () from /usr/openv/lib/libsts.so
#4  0x00002adcec4fde89 in vovgetmsg () from /usr/openv/lib/libnbbaseST.so
#5  0x00002adcec4fe184 in ovgetmsg () from /usr/openv/lib/libnbbaseST.so
#6  0x000000000048f11d in handleBmr ()
#7  0x000000000046dbd0 in main ()
 

AKopel's picture

Ahh!! Good catch! Our master is on Windows Server 2008 R2, but yes, looking at my app log, bpbrm.exe is core dumping here as well!

I'll escalate our case.

If you open a case as well, my case number is 600-868-553 to reference as likely the same issue.

AK

mandar_khanolkar's picture

Ohh. Did you escalate this to Symantec Support? This certainly looks like a bug to me.

thanks.

mandar

trv's picture

Nope - not yet, it's not critical for us as we can simply disable bmr collection or just ignore the error - the backup itself is working just fine. But I will do it later for sure.

mandar_khanolkar's picture

It would be great if you could raise a service ticket for this problem.

thanks.

mandar

Peter Jakobs's picture

Having the same problem after upgrading to 7.5.0.4, but only with clients running Windows 2003 as a virtual machine.

Master server is Solaris 10, media server Windows 2008 R2.

Upgrading the client to version 7.5.0.4 did not change anything.

 

Peter

 

marcelg's picture

Same problem here since installing 7.5.0.4 - SuSE Linux Master Server

bpbrm[24391] general protection ip:7f9c5af8c722 sp:7ffff244b618 error:0 in libc-2.11.1.so[7f9c5af0e000+155000]

Some W2K3 and W2K8 R2 clients fail, some succeed.
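For anyone comparing crash reports: a rough sketch of how the kernel line above can be turned into an offset inside the library, assuming it follows the usual Linux fault format of `ip:<hex> ... in <object>[<base>+<size>]`. The function name `fault_offset` is made up for this example; nothing here comes from NetBackup itself.

```python
import re

# Assumption: the line uses the usual Linux kernel fault layout
# "... ip:<hex> sp:<hex> error:<hex> in <object>[<base>+<size>]".
FAULT = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+)"
    r" in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (object name, offset of the faulting ip inside it), or None."""
    m = FAULT.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    # Sanity check: the instruction pointer should fall inside the mapping.
    if not base <= ip < base + size:
        return None
    return m.group("obj"), ip - base


line = ("bpbrm[24391] general protection ip:7f9c5af8c722 sp:7ffff244b618 "
        "error:0 in libc-2.11.1.so[7f9c5af0e000+155000]")
obj, offset = fault_offset(line)
print(obj, hex(offset))
```

The offset only says where inside libc the fault happened. It is consistent with the strlen() crash in the backtrace posted earlier in the thread, but resolving it to a function name would need the symbols for that exact libc build.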

Dip's picture

I am planning to upgrade to 7.5.0.4 in a week or two in three NBU Domains. I am currently running 7.1.0.1. We do have BMR enabled in a large W2K3 and W2k8 Environment. Please let me know if you have a fix or workaround for this issue.

jim dalton's picture

Oh dear. BMR problems again it seems.

I've said it before and I'll say it again. BMR is not suitable for DR. Don't rely on it.

Do Symantec ever do any genuine organised systematic testing or is that left to the customer?

I'm on Sol10 master/media, I very recently upgraded to 7.5.0.4...the reason I did this was...you guessed it...to overcome issues with previous versions of BMR.

And I also see that my BMR phases are ending status 1.

Seriously hacked off customer.

 

mandar_khanolkar's picture

Hi Jim,

Could you please elaborate on the issue you are facing?

Hi marcelg,

HP BL460c G7 with WinPE-based recovery has some issues caused by the hardware and the Microsoft WinPE drivers.

Can you explain the problem you are facing? Is your network not coming up in the recovery environment?

Thanks.

Mandar

AKopel's picture

Still no progress yet. May have to escalate case to get some movement on it..

marcelg's picture

@mandar_khanolkar: Problem is pretty much the same as what everyone else is experiencing, i.e. backups incomplete, bpbrm -collect_bmr_info crashes.  Opened a case, but progress has been...slow

Dip's picture

I just upgraded my test NBU environment from 7.1.0.1 to 7.5.0.4 and performed full backups of some of the OSs, and they were successful. However, this environment is idle all the time as it is used for test purposes only, so I'm not sure whether load on the backup environment would give a different result for BMR data collection.

Below is what I have tested so far, and all is well.

Windows 2003 32bit        VM Guest    NBU7.1 Client
Windows 2003 32bit        VM Guest    NBU7.1.0.1 Client
Windows 2008 R2 64bit     VM Guest    NBU7.1.0.1 Client
RedHat Enterprise 5.7     DL585 G5    NBU7.1.0.1 Client

jim dalton's picture

Here's what I'm seeing. I'm caught between thinking it's worked and it's failed:

10/29/2012 21:00:00 - Info nbjm (pid=25197) starting backup job (jobid=1176056) for client w2kmango, policy PROD_W2K_Servers, schedule Daily_Inc
10/29/2012 21:00:00 - Info nbjm (pid=25197) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=1176056, request id:{9A5FB620-220B-11E2-8E3E-00144FC1D6A8})
10/29/2012 21:00:00 - requesting resource Master-wunebro-SL500-LTO4
10/29/2012 21:00:00 - requesting resource wunebro.NBU_CLIENT.MAXJOBS.w2kmango
10/29/2012 21:00:00 - requesting resource wunebro.NBU_POLICY.MAXJOBS.PROD_W2K_Servers
10/29/2012 21:00:03 - granted resource  wunebro.NBU_CLIENT.MAXJOBS.w2kmango
10/29/2012 21:00:03 - granted resource  wunebro.NBU_POLICY.MAXJOBS.PROD_W2K_Servers
10/29/2012 21:00:03 - granted resource  004631
10/29/2012 21:00:03 - granted resource  IBM.ULTRIUM-TD4.002
10/29/2012 21:00:03 - granted resource  Master-wunebro-SL500-LTO4
10/29/2012 21:00:10 - estimated 782486 kbytes needed
10/29/2012 21:00:10 - begin Parent Job
10/29/2012 21:00:10 - begin Bare Metal Restore: Start Notify Script
10/29/2012 21:00:10 - Info RUNCMD (pid=22664) started
10/29/2012 21:00:10 - Info RUNCMD (pid=22664) exiting with status: 0
Operation Status: 0
10/29/2012 21:00:10 - end Bare Metal Restore: Start Notify Script; elapsed time 0:00:00
10/29/2012 21:00:10 - begin Bare Metal Restore: Bare Metal Restore Save
10/29/2012 21:00:12 - started process bpbrm (pid=22672)
10/29/2012 21:00:16 - collecting BMR information
10/29/2012 21:00:16 - connecting
10/29/2012 21:00:16 - connected; connect time: 0:00:00
10/29/2012 21:00:16 - transfering BMR information to the master server
10/29/2012 21:00:16 - connecting
10/29/2012 21:00:16 - connected; connect time: 0:00:00
10/29/2012 21:00:58 - BMR information transfer successful
10/29/2012 21:00:58 - Info bmrsavecfg (pid=0) done. status: 0: the requested operation was successfully completed
10/29/2012 21:00:58 - end writing
Operation Status: 0
10/29/2012 21:00:58 - end Bare Metal Restore: Bare Metal Restore Save; elapsed time 0:00:48
10/29/2012 21:00:58 - begin Bare Metal Restore: Policy Execution Manager Preprocessed
Operation Status: 1

10/29/2012 21:25:57 - end Bare Metal Restore: Policy Execution Manager Preprocessed; elapsed time 0:24:59
10/29/2012 21:25:57 - begin Bare Metal Restore: End Notify Script
10/29/2012 21:25:57 - Info RUNCMD (pid=718) started
10/29/2012 21:25:57 - Info RUNCMD (pid=718) exiting with status: 0
Operation Status: 0
10/29/2012 21:25:57 - end Bare Metal Restore: End Notify Script; elapsed time 0:00:00
Operation Status: 1

10/29/2012 21:25:57 - end Parent Job; elapsed time 0:25:47
the requested operation was partially successful  (1)
 

The data backup itself ends with status 1; it has issues with some of the data on the server. I need to look deeper...

Dip's picture

Pasted below from your log above is the BMR config upload. It started at 21:00:10 and completed at 21:00:58. This includes collecting the configuration from the client and sending it to the master server successfully. Then another job kicked off which actually performed the backup of your client, and that job was partially completed. If you check the job details of that second job you will see what was skipped.

Your backup was successful as far as BMR is concerned; the actual backup skipped some files.

10/29/2012 21:00:10 - begin Bare Metal Restore: Bare Metal Restore Save
10/29/2012 21:00:12 - started process bpbrm (pid=22672)
10/29/2012 21:00:16 - collecting BMR information
10/29/2012 21:00:16 - connecting
10/29/2012 21:00:16 - connected; connect time: 0:00:00
10/29/2012 21:00:16 - transfering BMR information to the master server
10/29/2012 21:00:16 - connecting
10/29/2012 21:00:16 - connected; connect time: 0:00:00
10/29/2012 21:00:58 - BMR information transfer successful
10/29/2012 21:00:58 - Info bmrsavecfg (pid=0) done. status: 0: the requested operation was successfully completed

10/29/2012 21:00:58 - end writing
Operation Status: 0
10/29/2012 21:00:58 - end Bare Metal Restore: Bare Metal Restore Save; elapsed time 0:00:48

jim dalton's picture

Hello Dip

But....

10/29/2012 21:00:58 - begin Bare Metal Restore: Policy Execution Manager Preprocessed
Operation Status: 1

 

This is _before_ the data backup, so it can only refer to the BMR phase?

The data backup didn't finish until 21:25, so the status 1 above can't possibly refer to anything but the BMR step, don't you agree?

The messages are somewhat confusing:

10/29/2012 21:25:57 - end Bare Metal Restore: End Notify Script; elapsed time 0:00:00
Operation Status: 1

 

Why is it bringing up the subject of BMR after the data phase? As I understood it, the BMR phase is complete before the data phase. Searching for info regarding the BMR start/end notify scripts...!

Jim
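To make it easier to see which phase each status line belongs to, here is a small sketch that pairs every "Status N" / "Operation Status: N" line with the innermost phase that is still open at that point. That pairing rule is only an assumption about how the detailed status is laid out (it is not documented behaviour), and the function name `phase_statuses` is made up for this example. The sample is trimmed from the job details above.

```python
import re

BEGIN = re.compile(r" - begin (.+)$")
END = re.compile(r" - end (.+?); elapsed time")
# Matches both "Status 26" and "Operation Status: 1" on a line of their own.
STATUS = re.compile(r"^(?:Operation )?Status:? (\d+)\s*$")


def phase_statuses(lines):
    """Pair each status line with the innermost phase open at that point."""
    open_phases = []   # stack of phases that have begun but not yet ended
    results = []
    for raw in lines:
        line = raw.strip()
        m = BEGIN.search(line)
        if m:
            open_phases.append(m.group(1))
            continue
        m = STATUS.match(line)
        if m:
            phase = open_phases[-1] if open_phases else "(no open phase in excerpt)"
            results.append((phase, int(m.group(1))))
            continue
        m = END.search(line)
        if m and open_phases and open_phases[-1] == m.group(1):
            open_phases.pop()
    return results


sample = """\
10/29/2012 21:00:10 - begin Parent Job
10/29/2012 21:00:10 - begin Bare Metal Restore: Start Notify Script
Operation Status: 0
10/29/2012 21:00:10 - end Bare Metal Restore: Start Notify Script; elapsed time 0:00:00
10/29/2012 21:00:10 - begin Bare Metal Restore: Bare Metal Restore Save
10/29/2012 21:00:58 - BMR information transfer successful
10/29/2012 21:00:58 - end writing
Operation Status: 0
10/29/2012 21:00:58 - end Bare Metal Restore: Bare Metal Restore Save; elapsed time 0:00:48
10/29/2012 21:00:58 - begin Bare Metal Restore: Policy Execution Manager Preprocessed
Operation Status: 1
10/29/2012 21:25:57 - end Bare Metal Restore: Policy Execution Manager Preprocessed; elapsed time 0:24:59
10/29/2012 21:25:57 - begin Bare Metal Restore: End Notify Script
Operation Status: 0
10/29/2012 21:25:57 - end Bare Metal Restore: End Notify Script; elapsed time 0:00:00
Operation Status: 1
10/29/2012 21:25:57 - end Parent Job; elapsed time 0:25:47
"""

results = phase_statuses(sample.splitlines())
for phase, status in results:
    print(status, phase)
```

Read this way, the first status 1 sits inside "Policy Execution Manager Preprocessed", which runs from 21:00:58 to 21:25:57, and the last status 1 belongs to the parent job rather than to the End Notify Script. It does not say what that phase actually does during those 25 minutes.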

Dip's picture

I missed that Status: 1 after the BMR information was transferred. You are correct, something did not work as far as BMR is concerned in this job.

mandar_khanolkar's picture

Looking at the job details provided by Jim, I do see that the BMR configuration backup has been successfully executed. If the BMR backup had failed, the "BMR information transfer successful" message would not appear.

I hope you can also see the client entry under the Bare Metal Restore -> Clients menu in your admin GUI.

I suspect something is wrong with the NB policy execution manager here, which is showing this error.

Only debug logs can reveal the problem.

Thanks.

Mandar

jim dalton's picture

Mandar

Yes, I've checked the status in the BMR clients and the timestamps suggest the BMR config information is indeed correctly transferred.

I've raised a call with support, but support is painfully slow these days. Getting information from support in a timely manner just doesn't happen; the forum is much more responsive.

Jim

jim dalton's picture

To go back to the original thread, I also have just four clients exhibiting AKopel's behaviour, ie error 26 on save so I'm interested in a fix for that. I'm so glad I upgraded to 7.5.0.4.

Jim

mandar_khanolkar's picture

Hi Jim

Is it possible for you to provide the support case number here?

Thanks.

Mandar

jim dalton's picture

I have received some feedback from support , we are making progress.

The error 26 issue seems to be related to the client and the server talking to each other: the bundle is created OK and can be imported onto the master if done manually, i.e. after copying it over by hand.

Error 1...nothing to tell as yet. 

Jim

Peter Jakobs's picture

Does anybody have any news?

Another ten days have passed and support has disappeared. Case number 420-213-118.

I really doubt that my company is going to pay so much next year for 'support'.

Peter

AKopel's picture

We FINALLY got our case escalated and they are collecting the application core dumps for bpbrm... hopefully will have some progress soon.

 

AK

marcelg's picture

Same here, i.e. finally got the case escalated.  What I find mystifying is why it took so long for support to acknowledge that it might be a bug.  In what world is an easily repeatable core dump not a bug?

dezzer's picture

We are currently experiencing the same issues under case 02818877. All with HP BL460c G7 servers.

Unable to do the BMR dumps until we raise a change for the registry edits.

Not the only ones out there with this issue.

Must have been some code change in the NetBackup software, as these backups worked fine before we upgraded to 7.5.0.4.

dezzer's picture

Been told to do this:

Please enable the registry key "LocalDumps" by following the instruction shown in:

http://msdn.microsoft.com/en-us/library/windows/de...

So we can get BMR process crash dumps being saved.

dezzer's picture

We need to set a registry key to get the BMR dumps working:

From Symantec support:
Please enable the registry key "LocalDumps" by following the instruction shown in:
http://msdn.microsoft.com/en-us/library/windows/de...

We will then need to send them the logs for bpbrm.exe crashes.
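For reference, a rough sketch of what enabling "LocalDumps" for bpbrm.exe could look like. This is based on my understanding of the Windows Error Reporting LocalDumps settings, not on anything support sent, so check it against the Microsoft page linked above before using it. The dump folder `C:\CrashDumps`, the dump count and both function names are placeholders made up for this example.

```python
# Assumption: Windows Error Reporting reads per-application settings from
# HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\<exe>.
LOCALDUMPS_KEY = r"SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps"


def localdumps_settings(exe_name="bpbrm.exe", dump_folder=r"C:\CrashDumps",
                        dump_count=10, full_dump=True):
    """Build the subkey path and values; the defaults are placeholders."""
    subkey = LOCALDUMPS_KEY + "\\" + exe_name
    values = {
        "DumpFolder": ("REG_EXPAND_SZ", dump_folder),
        "DumpCount": ("REG_DWORD", dump_count),
        # DumpType 2 requests a full dump, 1 a mini dump.
        "DumpType": ("REG_DWORD", 2 if full_dump else 1),
    }
    return subkey, values


def apply_settings(subkey, values):
    """Write the values under HKLM. Windows only, needs admin rights."""
    import winreg  # imported here so the rest of the sketch loads anywhere
    kinds = {"REG_EXPAND_SZ": winreg.REG_EXPAND_SZ,
             "REG_DWORD": winreg.REG_DWORD}
    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, subkey, 0,
                            winreg.KEY_SET_VALUE) as key:
        for name, (kind, data) in values.items():
            winreg.SetValueEx(key, name, 0, kinds[kind], data)


subkey, values = localdumps_settings()
print(subkey)
for name, (kind, data) in values.items():
    print(name, kind, data)
# apply_settings(subkey, values)  # uncomment on the Windows master once the change is approved
```

Building the values separately from writing them makes it possible to review exactly what will go into the registry (for the change request) before anything is touched.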

Stavros41's picture

Same here

Just deployed NBU 7.5.0.4 at a customer and exactly the same problem.

Tested servers so far are all HP 460c G7 Servers

 

 

 

marcelg's picture

Hi,

Symantec support has provided an EEB (etrack 2991238) that resolves this issue.

SOLUTION
Rupesh Patel's picture

Where can I download the EEB (etrack 2991238)? I too face the same issue with BMR backups on 7.5.0.4.

CRZ's picture

You will have to open a Support case requesting this EEB - be ready to provide logs demonstrating you're hitting this issue.  Feel free to reference this thread since we don't have a TechNote prepared yet.

