
Oracle RMAN backups not writing logs to dbclient

Created: 09 Nov 2009 • Updated: 21 May 2010 | 21 comments
This issue has been solved. See solution.

I am troubleshooting a backup failure for one of our hot db backups.  The DBA has verified that the libraries have been linked correctly, and to make sure, he relinked them.  I've got VERBOSE = 5 in the bp.conf file.

The flat file backups to this host work without an issue; however, the sparse details we get in the activity monitor suggest it is some type of connectivity issue (child streams fail with status 58 and the following message):

11/09/2009 14:50:40 - Error bprd (pid=9942) Unable to write progress log </usr/openv/netbackup/logs/user_ops/dbext/logs/16212.0.1257777824> on client <clientname-bkup>. Policy=NONE Sched=NONE
11/09/2009 14:50:40 - Error bprd (pid=9942) CLIENT <clientname-bkup>  POLICY NONE  SCHED NONE  EXIT STATUS 58 (can't connect to client)

Although it does not look like it from the error above, the necessary NetBackup parameters are being passed to Oracle: I can see the master server name listed for "NB_ORA_SERV" and the policy listed for "NB_ORA_POLICY" in the RMAN error log.

The RMAN error stack also indicates the same type of connectivity issue.  I'm curious as to what could be preventing the logs from being written to the /usr/openv/netbackup/logs/dbclient directory on the host.  It has the same permissions as the bpcd directory (755), which is having logs written to it.  I'm assuming that for some reason NBU and Oracle are not communicating as they should, but I'm having difficulty nailing down where the disconnect might be.

RMAN log:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch01 channel at 11/09/2009 14:58:50
ORA-19506: failed to create sequential file, name="bk_96_1_702485024", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
   VxBSACreateObject: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
 
RMAN>

Also, I've made sure the NB_ORA_SCHED parameter does not need to be passed to Oracle: I deleted one of the two schedules I had set up, so there is only one schedule that I am attempting to back up.

Any thoughts on the issue?


rjrumfelt's picture

So to add to it, I think there may be an issue with one of the SBT_TAPE channels in Oracle, however I have no clue as to how to track that down.

Andy Welburn's picture

look at the top of the post as opposed to the bottom :)

Just under the 'Forums' tab?

rjrumfelt's picture

I did not see that tab up there until you pointed it out

:)

Nicolai's picture

I think you get the "Unable to write progress log" message because NetBackup can't connect to the client's bpcd daemon (multiple network cards?).

Try to run this RMAN script; it's boiled down to the minimum. If it works, we know we have connectivity.

rman nocatalog

connect target
run {
allocate channel ch00 type 'sbt_tape';
send 'NB_ORA_POLICY={YOUR_POLICY}, NB_ORA_SCHED={YOUR_SCHED}';
backup current controlfile;
release channel ch00;
}

But do verify Oracle has write permissions to /usr/openv/netbackup/logs/user_ops/dbext

The security guys we have don't like 777 permissions so I always do a "chown root:dba dbext" and "chmod 775 dbext"
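One way to confirm the write-permission point above is to test the directories while logged in as the Oracle OS user. This is only a minimal sketch, not an official NetBackup procedure: the function name `check_writable` is made up for illustration, and the example paths are the ones mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can write to each
# directory given as an argument. Run it as the Oracle OS user (the account
# that starts RMAN), since that is the user that must be able to write logs.
check_writable() {
    rc=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "OK    $dir"
        else
            echo "FAIL  $dir"
            rc=1
        fi
    done
    return "$rc"
}

# Example (paths taken from this thread):
# check_writable /usr/openv/netbackup/logs/user_ops/dbext /usr/openv/netbackup/logs/dbclient
```

A non-zero return means at least one directory is missing or not writable for the user running the check.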

Assumption is the mother of all mess ups.

If this post answered your question, please mark it as a solution.

rjrumfelt's picture

setting the "REQUIRED_INTERFACE" variable in the bp.conf file to the backup interface of the client.  I'll pass your script on to the DBA and see what he can do with it.
 

Nicolai's picture

If your master/media server is on the same subnet as the Oracle host, REQUIRED_INTERFACE should do it; if not, you need to add a static route.

Also make sure that in both NetBackup and RMAN you specify the DNS hostname of the NIC you intend to use.


rjrumfelt's picture

as client.

/usr/openv/netbackup/logs/user_ops/dbext has permissions set to 777

Could it be a firewall/routing issue even though I can connect and back up without issue when attempting a flat file backup of the system?

Marianne's picture

Does dbclient have 777 permission? Backups will fail if directory is present and oracle user does not have write access.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

Nicolai's picture

File system backups and Oracle backups work very differently.  Is it possible (as a test) to unplumb (un-configure) the secondary NIC to see if that changes anything?  If that's not possible, you can use tcpdump to see the traffic flow.
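For the tcpdump suggestion, a capture filter limited to the NetBackup daemon ports keeps the output readable. The sketch below only builds the filter string. It assumes the usual default ports (13720 for bprd, 13724 for vnetd, 13782 for bpcd), and `nbu_filter`, `eth1` and `master-bkup` are placeholder names, not values from this environment.

```shell
#!/bin/sh
# Hypothetical helper: build a tcpdump capture filter for traffic between this
# host and one NetBackup server. The ports are the usual defaults (assumption):
# 13720 = bprd, 13724 = vnetd, 13782 = bpcd.
nbu_filter() {
    server=$1
    echo "host $server and (port 13720 or port 13724 or port 13782)"
}

# Example, run as root on the Oracle host while the RMAN job starts.
# "eth1" and "master-bkup" stand in for the backup NIC and the master server:
# tcpdump -n -i eth1 "$(nbu_filter master-bkup)"
```

Running the capture once per interface shows which NIC the client actually uses to reach the master.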


Marianne's picture

I agree with Nicolai - have a look at bprd log file on the master to see IP address of client that's connecting to master to request backup. Then see if master can resolve that IP address back to the same hostname that is listed in the policy config. Also check reverse lookup on the media server for incoming IP address that you see in bprd log on the master.
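The reverse-lookup check described above can be sketched with the system resolver. This is an illustration under assumptions: `same_host` and `reverse_name` are made-up helper names, `getent` is assumed to be available on the master and media server, and the IP address in the example is a placeholder for whatever shows up in the bprd log. NetBackup's own `bpclntcmd -ip` and `bpclntcmd -hn` can be used for the same comparison.

```shell
#!/bin/sh
# Hypothetical helper: compare two host names case-insensitively.
# Prints "match" or "mismatch".
same_host() {
    a=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    b=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    if [ "$a" = "$b" ]; then echo match; else echo mismatch; fi
}

# Hypothetical helper: reverse-resolve an IP address with the system resolver.
# getent prints "IP name [aliases...]"; the second field is the canonical name.
reverse_name() {
    getent hosts "$1" | awk '{print $2; exit}'
}

# Example: 10.0.0.5 stands in for the client IP seen in the bprd log, and
# clientname-bkup for the client name used in the policy.
# same_host "$(reverse_name 10.0.0.5)" clientname-bkup
```

A "mismatch" here would mean the server resolves the incoming IP to a different name than the one in the policy.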


rjrumfelt's picture

bprd isn't logging.  I've got the directory there, and the verbosity on the master is cranked up to 5.

rjrumfelt's picture

the secondary nic)

in the meantime, below is a snippet from dbclient, when things start heading downhill:

16:56:11.808 [22197] <4> readCommMessages: Entering readCommMessages
17:11:16.701 [22198] <16> readCommFile: ERR - timed out after 900 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/22198.0.1257785771
17:11:16.702 [22198] <32> serverResponse: ERR - could not read from comm file </usr/openv/netbackup/logs/user_ops/dbext/logs/22198.0.1257785771>
17:11:16.702 [22198] <16> CreateNewImage: ERR - serverResponse() failed
17:11:16.702 [22198] <4> closeApi: entering closeApi.
17:11:16.702 [22198] <4> closeApi: INF - EXIT STATUS 6: the backup failed to back up the requested files
17:11:16.702 [22198] <16> VxBSACreateObject: ERR - Could not create new image with file /bk_102_1_702492971.
17:11:16.702 [22198] <2> xbsa_ProcessError: INF - entering
17:11:16.702 [22198] <2> xbsa_ProcessError: INF - leaving
17:11:16.702 [22198] <16> xbsa_CreateObject: ERR - VxBSACreateObject: Failed with error: Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
17:11:16.702 [22198] <2> xbsa_CreateObject: INF - leaving (3)

but looking at /usr/openv/netbackup/logs/user_ops/dbext/logs, I've got permissions blown out (777):

****EDIT****

bpbrm and bpcd are also clean on the master server
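With VERBOSE = 5 the dbclient log gets large, so it can help to pull out only the higher-severity lines, like the `<16>` and `<32>` entries that carry the ERR messages in the excerpt above. A minimal sketch, assuming the line layout shown in that excerpt (timestamp, PID in brackets, severity in angle brackets); `nbu_errors` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: print only the <16> and <32> lines from a NetBackup
# debug log whose lines look like:
#   17:11:16.702 [22198] <16> CreateNewImage: ERR - serverResponse() failed
nbu_errors() {
    grep -E '^[0-9:.]+ \[[0-9]+\] <(16|32)> ' "$1"
}

# Example: pass the dbclient log file for the day of the failed backup.
# nbu_errors /path/to/dbclient/logfile
```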

Marianne's picture

This error sends me straight to bprd log on the master:
<16> CreateNewImage: ERR - serverResponse() failed
and
Communication with the server has not been initiated or the server status has not been retrieved from the server


rjrumfelt's picture

I'm not getting anything generated under bprd on the master.  The directory is blank.  Double checked and VERBOSE = 5 is still in the bp.conf file.

Marianne's picture

You need to stop/start NetBackup after creating bprd directory on master.
You can restart bprd as follows:
bprdreq -terminate ;initbprd

Only client-initiated actions will be affected during stop/start (not running backups).
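The steps above (create the bprd log directory, then bounce bprd) can be wrapped in a small script. This is a sketch under assumptions, not a supported procedure: the paths are the usual defaults under /usr/openv and may differ on your master, and `bounce_bprd` is a made-up wrapper that only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Assumed default install paths; adjust if NetBackup lives elsewhere.
NBU_BIN=${NBU_BIN:-/usr/openv/netbackup/bin}
BPRD_LOG_DIR=${BPRD_LOG_DIR:-/usr/openv/netbackup/logs/bprd}

# Hypothetical wrapper: prints each command; executes it only when RUN=1.
bounce_bprd() {
    for cmd in \
        "mkdir -p $BPRD_LOG_DIR" \
        "$NBU_BIN/admincmd/bprdreq -terminate" \
        "$NBU_BIN/initbprd"
    do
        echo "+ $cmd"
        if [ "${RUN:-0}" = "1" ]; then
            $cmd || return 1
        fi
    done
}

# Example, as root on the master server:
# bounce_bprd          # dry run, just shows the commands
# RUN=1 bounce_bprd    # actually creates the directory and restarts bprd
```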


SOLUTION
rjrumfelt's picture

So it is now working correctly.

The fix?  Bouncing bprd.  I bounced bprd per Marianne's suggestion above, in order to get my bprd logs running.  An unintended side effect was that bouncing bprd actually fixed the issue.  Not sure how, but it did, so I'm going to give Marianne the nod, and thank you Nicolai for your suggestions as well.

****EDIT****

Now does anyone know why bouncing bprd might have resolved the problem?

Marianne's picture

I expected the bprd log to point out a lookup issue when the master server tries to resolve the incoming IP address from the client.
The only logical explanation is that bprd might have been in some hung state... I'm as stunned as you are!!


Nicolai's picture

Are you running LDAP authentication?

If the LDAP server has been unresponsive, bad things happen with NetBackup. Most symptoms point to network connectivity, but when you bounce NetBackup all the problems are gone.


Claudio Veronezi's picture

Did you use all of the suggestions in this thread to solve it?

Was networking part of the problem?

I'm having the same problem, and the fix didn't work for me.

Thanks

Claudio Veronezi Mendes
IT Manager at Lb2 Consultoria
Londrina - Pr - Brazil
 

kfh's picture

There is a "solution" flag for this thread in the forum, but the solution link jumps to restarting the NetBackup daemons after creating the bprd log directory.  Has the issue of being unable to write logs to the /usr/openv/netbackup/logs/dbclient directory actually been solved?  I'm experiencing the same issue with a client's RMAN backups.

Marianne's picture

What are the permissions on dbclient log? The oracle user MUST have write access. Do 'chmod 777 dbclient'.
