RMAN backup status code 41

Created: 01 Mar 2012 • Updated: 06 Mar 2012 | 18 comments
This issue has been solved. See solution.

Hi all,

I have an Oracle 11.2 DB on AIX 6.1 with the NetBackup 7.1 client, and a Windows 2008 R2 host running NetBackup 7.1 as the Master Server. There are no firewalls between client and server.

I created an Oracle policy with the wizard; the policy has 2 schedules, Default-Application-Backup and Full.

I tried to test a manual RMAN backup by running the following script:

RUN {
 # Control file backup
 ALLOCATE CHANNEL ch00
     TYPE 'SBT_TAPE';
 SEND 'NB_ORA_CLIENT=vs-ora00-03a.hosting.local,NB_ORA_SID=dev,NB_ORA_POLICY=test,NB_ORA_SERV=ps-bcp00-01a.hosting.local,NB_ORA_SCHED=Default-Application-Backup';
 BACKUP
     FORMAT 'ctrl_u%u_s%s_p%p_t%t'
     CURRENT CONTROLFILE;
 RELEASE CHANNEL ch00;
 }
 
The job failed with these messages:
 
01.03.2012 12:46:45 - Info nbjm(pid=4940) starting backup job (jobid=136) for client vs-ora00-03a.hosting.local, policy test, schedule Default-Application-Backup  
01.03.2012 12:46:45 - Info nbjm(pid=4940) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=136, request id:{E7DD9B33-914E-4B82-9182-7AA86DF4ED00})  
01.03.2012 12:46:45 - requesting resource ps-bcp00-01a-local-disk-2tb
01.03.2012 12:46:45 - requesting resource ps-bcp00-01a.hosting.local.NBU_CLIENT.MAXJOBS.vs-ora00-03a.hosting.local
01.03.2012 12:46:45 - requesting resource ps-bcp00-01a.hosting.local.NBU_POLICY.MAXJOBS.test
01.03.2012 12:46:45 - granted resource ps-bcp00-01a.hosting.local.NBU_CLIENT.MAXJOBS.vs-ora00-03a.hosting.local
01.03.2012 12:46:45 - granted resource ps-bcp00-01a.hosting.local.NBU_POLICY.MAXJOBS.test
01.03.2012 12:46:45 - granted resource MediaID=@aaaab;DiskVolume=D:\;DiskPool=Local_Storage_2TB;Path=D:\;StorageServer=ps-bcp00-01a.hosting.local;MediaServer=ps-bcp00-01a.hosting.local
01.03.2012 12:46:45 - granted resource ps-bcp00-01a-local-disk-2tb
01.03.2012 12:46:45 - estimated 0 Kbytes needed
01.03.2012 12:46:45 - Info nbjm(pid=4940) started backup job for client vs-ora00-03a.hosting.local, policy test, schedule Default-Application-Backup on storage unit ps-bcp00-01a-local-disk-2tb
01.03.2012 12:46:45 - started process bpbrm (5676)
01.03.2012 12:46:46 - Info bpbrm(pid=5676) vs-ora00-03a.hosting.local is the host to backup data from     
01.03.2012 12:46:46 - Info bpbrm(pid=5676) reading file list from client        
01.03.2012 12:46:46 - connecting
01.03.2012 12:46:48 - Info bpbrm(pid=5676) listening for client connection         
01.03.2012 12:46:54 - Info bpbrm(pid=5676) INF - Client read timeout = 300      
01.03.2012 12:46:54 - Info bpbrm(pid=5676) accepted connection from client         
01.03.2012 12:46:54 - Info bphdb(pid=18219170) Backup started           
01.03.2012 12:46:54 - Info bptm(pid=5624) start            
01.03.2012 12:46:54 - Info bptm(pid=5624) using 262144 data buffer size        
01.03.2012 12:46:54 - Info bptm(pid=5624) setting receive network buffer to 1049600 bytes      
01.03.2012 12:46:54 - Info bptm(pid=5624) using 30 data buffers         
01.03.2012 12:46:54 - connected; connect time: 00:00:08
01.03.2012 12:46:55 - Info bptm(pid=5624) start backup           
01.03.2012 12:46:55 - Info bptm(pid=5624) backup child process is pid 2232.6728       
01.03.2012 12:46:55 - Info bptm(pid=2232) start            
01.03.2012 12:46:55 - begin writing
01.03.2012 12:52:00 - end writing; write time: 00:05:05
01.03.2012 12:52:05 - Info bphdb(pid=18219170) done. status: 41: network connection timed out      
network connection timed out(41)
 
Script output:
using target database control file instead of recovery catalog
allocated channel: ch00
channel ch00: SID=224 device type=SBT_TAPE
channel ch00: Veritas NetBackup for Oracle - Release 7.1 (2011020322)
 
sent command to channel: ch00
 
Starting backup at 01-MAR-12
channel ch00: starting full datafile backup set
channel ch00: specifying datafile(s) in backup set
including current control file in backup set
channel ch00: starting piece 1 at 01-MAR-12
released channel: ch00
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch00 channel at 03/01/2012 13:02:05
ORA-19506: failed to create sequential file, name="ctrl_u26n4pg5f_s70_p1_t776781999", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
   VxBSACreateObject: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
 
Recovery Manager complete.
 
 
 

Mark_Solutions wrote:

This times out after just a little over 5 minutes.

Increase the Client Read Timeout setting on your Media Servers to see if that resolves it for you.

Use 1800 or 3600 to make sure it has time to do its pre-processing.
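
The "little over 5 minutes" can be read straight off the job details above: writing began at 12:46:55 and ended at 12:52:00, just past the "Client read timeout = 300" that bpbrm reported. A minimal shell sketch of that arithmetic follows. The function names to_seconds and elapsed are made up for illustration, and it assumes both timestamps fall on the same day.

```shell
# Convert an HH:MM:SS timestamp to seconds since midnight.
to_seconds() {
    echo "$1" | awk -F: '{ print ($1 * 3600) + ($2 * 60) + $3 }'
}

# Elapsed seconds between two same-day timestamps: elapsed START END
elapsed() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

# "begin writing" and "end writing" times from the job details
elapsed 12:46:55 12:52:00   # prints 305
```

A gap that lands a few seconds after the configured timeout, with no data written, suggests the job is waiting out the timer rather than failing on a slow transfer.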

Authorised Symantec Consultant

Don't forget to "Mark as Solution" if someone's advice has solved your issue - and please bring back the Thumbs Up!!.

Alex_al wrote:

I read the Symantec technotes and tried increasing the timeout, but nothing changed.

When I run a test RMAN backup to local disk, the operation takes a second.

 

RMAN> RUN {
 # Control file backup
 ALLOCATE CHANNEL ch00 TYPE DISK;
 BACKUP FORMAT '/u01/app/oracle/ctrl_u%u_s%s_p%p_t%t'
 CURRENT CONTROLFILE;
 RELEASE CHANNEL ch00;
 }
2> 3> 4> 5> 6> 7>
using target database control file instead of recovery catalog
allocated channel: ch00
channel ch00: SID=99 device type=DISK
 
Starting backup at 01-MAR-12
channel ch00: starting full datafile backup set
channel ch00: specifying datafile(s) in backup set
including current control file in backup set
channel ch00: starting piece 1 at 01-MAR-12
channel ch00: finished piece 1 at 01-MAR-12
piece handle=/u01/app/oracle/ctrl_u2dn4pq0u_s77_p1_t776792094 tag=TAG20120301T153454 comment=NONE
channel ch00: backup set complete, elapsed time: 00:00:01
Finished backup at 01-MAR-12
 
released channel: ch00

Marianne wrote:

We will need server logs to see if there are any communication problems between Oracle client and master as well as media server (if they are different).

On master: bprd (NBU needs to be restarted after folder is created)

On media server: bptm and bpbrm (no need to restart)

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

Alex_al wrote:

Logs from server attached

Attachment           Size
bpbrm_log.txt.txt    12.22 KB
bprd_log.txt.txt     1.28 KB
bptm_log.txt.txt     145.48 KB

Mark_Solutions wrote:

The log shows the following:

bpbrm timeout after 300 seconds

In the admin console, go to Host Properties and then to the Media Servers section.

Connect to a media server, go to the Timeouts tab, and set the Client Read Timeout to 1800.

Repeat for all Media Servers

Marianne wrote:

I agree with Mark. Although local backup to disk is quite fast, there are a lot more processes involved when backing up to NBU.

There does not seem to be any comms failure, just a genuine timeout. Symantec recommends increasing the Client Connect as well as the Client Read timeouts to 1800 for database backups. The timeouts need to be changed on the media server.

There seems to be a 900 sec timeout on the client as well - please check bp.conf in the netbackup folder as well as the oracle user's $HOME/bp.conf (if it exists). Timeouts on server and client should match.

BTW - bprd log file is basically empty - required info will only be logged to bprd on master once NBU is restarted after log folder is created.
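
One way to compare the client-side values is to list every timeout entry in both bp.conf files. This is only a sketch: show_timeouts is a hypothetical helper name, /usr/openv/netbackup/bp.conf is the usual UNIX client location and is an assumption about this AIX install, and it should be run as the Oracle user so that $HOME points at that user's home directory.

```shell
# Print timeout-related entries from each bp.conf that exists and is readable.
# Files that are missing are skipped silently.
show_timeouts() {
    for conf in "$@"; do
        if [ -r "$conf" ]; then
            echo "== $conf"
            grep -i 'TIMEOUT' "$conf" || echo "(no timeout entries)"
        fi
    done
}
```

Typical use would be: show_timeouts /usr/openv/netbackup/bp.conf "$HOME/bp.conf". Any value that differs from the one set on the media server is worth a closer look.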

Alex_al wrote:

I changed the timeout value to 1800 on both the server and the client side but get the same error.

 

Attachment           Size
bpbrm_log.txt.txt    57.54 KB
bprd_log.txt.txt     145.63 KB
bptm_log.txt.txt     374.41 KB

Mark_Solutions wrote:

OK - well now you have: bpbrm readline: bpbrm timeout after 1800 seconds

So still not enough to do what it has to do - try 3600

Alex_al wrote:

I tried increasing the timeout but the result did not change.

Attachment           Size
bpbrm_log.txt.txt    57.54 KB
bprd_log.txt.txt     57.54 KB
bptm_log.txt.txt     1.63 MB

Marianne wrote:

Are any other backups to this STU successful?

Seems bptm never received any data for this job. Last entries in bptm for this backup:

 15:12:09.024 [4480.4976] <4> write_backup: begin writing backup id vs-ora00-03a_1330686717, copy 1, fragment 1, destination path D:\
15:12:09.024 [4480.4976] <2> signal_parent: set bpbrm media ready event (pid = 2296)
15:12:09.026 [4480.4976] <2> write_data: twin_index: 0 active: 1 dont_process: 0 wrote_backup_hdr: 0 finished_buff: 0 saved_cindex: -1 twin_is_disk 1 delay_brm: 0
15:12:09.026 [4480.4976] <2> write_data: Total Kbytes transferred 0
15:12:09.026 [4480.4976] <2> ndmp_setup_for_write: CINDEX 0, TWIN_INDEX 0, IS_NDMP 0, is_tir 0 

bpbrm simply times out 30 minutes later.
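
A quick way to confirm that no data arrived is to pull the "Total Kbytes transferred" values for the bptm process out of a copy of the log. A rough sketch, assuming a POSIX shell and a local copy of the log; kbytes_for_pid is a placeholder name, not a NetBackup tool.

```shell
# Print the "Total Kbytes transferred" values logged by one bptm process.
# $1 = bptm PID as it appears first inside the brackets, e.g. 4480 in [4480.4976]
# $2 = path to a copy of the bptm log
kbytes_for_pid() {
    grep "\[$1\." "$2" |
        sed -n 's/.*Total Kbytes transferred \([0-9][0-9]*\).*/\1/p'
}
```

For the entries quoted above, running kbytes_for_pid 4480 against a copy of the attached bptm log would be expected to show 0 for this job.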

What are the dbclient and user_ops logs on the client saying at this point in time?

/usr/openv/netbackup/logs/user_ops/dbext/logs/24772856.0.1330686712

Alex_al wrote:

I created a test file backup policy on the same STU on Thursday and it worked fine, but today (5-mar-2012) it fails with this status:

05.03.2012 11:31:44 - Error nbjm(pid=5116) NBU status: 219, EMM status: Storage unit is not defined in EMM

the required storage unit is unavailable(219)
 
Content of /usr/openv/netbackup/logs/user_ops/dbext/logs/24772856.0.1330686712
 
15:11:54 Initiating backup
15:12:03 INF - Data socket = ps-bcp00-01a.hosting.local.IPC:/tmp/vnet-41538330686721529053000000001--Zaaib;6adfae188e7102b49a695c2b48eced01;10;300
15:12:04 INF - Name socket = ps-bcp00-01a.hosting.local.IPC:/tmp/vnet-69216330686721929190000000001--Zaaab;74f4ab12cbd2b992a46aec5a62813b4c;10;300
15:12:04 INF - Job id = 161
15:12:04 INF - Backup id = vs-ora00-03a_1330686717
15:12:04 INF - Backup time = 1330686717
15:12:04 INF - Policy name = test
15:12:05 INF - Snapshot = 0
15:12:05 INF - Frozen image = 0
15:12:05 INF - Backup copy = 0
15:12:05 INF - Master server = ps-bcp00-01a.hosting.local
15:12:05 INF - Media server = ps-bcp00-01a.hosting.local
15:12:06 INF - Multiplexing = 0
15:12:06 INF - New data socket = ps-bcp00-01a.hosting.local.IPC:/tmp/vnet-03680330686721129002000000001-ZZaaab;23c2add4c63b52ede5b6a6a25528a0f9;10;300
15:12:06 INF - Encrypt = 0
15:12:06 INF - Use shared memory = 0
15:12:06 INF - Compression = 0
15:12:07 INF - Encrypt = 0
15:12:07 INF - Keep logs = 28
15:12:07 INF - Client read timeout = 1800
15:12:07 INF - Media mount timeout = 0
 
 
LutzHeinrich wrote:

Did you configure your application start windows for 24/7, so the script can be executed at any time?

Alex_al wrote:

Yes, both my policies are configured 24/7.

gilbert08 wrote:

Check the permissions used to initiate the backup.

Alex_al wrote:

Which permissions should I check?

On the client side I run the script as a user with DBA rights.

gilbert08 wrote:

It would be better if you could give us a copy of the script, thanks.

Alex_al wrote:

 

 
My test backup script:
RUN {
 # Control file backup
 ALLOCATE CHANNEL ch00
     TYPE 'SBT_TAPE';
 SEND 'NB_ORA_CLIENT=vs-ora00-03a.hosting.local,NB_ORA_SID=dev,NB_ORA_POLICY=test,NB_ORA_SERV=ps-bcp00-01a.hosting.local,NB_ORA_SCHED=Default-Application-Backup';
 BACKUP
     FORMAT 'ctrl_u%u_s%s_p%p_t%t'
     CURRENT CONTROLFILE;
 RELEASE CHANNEL ch00;
 }

Alex_al wrote:

Today we upgraded to version 7.5 and now everything works fine!

Thank you all for advice!

SOLUTION