Video Screencast Help

Netbackup restoration error 2826

Created: 17 Oct 2013 • Updated: 17 Oct 2013 | 23 comments

Hi Guys,

Here's my netbackup environment

Master server
pxbkpifgr01 - AIX 6.1 Netbackup version 7.1.0.2

Client
gesrv 2 - Windows 2003 32bit Netbackup version 6.5.3

---

I'm having restoration problem. Please see the below restoration log:

10/04/2013 10:03:58 - begin Restore
10/04/2013 10:04:43 - number of images required: 1
10/04/2013 10:04:43 - media needed: A00101
10/04/2013 10:04:49 - restoring from image gesrv2_1374836418
10/04/2013 10:04:50 - Info bpbrm (pid=5898492) gesrv2 is the host to restore to
10/04/2013 10:04:50 - Info bpbrm (pid=5898492) telling media manager to start restore on client
10/04/2013 10:04:51 - Info bpbrm (pid=5898492) spawning a brm child process
10/04/2013 10:04:51 - Info bpbrm (pid=5898492) child pid: 21233754
10/04/2013 10:04:51 - connecting
10/04/2013 10:04:52 - requesting resource A00101
10/04/2013 10:04:54 - Info bpbrm (pid=21233754) start tar on client
10/04/2013 10:04:56 - Info tar (pid=0) Restore started.
10/04/2013 10:04:56 - connected; connect time: 0:00:00
10/04/2013 10:05:04 - awaiting resource A00101. Waiting for resources.
          Reason: Drives are in use, Media server: pxbkpifgr01,
          Robot Type(Number): TLD(0), Media ID: N/A, Drive Name: N/A,
          Volume Pool: N/A, Storage Unit: N/A, Drive Scan Host: N/A,
          Disk Pool: N/A, Disk Volume: N/A
10/04/2013 10:05:42 - Waiting for scan drive stop Drive002, Media server: pxbkpifgr01
10/04/2013 10:05:43 - Info bptm (pid=14352616) Waiting for mount of media id A00101 (copy 2) on server pxbkpifgr01v.
10/04/2013 10:05:43 - started process bptm (pid=14352616)
10/04/2013 10:05:43 - mounting A00101
10/04/2013 10:05:43 - granted resource  A00101
10/04/2013 10:05:43 - granted resource  Drive002
10/04/2013 10:05:44 - Info bptm (pid=14352616) INF - Waiting for mount of media id A00101 on server pxbkpifgr01v for reading.
10/04/2013 10:06:17 - mounted A00101; mount time: 0:00:34
10/04/2013 10:06:17 - Info bptm (pid=14352616) A00101
10/04/2013 10:06:18 - Info bptm (pid=14352616) INF - Waiting for positioning of media id A00101 on server pxbkpifgr01v for reading.
10/04/2013 10:06:18 - positioning A00101 to file 6
10/04/2013 10:08:07 - positioned A00101; position time: 0:01:49
10/04/2013 10:08:07 - begin reading
10/04/2013 10:08:08 - Info bpbrm (pid=5898492) sending media manager msg: CONTINUE RESTORE
10/04/2013 10:08:44 - Info bpbrm (pid=5898492) sending media manager msg: CONTINUE RESTORE
10/04/2013 10:08:45 - Error bptm (pid=23199928) invalid tar header encountered, attempting to continue
10/04/2013 10:08:45 - Critical bptm (pid=23199928) expected file /F/Operations Division/Call Centre/, found file /F/Operations Division/Call Centre/CS Manual Epartner/Conversion Privileges.doc
10/04/2013 10:08:46 - Error bptm (pid=23199928) The following files/folders were not restored:
10/04/2013 10:08:46 - Error bptm (pid=23199928) UTF - /F/xxx/
10/04/2013 10:08:46 - Error bptm (pid=23199928) UTF - /F/xxx/xxx/xxx/
10/04/2013 10:08:47 - Error bptm (pid=23199928) UTF - /F/xxx/xxx/xxx/Minutes of Meeting/
10/04/2013 10:08:47 - Error bptm (pid=23199928) UTF - /F/xxx/xxx/xxx/Minutes of Meeting/2012/
10/04/2013 10:08:47 - Error bptm (pid=23199928) UTF - /F/xxx/xxx/xxx/xxxg/xxx/xxx.xls
10/04/2013 10:08:47 - Error bptm (pid=23199928) UTF - /F/xxx/xxx/xxx/xxx/2012/xxx.xls
10/04/2013 10:08:47 - Error bptm (pid=23199928) UTF - /F/xxx/xxx/xxx/xxx/2013/
10/04/2013 10:08:48 - Error bptm (pid=23199928) UTF - /F/xxx/xxx/xxx/xxx/2013/xxx.xls
10/04/2013 10:08:48 - Error bptm (pid=23199928) UTF - /F/xxx/xxx/xxx/xxx/2013/xxx.xls
10/04/2013 10:08:48 - Info bpbrm (pid=5898492) media manager for backup id gesrv2_1374836418 exited with status 174: media manager - system error occurred
10/04/2013 10:08:48 - Info bpbrm (pid=5898492) killing bpbrm child 21233754.
10/04/2013 10:08:48 - Info bpbrm (pid=5898492) sending media manager msg: TERMINATE
10/04/2013 10:08:48 - Error bpbrm (pid=21233754) bpcd on pxbkpifgr01 exited with status 21: socket open failed
10/04/2013 10:08:48 - restored from image gesrv2_1374836418; restore time: 0:03:59
10/04/2013 10:08:50 - Warning bprd (pid=26148988) Restore must be resumed prior to first image expiration on Fri Jul 24 19:00:18 2020
10/04/2013 10:08:51 - end Restore; elapsed time 0:04:53
Failed to get status code information (2826)

FYI already did image verification of the tape (A00101) and it completed without any errors

 

Appreciate your help on this

 

Thanks

 

 

 

 

Operating Systems:

Comments 23 CommentsJump to latest comment

Nagalla's picture

10/04/2013 10:08:45 - Error bptm (pid=23199928) invalid tar header encountered, attempting to continue
10/04/2013 10:08:45 - Critical bptm (pid=23199928) expected file /F/Operations Division/Call Centre/, found file /F/Operations Division/Call Centre/CS Manual Epartner/Conversion Privileges.doc

 

please run the verify (catalog verify-bpverify)on  A00101 tape and see  if it got successfull  or giving any errors.

azmie_ramly's picture

Hi Nagalla,

Already verify on the image itself inside the tape. No error encountered whatsoever

Marianne's picture

invalid tar header encountered,

Backup perhaps taken with encryption enabled?

If not, ensure these log folders exist:

On media server: bptm and bpbrm
Increase logging level for bptm to at least a 3 (Symantec support will want a 5). 

On client: tar

 

 

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

azmie_ramly's picture

I've already checked from the "client backup" from the console and the encryption is not enabled (is there another way to check for this?) If encryption were to be the issue what are we suppose to do in order to ensure restoration will be successful? 

Marianne's picture

I have not yet checked media server logs, but according to client's tar log, the restore was successful. It was the disconnect from media server that failed.

Have you checked the client to see if the file was indeed restored?

Any possibility that you can upgrade the client to the same NBU version as the master?
There is really no reason why client should be on this old, unsupported NBU version...

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

azmie_ramly's picture

Hi Marianne,

Have checked inside the client and there's no file being restored.

Will the upgrade of the client's netbackup version solve the issue? I will check with the person in charge of the server first to see if upgrade is possible.

azmie_ramly's picture

FYI I've also tried to restore to other client with the same netbackup agent version as the master server and still encountered the same error. So I'm guessing upgrading the current client will also produce same error result.

Marianne's picture

tar log clearly shows successful restore:

12:50:28.809 PM: [5484.5816] <2> tar_base::V_vTarMsgM: TAR - F:\Operations Division\
12:50:28.809 PM: [5484.5816] <4> tar_base::V_vTarMsgM: INF - tar message received from tar_restore::writeCheckpointInfo
12:50:28.809 PM: [5484.5816] <2> tar_base::V_vTarMsgM: CKP - 1098581.001 1381726228 0 0 0 1 0 U 23 /F/Operations Division/
12:50:28.809 PM: [5484.5816] <2> dtcp_write: TCP - success: send socket (1740), 68 of 68 bytes
12:50:28.887 PM: [5484.5816] <4> UnpackerTAR::getData(): INF - Complete data processed.
12:50:28.887 PM: [5484.5816] <2> tar_base::V_vTarMsgW: WRN - 'Server' service needs to be restarted for share to take effect: F:\Operations Division
12:50:28.887 PM: [5484.5816] <2> dtcp_write: TCP - success: send socket (1740), 94 of 94 bytes
12:50:28.903 PM: [5484.5816] <2> tar_base::backup_finish: TAR - restore:                     1 files
12:50:28.903 PM: [5484.5816] <2> tar_base::backup_finish: TAR - restore:          file data:       1172 bytes
12:50:28.903 PM: [5484.5816] <2> tar_base::backup_finish: TAR - restore:         image data:       6144 bytes
12:50:28.903 PM: [5484.5816] <2> tar_base::backup_finish: TAR - restore:       elapsed time:        168 secs           36 bps
12:50:28.903 PM: [5484.5816] <2> tar_base::V_vTarMsgW: INF - TAR EXITING WITH STATUS = 0
12:50:28.903 PM: [5484.5816] <2> dtcp_write: TCP - success: send socket (1740), 34 of 34 bytes
12:50:28.903 PM: [5484.5816] <2> tar_base::V_vTarMsgW: INF - TAR RESTORED 1 OF 1 FILES SUCCESSFULLY

 

The only error is the connection disconnect:

12:50:28.981 PM: [5484.5816] <16> dtcp_read: TCP - failure: recv socket (1748) (TCP 10058: Can't send after socket shutdown) 

I have not had a chance to look at media server logs.

If you should decide to log a call with Symantec Support, the client software will have to be on a supported version.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

MRM's picture

Please run a consistency check on the catalog (
http://www.symantec.com/business/support/index?pag...)

Upddate all clients to a supported version and retry the restore.

Marlon Rose-Mighty
426 894 2011

azmie_ramly's picture

Hi,

Will follow your advise to run a consistency check on the catalog first.

By the way, logged case to Symantec and they had claimed that my tape drive would be the problem. I am not sure from which ground they are based on but in the mean time I have also logged a case to the vendor which is IBM.

mph999's picture

I'm not sure NBCC is going to make any difference - in fact, it won't - a catalog incosistancy won't cause this error.

I also can't see how there is a tape drive error - can you give me the case number ?

Status 21 is socket related (can't open) -  cause could well be something on the OS.

 

From the details

 

10/04/2013 10:08:48 - Error bptm (pid=23199928) UTF - /F/xxx/xxx/xxx/xxx/2013/xxx.xls
10/04/2013 10:08:48 - Info bpbrm (pid=5898492) media manager for backup id gesrv2_1374836418 exited with status 174: media manager - system error occurred
10/04/2013 10:08:48 - Info bpbrm (pid=5898492) killing bpbrm child 21233754.

I don't find the PIDs in the bpbrm or bptm log.

bptm (pid=23199928) 

bpbrm (pid=5898492) 

Looking at the dates of the log file you posted up, that is 14th Oct, but the job ran on the 4th Oct - so you have the wrong log files.

Apart from bptm / bpbrm  logs, I would also recomend bprd from the master, from the date of the same day that the backup failed.

 

Regards,  Martin
 
Setting Logs in NetBackup:
http://www.symantec.com/docs/TECH75805
 
azmie_ramly's picture

Hi,

Case number is: 05236971

Actually that is only the log that I copied from the activity monitor console. I've made several restoration attempts after October 4th and all resulted to the same output as the ones in my first post

FYI this is not concerning backups. But restoration.

 

Thanks

 

Marianne's picture

I still find it strange that tar shows successful restore. See my post above:
https://www-secure.symantec.com/connect/forums/netbackup-restoration-error-2826#comment-9343061

Did Support not see that?

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

azmie_ramly's picture

This is what the support had replied:

==========================================

12:48:20.264 [30802112] <2> manage_drive_attributes: report_attr, fl1 0x00200049, fl2 0x0000000c
12:48:20.264 [30802112] <2> manage_drive_attributes: Matching medium manufacturer [FUJIFILM], sn [A96DC30PH4]
12:48:20.264 [30802112] <2> manage_drive_attributes: Host Attributes: Vendor [SYMANTEC], Name [NetBackup BPTM], Version [7.1.0.2]
12:48:20.264 [30802112] <2> manage_drive_attributes: DateLabeled [201307100358], Barcode [A00101L4], Owninghost [pxbkpifgr01v]
12:48:20.264 [30802112] <2> manage_drive_attributes: MediaPool [Adhoc_Pool], MediaLabel [MEDIA=A00101;]
12:48:20.264 [30802112] <2> manage_drive_attributes: Media is protected with APPEND_ONLY
12:48:20.278 [30802112] <2> io_open: SCSI RESERVE
12:48:20.278 [30802112] <2> io_open: file /usr/openv/netbackup/db/media/tpreq/drive_Drive051 successfully opened (mode 0)
12:48:20.279 [30802112] <2> io_read_media_header: drive index 48, reading media header, buflen = 65536, buff = 0x110c2f710, copy 2
12:48:20.279 [30802112] <2> io_ioctl: command (6)MTREW 1 0x0 from (bptm.c.8177) on drive index 48
12:48:20.292 [30802112] <2> io_ioctl: command (11)MTFSF 1 0x0 from (bptm.c.8429) on drive index 48
12:48:20.306 [30802112] <2> bptm media_id_to_monitor: job_id = 1098581
12:48:20.306 [30802112] <2> bptm media_id_to_monitor: pSrcMediaId = A00101
12:48:20.307 [30802112] <2> read_backup_mount_media: media id A00101 mounted on drive index 48

==========================================

Logs states that the media is protected with Append Only attribute. Due to this, i guess media is not allowing the netbackup to position the header on location from where that image starts. Thats why netbackup throws the error "invalid tar header encountered".

 

May you please check the same with your vendor to remove this attribute from media. Then might be able to perform the restore successfully.

 

Do let me know in case of any issues.

 

I've tried a different method of restoration whereby I restored file by file instead of the whole folders and directories and it actually worked and completed successfully! What I noticed from the activity monitor, is you have to make sure the file name itself appeared inside the file list being restored otherwise it will meet with same errors as the one I originally encountered. Pretty weird right? Is there any explanation on this?

 

 

Enrique Pereira Calvo's picture

It sounds strange for me. If you can verify the image you should be able to restore it.

Can you duplicate it to a disk stu? If it works, then you can try to make the disk image the primary one and try again from disk.

azmie_ramly's picture

Have not gone through that method yet. I will try it and will update to you

Marianne's picture

Please answer Martin's (mph999) questions below?

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

Balaji RT's picture

Can you post back the weird encounter u had, may be a snapshot

"What I noticed from the activity monitor, is you have to make sure the file name itself appeared inside the file list being restored otherwise it will meet with same errors as the one I originally encountered."

 

azmie_ramly's picture

Image as attached. I have to do that kind of approach to restore

restoration_ok.JPG restoration_ok2.JPG restoration_ok3.JPG restoration_ok4.JPG
mph999's picture

So let me confirm my uderstanding:

If you restore by selecting the directory -  it fails as you have explained.

As a matter of interest, is this any directory, top level directory etc ...

If you add the file names to a file <restore files> and restore by specifiying the <restore files> file - it works ?

Also, is this only affecting one tape, one client, one media server etc ... - I'm looking for a 'pattern' something 'common' to the issue.

If my understanding is correct this should really be escalated to invstigate further

As a matter of interest, instead of retoeing the folder from the GUI (I'm guessing you are) - what happens if you try from the command line, does it still fail.

 

 

Regards,  Martin
 
Setting Logs in NetBackup:
http://www.symantec.com/docs/TECH75805
 
azmie_ramly's picture

If you restore by selecting the directory -  it fails as you have explained.

YES

As a matter of interest, is this any directory, top level directory etc ...

ANY DIRECTORY

If you add the file names to a file <restore files> and restore by specifiying the <restore files> file - it works ? YES

Also, is this only affecting one tape, one client, one media server etc ... - I'm looking for a 'pattern' something 'common' to the issue.

I tried using a different image date for the same client and it worked fine. But I tried only 1 image though.

Ah45's picture

Eject the tape from tape library, then insert again and run robot inventory....with update option