Media Freezing

Created: 26 May 2013 • Updated: 24 Jun 2013 | 10 comments
This issue has been solved. See solution.

Hi Guys,

Recently I encountered a problem while backing up data.

The following error was observed in the detailed status of the job:

 

Error bptm (pid:3876) FREEZING media id <media id>, External event caused rewind during write, all data on the media is lost

 

Could someone please explain the reason for the above error? Is all the previous data on the media lost, and if so, how can I recover it?

 

thanks


huanglao2002's picture

External event caused rewind during write?

Can you list all the servers that have access to the tape library?

If any server is zoned to the tape library but does not have the backup software installed, please remove that server from the zone.

This issue is (potentially) serious and requires immediate investigation, as data can be lost. NetBackup will display this error if the block position calculation check by NetBackup does not match the position reported by the drive. It will not be certain that a full rewind has occurred (impossible to tell from a simple blockcheck), but it will mean that the position check has failed, and most likely that the calculated position is less than the expected position.

 

For detailed information, please refer to http://www.symantec.com/docs/TECH169477

 

 

sriharishkandula's picture

That was the error I received in the detailed status of the job.

Could you please let me know how to check which servers have access to the tape library, and how to check whether a server is zoned to the tape library?

Nicolai's picture

This may be a SAN zoning issue. When zoning is performed in the SAN, the HBA should be zoned to each drive using a separate zone. Do not put the HBA and all the tape drives in the same zone.

If the HBA and the drives are all in one zone, SCSI bus resets, drive resets, etc. will propagate to all the other drives. If you deploy a one-HBA-to-one-drive zoning strategy, such errors will be contained within the zone.
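
The one-HBA-to-one-drive rule above can be checked mechanically once you have a zone listing. This is only a rough sketch in Python: the zone names, member names, and the dictionary layout are hypothetical, and the real zoning data would have to come from your own switch tools.

```python
def zones_sharing_drives(zones):
    """Return the names of zones that hold an HBA together with more than one tape drive.

    `zones` maps a zone name to a list of (member_name, kind) tuples,
    where kind is "hba" or "drive". This layout is an assumption made
    for illustration; it is not a format produced by any switch.
    """
    flagged = []
    for name, members in zones.items():
        hbas = sum(1 for _, kind in members if kind == "hba")
        drives = sum(1 for _, kind in members if kind == "drive")
        # A reset issued in this zone could reach every drive listed in it.
        if hbas >= 1 and drives > 1:
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    # Hypothetical example data.
    example = {
        "zone_all_drives": [("hba1", "hba"), ("drive1", "drive"), ("drive2", "drive")],
        "zone_hba1_drive3": [("hba1", "hba"), ("drive3", "drive")],
    }
    print(zones_sharing_drives(example))
```

Any zone the function reports is a candidate for splitting into one zone per drive.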

I would shelve the media, wait for the data to expire, and then unfreeze the tape to put it back into rotation.

Assumption is the mother of all mess ups.

If this post answered your question, please mark it as a solution.

Ankit Maheshwari's picture

sriharishkandula, please confirm:

Is this an NDMP backup or not?

NetBackup version?

OS?

Is this a VTL?

This is a serious problem. It is often caused by a hardware or firmware issue.

1. Ensure that the proper tape drive drivers are installed and are the latest version.

2. If the tape drives are SAN-connected, ensure that the Storport (HBA) drivers are updated to the latest version.

External event caused rewind: in the worst case, the media was rewound mid-backup and the header was overwritten, which means data loss. This can be caused by multiple issues, all outside NBU. A common cause is media server drives being shared with an NDMP filer while the SCSI reservation type is set differently on the filer than in NetBackup. If there is no NDMP filer, other possible causes include an HBA fault, a firmware issue, or a SAN issue.

In the best case there is no data loss and there has been no rewind, but there has been a positioning error, so the backup probably failed.

Ankit Maheshwari

mph999's picture

If you look in the bptm log you will see lines like this:

00:00:21.366 [10057] <2> io_terminate_tape: block position check: actual 1304, expected 1304
 
NBU knows how many blocks of data should have been written to the tape. It asks the drive to report its position, and the two should match. If they do not, it gives the error you see, which can be misleading.
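
As a rough sketch, a few lines of Python can scan a bptm log for position checks where the two numbers differ. It assumes every check line uses the same wording as the sample line above. The function name and the second sample line are made up for illustration and are not taken from a real log.

```python
import re

# Matches the "block position check" wording shown in the sample log line.
POSITION_CHECK = re.compile(
    r"block position check: actual (?P<actual>\d+), expected (?P<expected>\d+)"
)


def find_position_mismatches(lines):
    """Return (line, actual, expected) for every check where the two values differ."""
    mismatches = []
    for line in lines:
        match = POSITION_CHECK.search(line)
        if match is None:
            continue  # not a position-check line
        actual = int(match.group("actual"))
        expected = int(match.group("expected"))
        if actual != expected:
            mismatches.append((line.rstrip("\n"), actual, expected))
    return mismatches


if __name__ == "__main__":
    sample = [
        "00:00:21.366 [10057] <2> io_terminate_tape: block position check: actual 1304, expected 1304",
        # Hypothetical mismatching line, invented for this example only.
        "00:05:10.002 [10057] <2> io_terminate_tape: block position check: actual 12, expected 2210",
    ]
    for line, actual, expected in find_position_mismatches(sample):
        print(f"mismatch: actual={actual} expected={expected}")
```

To run it against a real log, pass the open file object instead of the sample list, since the function only iterates over lines.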
 
If there really was a SCSI rewind, then something sent it over the SAN, and the data will be lost: the drive rewound during the backup (invisible to NBU and the operating system) and then continued writing from the beginning of the tape, overwriting the tape header. This is easy to check:
 
bpmedialist -m <media id> -mcontents
 
If this mounts the tape, there was no SCSI rewind.

If there was a SCSI rewind, there are two likely causes:

1. Something sent a SCSI rewind (unfortunately, difficult to find).
2. If NDMP devices share the drives with media servers and the SCSI reservation type set in NBU differs from the one set on the NDMP devices, this issue can occur.

More likely (in my experience), there is just a position failure. In that case, the cause will be either a tape driver issue or a drive firmware issue.
 
Martin

 

Regards,  Martin
 
Setting Logs in NetBackup:
http://www.symantec.com/docs/TECH75805
 
sriharishkandula's picture

Hi guys, sorry for the late reply.

@Ankit: We are currently using NetBackup version 7.1.4 on a 2008 Server.

We don't have any VTL, but we use a Quantum i-scalar 600 library.

So Ankit, you mean that installing the latest drive firmware and the latest HBA port firmware can resolve the issue, right?

Also, is there any way I can retrieve the lost data from the media that encountered this problem?

sriharishkandula's picture

Yes, the important part is that they were NDMP backups.

Marianne's picture

Please have another look at Martin's post (seems you totally ignored it??):

https://www-secure.symantec.com/connect/forums/media-freezing#comment-8792431

Your question:
... is there any way I can retrieve the lost data from the media that encountered this problem?

Martin:
This is easy to check, 
bpmedialist -m <media id> -mcontents

You:
.... they were NDMP backups

Martin:
2. If NDMP devices share the drives with media servers and the SCSI reservation type set in NBU differs from the one set on the NDMP devices, this issue can occur.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

sriharishkandula's picture

Sorry for the late reply, guys. I have been busy with a few audit checks.

Thanks for your valuable support. The issue is now resolved.

The issue was mainly due to a couple of faulty tape drives, which were replaced. We also upgraded the library and drive firmware to the latest version.

Everything is fine as of now. I hope this continues in the future.

SOLUTION
Ankit Maheshwari's picture

Sriharishkandula, please mark the post as the solution.

 

 

Ankit Maheshwari