Backup failing with error: file write failed (14)

Created: 11 Feb 2013 | 31 comments

Hi ,

Recent backups are failing on 3-4 VMs with the error "file write failed (14)". Please let me know what can be done to fix this.

2/12/2013 12:32:29 AM - connected; connect time: 00:00:02
2/12/2013 12:44:57 AM - Critical bpbrm(pid=1760) from client LEXPYD: FTL - tar file write error (10054)   
2/12/2013 12:45:01 AM - end writing; write time: 00:12:34
file write failed(14)
 
2/12/2013 12:47:37 AM - connected; connect time: 00:00:01
2/12/2013 1:03:54 AM - Critical bpbrm(pid=6064) from client LEXTAIIII: FTL - tar file write error (10054)   
2/12/2013 1:03:55 AM - Error bpbrm(pid=6064) could not send server status message       
2/12/2013 1:03:57 AM - end writing; write time: 00:16:21
2/12/2013 1:04:16 AM - end Snapshot , Delete Snapshot; elapsed time: 00:00:18
Status 14
2/12/2013 1:04:16 AM - end Create Snapshot; elapsed time: 00:21:40
file write failed(14)
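
For anyone triaging several failing clients at once, job details like the ones above can be scanned with a short script. This is only a sketch in Python, assuming the lines look exactly like the ones pasted in this post; `failures_by_client` and the pattern are illustrative names, not part of NetBackup.

```python
import re
from collections import defaultdict

# Matches the "Critical bpbrm ... tar file write error (N)" lines as they
# appear in the job details quoted in this post; other layouts are ignored.
FAILURE = re.compile(
    r"^(?P<when>\S+ \S+ [AP]M) - Critical bpbrm\(pid=(?P<pid>\d+)\) "
    r"from client (?P<client>\S+): FTL - tar file write error \((?P<code>\d+)\)"
)


def failures_by_client(lines):
    """Group tar file write errors by client name."""
    found = defaultdict(list)
    for line in lines:
        match = FAILURE.match(line.strip())
        if match:
            found[match.group("client")].append(
                (match.group("when"), int(match.group("code")))
            )
    return dict(found)


job_details = [
    "2/12/2013 12:44:57 AM - Critical bpbrm(pid=1760) from client LEXPYD: FTL - tar file write error (10054)   ",
    "2/12/2013 12:45:01 AM - end writing; write time: 00:12:34",
    "2/12/2013 1:03:54 AM - Critical bpbrm(pid=6064) from client LEXTAIIII: FTL - tar file write error (10054)   ",
]

for client, events in failures_by_client(job_details).items():
    print(client, events)
```

Grouping by client makes it easier to see whether the same error code shows up on every failing machine.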

RamNagalla's picture

1) Did these VMs ever back up successfully?

2) What transport method are you using?

3) What are the versions of NetBackup and vSphere/ESX?

ramkr2020's picture

Yes, those VMs were working fine for years; the problem started last week. ESX v3.

amol amodkar's picture

Hi,

tar file write error 10054 points to a network issue: 10054 is the Winsock "connection reset by peer" code, so the network connection has been broken or there is no communication.

Check ping, DNS, reverse lookup, and firewalls.
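
The DNS, reverse lookup, and port checks can be scripted. Below is a minimal sketch in Python using only the standard `socket` module; `check_client` is a hypothetical helper name, and the port to test is an assumption you need to supply for your own environment.

```python
import socket


def check_client(hostname, port, timeout=5.0):
    """Return forward lookup, reverse lookup, and TCP reachability for one host."""
    result = {"address": None, "reverse_name": None, "tcp_open": False}

    try:
        result["address"] = socket.gethostbyname(hostname)
    except OSError:
        return result  # forward lookup failed, nothing else to test

    try:
        result["reverse_name"] = socket.gethostbyaddr(result["address"])[0]
    except OSError:
        pass  # no reverse record, or the resolver returned an error

    try:
        with socket.create_connection((result["address"], port), timeout=timeout):
            result["tcp_open"] = True
    except OSError:
        pass  # refused, filtered by a firewall, or timed out

    return result
```

Running something like `check_client("LEXPYD", 13782)` from the media server would test one of the failing clients, assuming 13782 (the usual bpcd port) is what your clients listen on. A `reverse_name` that does not match the name NetBackup uses for the client is worth a closer look.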

RamNagalla's picture

What is the NetBackup version?

What has changed since last week?

Any upgrades?

I just want to confirm: are you using VMware backup or agent-based backup?

watsons's picture

1) If it's Windows, check if UAC is turned off.

2) If you monitor via a GUI, try to "run as administrator".

3) Check if there is any disk space issue on the media server.
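
A quick way to check free space from a script is sketched below in Python. The 10% threshold and the name `free_space_report` are assumptions for illustration only; pick whatever limit makes sense for your media server.

```python
import shutil


def free_space_report(path, min_free_fraction=0.10):
    """Report free space for the volume that holds `path`."""
    usage = shutil.disk_usage(path)
    # Guard against a zero-sized volume so the division cannot fail.
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return {
        "path": path,
        "free_gib": round(usage.free / 2**30, 1),
        "free_fraction": free_fraction,
        "low": free_fraction < min_free_fraction,  # threshold is an assumption
    }


print(free_space_report("."))
```

On a Windows media server you would pass a drive root such as `"C:\\"` instead of `"."`.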

ramkr2020's picture

I am getting the error shown in the attached file. "file write failed (14)" is the status of the failed job. There are no network issues, since other servers and other drives on these servers are backing up fine.

backup.PNG
RamNagalla's picture

Are the other, successful servers using the same media server and storage unit?

If you are using the VMware backup method, is it SAN transport or NBD?

ramkr2020's picture

Yes, the other VMs are using the same media server and storage unit and have been running successfully all this time.

Only all local drives are being backed up, not the whole VM.

Yasuhisa Ishikawa's picture

Can you tell us more about your backup, such as the ESX version (including minor number) and patch level, the NetBackup version and release update, the backup type (VCB, VADP, or any other), the details of your backup policy, etc.?

Authorized Symantec Consultant(ASC) Data Protection in Tokyo, Japan

ramkr2020's picture

ESX version: ESX 3.5.0, 153875

NetBackup: version 6.0

Backup type: only all local drives are being backed up, not the whole VM. Sorry, I am not sure how to find out whether the backup type is VCB or VADP. How can I check it?

SachinSharma's picture

What do you mean by other drives backing up fine?

Do you mean that other drives of the failing client are backing up fine and only one of the drives is failing?

1) Did you try logging in to the server?

2) If yes, check the disk space on the failing drive of that server; it might be full.

Sachin Sharma

ramkr2020's picture

Hi All,

I am still receiving the file write failed (status 14) error and the backups are failing. Any help would be appreciated.

RamNagalla's picture

"sorry not sure how to find what backup type VCB or VADP, how can i check it."

What is the policy type for the failing client?

Is it Windows NT?

If it is Windows NT, you are using the NetBackup agent-based backup, where the client software is installed on the client and performs the backup.

If the policy type is FlashBackup or VMware or something like that, you are probably using the VCB or VADP backup method.

I am assuming your policy type is Windows NT (please confirm).

What is the NetBackup version on the master, media server, and client?

What is the OS version of the master, media server, and client?

Which tape drives are you using?

Are these failures limited to one client, or do they affect most of the clients that use the same media server and storage unit?

Also, try increasing the read timeout on the media server.

Please provide the bpbrm and bptm logs from the media server and the bpbkar and bpcd logs from the client.

Yasuhisa Ishikawa's picture

To confirm your backup configuration, please post output of "bppllist your_policy -L" command.

Mark_Solutions's picture

A few things here as you clearly have issues ....

1. I see status 200 errors, so you have something wrong with your configuration.

2. V6 is End Of Life - you really need to upgrade as soon as possible, both for support and to eliminate bugs.

3. Status 14 (10054) is a network error and quite possibly a timeout issue. Check the troubleshooting section of the VMware Admin Guide to see how to vary the VMware timeouts, but that all depends on what policy type you are using to back them up. It may be that the data has grown or the ESX servers are busier these days and that is causing the timeouts. It is also best to restrict your jobs to no more than 6 per datastore at any one time to help reduce this type of error.
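
To illustrate the "no more than 6 per datastore" rule, here is a small planning sketch in Python. It does not call NetBackup at all; `plan_waves` and the VM and datastore names are hypothetical, and it only shows how a list of VMs could be split into waves that respect the cap.

```python
from collections import defaultdict


def plan_waves(vm_to_datastore, max_per_datastore=6):
    """Split VMs into waves so no wave exceeds the cap on any datastore."""
    by_store = defaultdict(list)
    for vm, store in vm_to_datastore.items():
        by_store[store].append(vm)

    waves = []
    for vms in by_store.values():
        # Chunk each datastore's VMs; chunk k goes into wave k.
        for start in range(0, len(vms), max_per_datastore):
            index = start // max_per_datastore
            if index == len(waves):
                waves.append([])
            waves[index].extend(vms[start:start + max_per_datastore])
    return waves


# Hypothetical inventory: 8 VMs on one datastore, 3 on another.
inventory = {f"vm{i}": "datastore1" for i in range(8)}
inventory.update({f"db{i}": "datastore2" for i in range(3)})

for number, wave in enumerate(plan_waves(inventory), start=1):
    print(number, wave)
```

With this inventory the first wave holds six VMs from datastore1 plus the three from datastore2, and the remaining two datastore1 VMs run in the second wave.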

Let us know the policy type (output) so that we can advise further

Authorised Symantec Consultant

Don't forget to "Mark as Solution" if someones advice has solved your issue - and please bring back the Thumbs Up!!.

ramkr2020's picture
Please find attached the bptm, bpbrm, and bppllist logs for one policy. Other policies are getting the same error 14.

I am assuming your policy type is Windows NT (confirm it): Yes

What is the NetBackup version on master/media and client? V6

What is the OS version of master, media, and client? Windows Server 2003 R2

What are the tape drives you are using? Overland

Are these failures limited to one client or most of the clients that use the same media server and storage unit? Most of the clients.

AttachmentSize
bppllist.txt 2.73 KB
bpbrm.zip 231.74 KB
RamNagalla's picture

Did you try increasing the timeout values as I suggested above?

Is your media server also a VM, or is it a physical server?

Try sending just one job to the media server and see how it goes.

ramkr2020's picture

The media server is physical. I have tried sending one job but it still fails. The client connect timeout is 600s and it still fails. Please analyse the logs and let me know if there is a fix.

Yasuhisa Ishikawa's picture

Given your NetBackup version, this backup is neither VADP nor VCB.

Some of the child bptm processes suddenly disappeared without recording the 'wait and delay counter'. Are there any relevant messages in the event log?

Can you set the logging level to 5 and collect the bpbkar (client), bpbrm (server), and bptm (server) logs again?

ramkr2020's picture

I can see the following event id in event viewer.. 

Event ID 5122 - DeviceIoControl() error on bus 0, target 2, lun 0: The device is not connected.

Event ID 4237 - cannot get serial number for HP.ULTRIUM4-SCSI.000 (device 0, SCSI coordinates {2,0,1,0}, \\.\Tape0)

Event ID 4228 - Unable to retrieve drive's cleaning media type. status = 5 

Event ID 5075 - TLD(1) going to DOWN state, status: Unable to sense robotic device

Mark_Solutions's picture

Looks like your tape drive has issues then - it tends to indicate that it needs cleaning and has gone down

When the drive goes down in the middle of the backup then it can no longer write the backup - hence the tar file write error and the disconnect via bptm.

I would suggest a new cleaning tape is put in and run through your drives 2 or 3 times and then see if that resolves your issues

ramkr2020's picture

I was able to clean drive 1 without any issues, but while trying to clean drive 2 I get the error below.

"The following error occured while attempting to clean drive 001 on server, request terminated because media is unavailable (in down drive, misplaced, write protected or unmountable) (291)..."

I used the same cleaning media on the other drive and it was successful. I have attached screenshots of the job status.

I have noticed the event error below too.

Event ID: 11 - The driver detected a controller error on \Device\Scsi\adpu160m1. Source: adpu160m

backup.PNG
ramkr2020's picture

I tried the new tapes and cleaned drives 1 and 2 two to three times, but I am still getting the error file write failed (14).

2/25/2013 5:41:26 PM - Critical bpbrm(pid=4480) from client server: FTL - tar file write error (10054)   
2/25/2013 5:41:29 PM - end writing; write time: 02:21:01
file write failed(14)
 
Please advise. I need this to be fixed soon.
ramkr2020's picture

Does anyone know how to fix this? "The following error occured while attempting to clean drive 001 on server, request terminated because media is unavailable (in down drive, misplaced, write protected or unmountable) (291)..."

I get the above error while trying to clean the drives.

William Jansen van Nieuwenhuizen's picture

Hi Ramkr2020

You should really start another forum post for a different issue. But I'll try and help you quick:

  • Check that there is a cleaning cartridge in the robot.
  • Check that the drives are up in the devices part of the GUI.
  • Do you clean manually from the front of the robot, or do you rely on NetBackup to do it? If you do it manually, you should set NetBackup to IGNORE TAPE ALERTS. There should be a technote for it on the internet. (http://www.symantec.com/business/support/index?pag...)
ramkr2020's picture

Hi ,

Got the tape cleaning fixed after replacing the cleaning tape with a new one. Thanks.

ramkr2020's picture

I noticed 10054 errors in the bpbkar logs of the media and master server.

emp'

10:21:53.207 AM: [7088.5368] <16> dtcp_write: TCP - failure: send socket (1888) (TCP 10054: Connection reset by peer)
10:21:53.207 AM: [7088.5368] <16> dtcp_write: TCP - failure: attempted to send 
 
I have attached the bpbkar, bpbrm, and bprd logs. Please analyze them and help in fixing this.
 
 
AttachmentSize
bpbrm.zip 152.55 KB
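
For anyone digging through large bpbkar logs, lines like the dtcp_write ones quoted above can be filtered with a short script. This is a rough sketch in Python that assumes every line follows the same layout as the two lines in this post (time, `[pid.tid]`, `<severity>`, function name, message); `tcp_failures` is an illustrative name. 10054 is the Winsock WSAECONNRESET code, which matches the "Connection reset by peer" text in the log.

```python
import re

# Layout assumed from the two log lines quoted in this post.
LINE = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2}\.\d{3} [AP]M): "
    r"\[(?P<pid>\d+)\.(?P<tid>\d+)\] <(?P<sev>\d+)> "
    r"(?P<func>\w+): (?P<msg>.*)$"
)
TCP_CODE = re.compile(r"\(TCP (\d+): ([^)]*)\)")


def tcp_failures(lines):
    """Return one record per log line that carries a '(TCP nnnn: text)' code."""
    records = []
    for line in lines:
        match = LINE.match(line.rstrip())
        if not match:
            continue
        code = TCP_CODE.search(match.group("msg"))
        if code:
            records.append({
                "time": match.group("time"),
                "pid": int(match.group("pid")),
                "function": match.group("func"),
                "code": int(code.group(1)),
                "text": code.group(2),
            })
    return records


sample = [
    "10:21:53.207 AM: [7088.5368] <16> dtcp_write: TCP - failure: send socket (1888) (TCP 10054: Connection reset by peer)",
    "10:21:53.207 AM: [7088.5368] <16> dtcp_write: TCP - failure: attempted to send ",
]

for record in tcp_failures(sample):
    print(record)
```

Comparing the timestamps of these records with the Windows event log entries on the media server can show whether the resets line up with drive or controller errors.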
William Jansen van Nieuwenhuizen's picture

Hi Ramkr2020

Are you still facing the same issue, with trying to restore VM clients/guests? If not, you should open a new thread and mark a solution in this thread.

If you are still facing the same issue, what is in the job log details? If you are having issues with VM clients, please post the bpvmutil log.

Thanks

William

ramkr2020's picture

Yes, I am facing the same problem that this thread started with. Please help.

William Jansen van Nieuwenhuizen's picture

Hi

No problem then. Can you provide the requested info?

If you are still facing the same issue, what is in the job log details? If you are having issues with VM clients, please post the bpvmutil log.

Mark_Solutions's picture

OK - can you tell me a couple of things please, based on the drives / robot reporting as not connected:

1. On the media server, do you have the Removable Storage Management service both stopped and disabled?

2. Do you have the AutoRun registry key set to a value of 0 for your tape drivers? See this tech note:

http://support.microsoft.com/kb/842411

If not, make sure you do both and reboot the media server, then see how it goes.
