backup job running but not writing data
Hi,
The master server runs Solaris; the client, which is also the media server, runs Windows 2003.
The backup job is running but not writing any data.
job log:
07/07/2010 11:24:54 - requesting resource gaal-hcart3-robot-tld-1
07/07/2010 11:24:54 - requesting resource asp.NBU_CLIENT.MAXJOBS.gaal
07/07/2010 11:24:54 - requesting resource asp.NBU_POLICY.MAXJOBS.gaal
07/07/2010 11:28:12 - granted resource asp.NBU_CLIENT.MAXJOBS.gaal
07/07/2010 11:28:12 - granted resource asp.NBU_POLICY.MAXJOBS.gaal
07/07/2010 11:28:12 - granted resource BM0095
07/07/2010 11:28:12 - granted resource IBM.ULT3580-TD3.003
07/07/2010 11:28:12 - granted resource gaal-hcart3-robot-tld-1
07/07/2010 11:28:12 - estimated 0 kbytes needed
07/07/2010 11:28:12 - begin Parent Job
07/07/2010 11:28:12 - begin Snapshot: Start Notify Script
07/07/2010 11:28:13 - started process RUNCMD (pid=29279)
Operation Status: 0
07/07/2010 11:28:13 - end Snapshot: Start Notify Script; elapsed time 0:00:01
07/07/2010 11:28:13 - begin Snapshot: Step By Condition
Operation Status: 0
07/07/2010 11:28:13 - end Snapshot: Step By Condition; elapsed time 0:00:00
07/07/2010 11:28:13 - begin Snapshot: Stream Discovery
Operation Status: 0
07/07/2010 11:28:13 - end Snapshot: Stream Discovery; elapsed time 0:00:00
07/07/2010 11:28:13 - begin Snapshot: Read File List
Operation Status: 0
07/07/2010 11:28:13 - end Snapshot: Read File List; elapsed time 0:00:00
07/07/2010 11:28:13 - begin Snapshot: Create Snapshot
07/07/2010 11:28:20 - begin Create Snapshot
07/07/2010 11:28:18 - started process bpbrm (pid=4656)
please help
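For anyone reading a job log like this, a small script can show how long the job waited for resources and which event carries the latest timestamp. This is only a sketch: the function names are made up for illustration, the excerpt is a subset of the log above, and the date format is assumed to be MM/DD/YYYY (the log itself cannot confirm that).

```python
from datetime import datetime

# Excerpt of the job log posted above.
JOB_LOG = """\
07/07/2010 11:24:54 - requesting resource gaal-hcart3-robot-tld-1
07/07/2010 11:28:12 - granted resource gaal-hcart3-robot-tld-1
Operation Status: 0
07/07/2010 11:28:20 - begin Create Snapshot
07/07/2010 11:28:18 - started process bpbrm (pid=4656)
"""


def parse_job_log(text):
    """Return (timestamp, message) pairs; lines without a timestamp are skipped."""
    events = []
    for line in text.splitlines():
        stamp, sep, message = line.partition(" - ")
        if not sep:
            continue
        try:
            # Assumption: MM/DD/YYYY, not DD/MM/YYYY.
            when = datetime.strptime(stamp.strip(), "%m/%d/%Y %H:%M:%S")
        except ValueError:
            continue
        events.append((when, message.strip()))
    return events


def resource_wait_seconds(events):
    """Seconds between the first resource request and the last grant."""
    requested = [t for t, m in events if m.startswith("requesting resource")]
    granted = [t for t, m in events if m.startswith("granted resource")]
    if not requested or not granted:
        return None
    return (max(granted) - min(requested)).total_seconds()


def last_event(events):
    """Event with the latest timestamp (log lines are not always in order)."""
    return max(events, key=lambda e: e[0]) if events else None


events = parse_job_log(JOB_LOG)
print(resource_wait_seconds(events))  # 198.0
print(last_event(events)[1])          # begin Create Snapshot
```

On this log, the job spent a little over three minutes waiting for resources and then logged nothing after the snapshot step began, which fits the description of a job that is running but not writing.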
Comments
That's only 4 minutes so far!
How long have you left it?
It could still be building its list of files to backup.
If it's a Windows client, have you got the Client Job Tracker running? It will give you a visual of it actually doing something (if it is!)
Regards Andy
"It's not too late to panic ..."
I left it for a long 3-4 hours, but still no data is being written to tape.
Tell me more about the Client Job Tracker.
Client Job Tracker
It's far from brilliant (indeed there are Tech Notes on how to shut it down!) but it can be started via Start>All Programs>Veritas NetBackup>NetBackup Client Job Tracker.
DOCUMENTATION: How to disable the Client Job Tracker on a NetBackup client
http://seer.entsupport.symantec.com/docs/267253.htm
Regards Andy
"It's not too late to panic ..."
I restarted the media server's services, but it still didn't help.
Here's the screenshot of the Windows Client Job Tracker. You can locate it in the system tray (pic-1) or from the Start menu -> Program Files (pic-2).
Thanks
Kisad
Can you confirm your current environment?
NetBackup versions & patch levels for master/media/client, O/S's & release levels etc.
Type of backup you're trying to perform?
Has it ever worked before or is it a new set-up?
We can never get too much info but often get too little!
Regards Andy
"It's not too late to panic ..."
NBU master: Solaris, 6.5.4
NBU client + media server: Windows 2003, 6.5.4
It worked earlier but has been failing for the past week.
Solaris 6? Solaris 10?
Type of backup?
"Normal" MS_Windows-NT?
Backup selection?
Set to use VSS as snapshot provider?
Any errors on client Event Viewer?
Regards Andy
"It's not too late to panic ..."
try multistream
Enable multistreaming and see whether any stream backs up or writes data.
Thanks
Kisad
Check the client
Do you have some bpbkar32.exe processes running? Any running for a long time?
You may want to kill the job from NetBackup's point of view, disable the client service on the host, wait 15 minutes, kill off any bpbkar32 processes, and then restart the service and the job.
And as Andy pointed out, make absolutely sure you're using VSS and NOT VSP.
If this post helps you, please add a vote.
If this post answers your question, please mark it as a solution. Thanks!
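The check for leftover bpbkar32.exe processes suggested above can be sketched in Python. The idea is to run `tasklist /FO CSV /NH /FI "IMAGENAME eq bpbkar32.exe"` on the Windows client and feed its output to a small parser. The function name and the sample rows below are hypothetical (the PIDs are borrowed from the bpbkar log later in this thread), and the sketch does not tell you how long a process has been running.

```python
import csv
import io


def bpbkar_pids(tasklist_csv, image="bpbkar32.exe"):
    """Extract the PIDs of `image` from `tasklist /FO CSV /NH` output."""
    pids = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Rows look like: "Image Name","PID","Session Name","Session#","Mem Usage".
        # Anything else (for example the "INFO: No tasks..." line) is ignored.
        if len(row) >= 2 and row[0].lower() == image.lower() and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


# Hypothetical sample output, for illustration only.
SAMPLE = (
    '"bpbkar32.exe","9996","Console","0","41,232 K"\n'
    '"bpbkar32.exe","14152","Console","0","39,880 K"\n'
    '"bpinetd.exe","2204","Console","0","6,104 K"\n'
)

print(bpbkar_pids(SAMPLE))  # [9996, 14152]
```

If the list is non-empty while no backup job is active for the client, those are the candidates to kill off before restarting the service and the job.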
Good point Ed.
Have come across a few instances in the past where there've been a lot of bpbkar processes just 'hanging around' from previous failed attempts that were subsequently preventing any new jobs from continuing.
Regards Andy
"It's not too late to panic ..."
Guys, I found this error in the bpbkar log:
7:40:44.421 PM: [9996.13384] <16> dtcp_write: TCP - failure: send socket (840) (TCP 10054: Connection reset by peer)
7:40:44.421 PM: [9996.13384] <16> dtcp_write: TCP - failure: attempted to send 1 bytes
7:40:44.483 PM: [9996.13384] <16> tar_base::keepaliveThread: INF - keepalive thread abnormal exit :14
7:41:00.765 PM: [14152.9048] <16> dtcp_write: TCP - failure: send socket (840) (TCP 10054: Connection reset by peer)
7:41:00.765 PM: [14152.9048] <16> dtcp_write: TCP - failure: attempted to send 1 bytes
7:41:00.780 PM: [14152.9048] <16> tar_base::keepaliveThread: INF - keepalive thread abnormal exit :14
7:41:40.797 PM: [13660.4308] <16> dtcp_write: TCP - failure: send socket (840) (TCP 10054: Connection reset by peer)
7:41:40.797 PM: [13660.4308] <16> dtcp_write: TCP - failure: attempted to send 1 bytes
7:41:40.797 PM: [13660.4308] <16> tar_base::keepaliveThread: INF - keepalive thread abnormal exit :14
Please suggest what to do next.
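To see whether these resets come from one stuck process or from several, the log lines can be grouped by process ID. The sketch below assumes the bracketed field is `[pid.tid]` and that every line follows the layout shown above; the regular expression and function name are illustrative, not part of NetBackup. Winsock error 10054 is WSAECONNRESET, meaning the remote side closed the connection.

```python
import re
from collections import Counter

# Assumed line layout: "<time> AM|PM: [pid.tid] <severity> <function>: <message>"
LINE = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2}\.\d{3} [AP]M): "
    r"\[(?P<pid>\d+)\.(?P<tid>\d+)\] <(?P<sev>\d+)> "
    r"(?P<func>\S+?): (?P<msg>.*)$"
)

# Lines taken from the log posted above.
SAMPLE = """\
7:40:44.421 PM: [9996.13384] <16> dtcp_write: TCP - failure: send socket (840) (TCP 10054: Connection reset by peer)
7:40:44.483 PM: [9996.13384] <16> tar_base::keepaliveThread: INF - keepalive thread abnormal exit :14
7:41:00.765 PM: [14152.9048] <16> dtcp_write: TCP - failure: send socket (840) (TCP 10054: Connection reset by peer)
7:41:40.797 PM: [13660.4308] <16> dtcp_write: TCP - failure: send socket (840) (TCP 10054: Connection reset by peer)
"""


def resets_by_pid(log_text):
    """Count 'TCP 10054' (connection reset) lines per process ID."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match and "TCP 10054" in match.group("msg"):
            counts[int(match.group("pid"))] += 1
    return dict(counts)


print(resets_by_pid(SAMPLE))  # {9996: 1, 14152: 1, 13660: 1}
```

Here three different processes each hit the reset within about a minute, so the problem is probably not confined to a single hung bpbkar32 process.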
try increasing the timeout settings
I had a situation where one of the drives in a Windows box was failing with error 41. It had the exact error message you have listed above. I multistreamed the backup and increased the client read timeout of the client as suggested by Symantec tech support (see below). It worked for me.
Please note that multistreaming the job will help in isolating the issue; not all of your drives are necessarily giving the same error.
The first item to troubleshoot is why this client is only failing on this drive. Could it be due to disk fragmentation on this drive? Are there millions of small files? A possible cause can be file corruption on the drive itself. Therefore, let's start by extending the timeout to see if the job progresses. The NetBackup client may not be able to send keepalives to the media server if the read of client data is taking a long time to complete one buffer of data. When this occurs, the media server may time out (resulting in a status code 13, 40 or even 41) after the time specified in the media server's Host Properties > Media Server and/or Clients > Timeouts > CLIENT_READ_TIMEOUT configuration (300 seconds by default).
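The timeout reasoning above can be illustrated with a toy model. This is not how NetBackup implements it; it is a simplified sketch in which the media server gives up as soon as the gap between two arrivals from the client exceeds the timeout. The function name and the arrival times are made up for illustration.

```python
def first_timeout(arrival_times, client_read_timeout=300):
    """Return the index of the first arrival that comes too late, else None.

    arrival_times: seconds since job start at which data (or a keepalive)
    reached the media server, in increasing order.
    """
    previous = 0
    for index, arrived in enumerate(arrival_times):
        if arrived - previous > client_read_timeout:
            return index
        previous = arrived
    return None


# Hypothetical job: the third buffer takes 450 seconds to read from a slow drive.
arrivals = [10, 200, 650]
print(first_timeout(arrivals))                           # 2
print(first_timeout(arrivals, client_read_timeout=900))  # None
```

With the default of 300 seconds the slow read would end the job, while a larger value lets it continue, which is why extending the timeout is a reasonable first test.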
Thanks
Kisad
NIC card driver
A couple of things I can suggest:
1. Try backing up just a folder (C:\Temp or something) instead of ALL_LOCAL_DRIVES
OR
2. Update or re-install the NIC driver on the Windows client.
Any resolution to this?
Regards Andy
"It's not too late to panic ..."