Error 40 on my media server

Created: 08 Mar 2012 | 10 comments

Hey everyone, I recently started getting error 40 on my media servers. The exact error is below.

Error bpbrm(pid=2572) could not write FILE ADDED message to stderr network connection broken(40)

The error seems to happen about 2 hours after the job starts, so I thought there was a firewall issue with connections back to my media server. Our network team says the timeout for idle sockets is 14 hours, so I don't believe that is the problem. I am currently rerunning the backup with bpbrm logs enabled on the media server to see if I can get more details. Does anyone have any ideas as to what would cause this? Is there a timeout setting in NetBackup that I can change to help fix this issue?

I am running NetBackup 6.5.5

My Master server is running Windows Server 2003 X64

My Media server is running Windows Server 2008 R2 and has 10Gb NICs
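One way to check the idle-socket theory directly, rather than relying on the quoted 14-hour figure, is to hold a TCP connection open with no traffic for a bit over 2 hours and then see whether data still gets through. The following is only a minimal Python sketch of that idea: `probe_idle_connection` and `start_echo_server` are hypothetical helper names, not part of NetBackup, and the probe sends no keepalives of its own, so it only says something about the path when the two ends sit on different hosts.

```python
import socket
import threading
import time


def probe_idle_connection(host, port, idle_seconds, payload=b"ping"):
    """Connect, stay silent for idle_seconds, then send payload and wait for the echo.

    Returns True if the echo comes back, False if the connection was dropped.
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        time.sleep(idle_seconds)  # no traffic at all during this window
        try:
            sock.sendall(payload)
            return sock.recv(len(payload)) == payload
        except OSError:  # covers resets and timeouts
            return False


def start_echo_server(host="127.0.0.1"):
    """Start a one-connection echo server on a free port and return that port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn:
            data = conn.recv(1024)
            if data:
                conn.sendall(data)
        server.close()

    threading.Thread(target=serve, daemon=True).start()
    return port


if __name__ == "__main__":
    # Local smoke test with a very short idle period.
    port = start_echo_server()
    print(probe_idle_connection("127.0.0.1", port, idle_seconds=0.2))
```

For a real check, the echo side would run on one server and the probe on the other, with `idle_seconds` set a little above 7200 so it covers the 2-hour mark where the jobs fail.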

revaroo's picture

Do you have CLIENT_READ_TIMEOUT set on that media server to 7200?

How many clients is this affecting? All of them or just some?

If all of them, I suspect a network issue between Media server and clients.

Mark_Solutions's picture

The idle socket timeout can by default be only 2 hours, in fact exactly 2 hours, and I have seen that this usually has the most effect on things like SQL backups.

Try this on your Media Servers (needs a reboot to take effect)

In HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\

Add a new DWORD named KeepAliveTime with a Decimal value of 510000

Add a new DWORD named KeepAliveInterval with a Decimal Value of 3
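For anyone who prefers to script the two values above instead of using regedit, here is a rough Python sketch that only builds the equivalent `reg add` command lines and converts the times for comparison. It assumes KeepAliveTime is expressed in milliseconds with a Windows default of 7,200,000 (which would match the "exactly 2 hours" behaviour); the helper names `reg_add_dword` and `ms_to_minutes` are made up for illustration.

```python
# Registry key from the post, without the trailing backslash so it can be quoted safely.
TCPIP_PARAMS = r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

# Assumption: Windows default KeepAliveTime, in milliseconds (2 hours).
WINDOWS_DEFAULT_KEEPALIVE_MS = 7_200_000


def reg_add_dword(name, value, key=TCPIP_PARAMS):
    """Build a reg.exe command line that creates or overwrites a DWORD value."""
    return f'reg add "{key}" /v {name} /t REG_DWORD /d {value} /f'


def ms_to_minutes(milliseconds):
    """Convert milliseconds to minutes."""
    return milliseconds / 60_000


# Values exactly as suggested in the post.
settings = {"KeepAliveTime": 510000, "KeepAliveInterval": 3}
commands = [reg_add_dword(name, value) for name, value in settings.items()]

if __name__ == "__main__":
    for command in commands:
        print(command)
    print(ms_to_minutes(settings["KeepAliveTime"]))     # 8.5
    print(ms_to_minutes(WINDOWS_DEFAULT_KEEPALIVE_MS))  # 120.0
```

The printed commands would need to be run from an administrative command prompt on the media server, and the reboot mentioned above still applies.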

Hope this helps

Authorised Symantec Consultant

Don't forget to "Mark as Solution" if someones advice has solved your issue - and please bring back the Thumbs Up!!.

Mark_Solutions's picture

Good point by revaroo too, yes.

Marianne's picture

'could not write FILE ADDED message' looks to me like bpbrm is unable to add file list to bpdbm on the master.

Same error is described here (where I also shared personal experience):

https://www-secure.symantec.com/connect/forums/bac...

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows

kproehl's picture

Thanks everyone for all the replies. I will make sure to add some of the suggestions. The problem just went away without me doing anything.

Thanks again.

Mark_Solutions's picture

You may have had blocked ports on your Master or Media Server if it just went away.

It is always worth adding these to them. Both can have the first key, and only the Windows 2003 ones have the second key:

In HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\

Create a new DWORD named TcpTimedWaitDelay with a Decimal Value of 30

Create a new DWORD named MaxUserPort with a Decimal Value of 65534

For the Windows 2008 servers, rather than the MaxUserPort key, use an administrative command prompt to run:

netsh int ipv4 set dynamicport tcp start=10000 num=50000
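To make it clearer what that netsh command changes, the small Python sketch below works out the first and last port of a range given as a start and a count. The helper name `dynamic_port_range` is hypothetical, and the "default" figures used for comparison (start=49152, num=16384 on Windows 2008) are stated from memory as an assumption, so treat them as a sanity check rather than gospel.

```python
def dynamic_port_range(start, num):
    """Return (first, last) port for a range given as a start port plus a count."""
    last = start + num - 1
    if not (1 <= start <= last <= 65535):
        raise ValueError("range must fit inside the ports 1-65535")
    return start, last


# Range set by the netsh command above.
suggested = dynamic_port_range(10000, 50000)

# Assumption: the out-of-the-box Windows 2008 dynamic range.
assumed_default = dynamic_port_range(49152, 16384)

if __name__ == "__main__":
    print(suggested)        # (10000, 59999)
    print(assumed_default)  # (49152, 65535)
```

Under those assumptions the command roughly triples the number of dynamic ports available for outgoing connections.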

Hope this also helps

kproehl's picture

The problem has started again and now seems to be worse than before. I have done a couple of things to try and resolve this.

I realized that I was running version 6.5.5 on my media servers, which were running Windows 2008 R2. I just found out that is not supported. I have upgraded my master and media servers to 6.5.6, which is supported on a media server running Windows 2008 R2.

When I upgraded my media servers I also upgraded the NICs to 10 Gb. Would this cause any problems? Are there any tests I can run to rule this out? I also saw where people were getting these errors when the master server is really busy. Is there any way to test that?

I have a case open with Symantec where they had me run AppCritical. That test showed packet loss, but I later found out that all AppCritical does is send ICMP and UDP packets to all the network devices along that path. Because the devices apply policing, these values are getting skewed, causing inaccurate results.

I did increase the CLIENT_READ_TIMEOUT and so far that has not helped.

Mark_Solutions's picture

Is it still after 2 hours?

Is it file system or application backups?

Did you put in place the KeepAlive settings I listed previously?

kproehl's picture

I do not have this in place currently. I do have backups that run longer than 2 hours without any issues. It sounds like this fix applies only if all my backups fail after 2 hours. Is that correct?

revaroo's picture

AppCritical is usually pretty accurate. I have not seen it produce erroneous reports personally.

Is this just happening on one media server? If so, what happens when you send the client backups to other media servers? Do they work OK?

What type of backups are you performing on the clients?