RALUS: This listener will no longer be able to listen on its IPv4 address.
Hello
We are using BE 11d with RALUS version 11.0.6235.0 on Linux (SLES 9).
Backups work correctly and everything appears to be installed properly (we have about 12 Linux servers).
However, the job log sometimes reports that the server could not connect to a client. It is not always the same Linux client.
I started investigating and found that the RALUS listening socket appears to be very fragile.
We found a way to reproduce the bug: run a simple connect port scan against the RALUS port 10000/tcp.
Before giving you the debug log, I'd like to say that I verified that:
- the BE user is OK: uid=0(root) gid=0(root) groupes=0(root),1003(beoper)
- the entries in /etc/services for grfs (6101/tcp) and ndmp (10000/tcp) are OK
- the RALUS configuration file is present and its content is correct
- I also tried disabling IPv6
- I was not able to reproduce the bug against the Windows remote agent on port 10000/tcp
Here is the full debug log:
# /opt/VRTSralus/bin/beremote --log-console
b6ae56c0 Tue May 8 10:59:14 2007 : Starting BE Remote Agent
b6ae56c0 Tue May 8 10:59:14 2007 : Requested no generation of log file
b6ae56c0 Tue May 8 10:59:14 2007 : No configuration file specified. Using default.
b6ae56c0 Tue May 8 10:59:14 2007 : Log to console: enabled
b6ae56c0 Tue May 8 10:59:14 2007 : Successfully set the supplementary groups of the process
b6ae56c0 Tue May 8 10:59:14 2007 : Starting NDMP processor
b6ae56c0 Tue May 8 10:59:14 2007 : NDMPDMainThreadFunc spawned: grpid=1, tid=-1230361680
b6aa2bb0 Tue May 8 10:59:14 2007 : FS_InitFileSys
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsnt5.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedssql2.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsxchg.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsxese.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsmbox.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedspush.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsnote.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsmdoc.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedssps2.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsupfs.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsshadow.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsoffhost.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : loaded libbedsvx.so
b6aa2bb0 Tue May 8 10:59:14 2007 : loaded libbedsrman.so
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsagnt.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : loaded libbedssms.so
b6aa2bb0 Tue May 8 10:59:14 2007 : loaded libbedssmsp.so
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsra.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : libbedsdb2.so could not be loaded: 0x 2 (2)
b6aa2bb0 Tue May 8 10:59:14 2007 : loaded libbedsedir.so
b6aa2bb0 Tue May 8 10:59:14 2007 : Initializing FSs
b6aa2bb0 Tue May 8 10:59:14 2007 : FS 1 failed to initialize: 0xE000FE46
b6aa2bb0 Tue May 8 10:59:14 2007 : Function called: RMAN_InitFileSys
b6aa2bb0 Tue May 8 10:59:14 2007 : Using 'UTF-8' Encoding.
b6aa2bb0 Tue May 8 10:59:14 2007 : VXMS Initialization OK.
b6aa2bb0 Tue May 8 10:59:14 2007 : ndmpRun
b6aa2bb0 Tue May 8 10:59:14 2007 : Successfully resolved the "ndmp" service to port: 10000 (host order)
b6aa2bb0 Tue May 8 10:59:14 2007 : NrdsWatchThread spawned: grpid=1, tid=-1240663120
b6aa2bb0 Tue May 8 10:59:14 2007 : NrdsAdvertiserThread spawned: grpid=1, tid=-1249055824
b6aa2bb0 Tue May 8 10:59:14 2007 : NrdsAdvertiserThread spawned: grpid=1, tid=-1257448528
b6aa2bb0 Tue May 8 10:59:14 2007 : NDMP Daemon Running..
b6aa2bb0 Tue May 8 10:59:14 2007 : Could not create an IPv6 socket at this time, reason:An error occurred during a socket creation operation: Error Code: 97, System Error Message: Address family not supported by protocol
b6aa2bb0 Tue May 8 10:59:14 2007 : @@@@@@@MyCloseSocket called with sockfd = 5(0x5) retval = 0
b6aa2bb0 Tue May 8 10:59:14 2007 : SocketService::IsSystemDualIP: No support found for IPv6 currently
b6aa2bb0 Tue May 8 10:59:14 2007 : Started NDMP Listener on port 10000
b58cebb0 Tue May 8 10:59:24 2007 : NrdsAdvertiserThread: advertisement cycle started.
b58cebb0 Tue May 8 10:59:24 2007 : RMAN_EnumSelfDLE: AgentConfig GetOracleDBNames returned error. If Oracle Agent is installed, please run AgentConfig.
b58cebb0 Tue May 8 10:59:24 2007 : NrdsAdvertiserThread: EnumSelfDLE for file system 14 returned 0(0x0) and 0 DLEs
b58cebb0 Tue May 8 10:59:24 2007 : BENetConfigEx: Successfully refreshed adapter information.
b58cebb0 Tue May 8 10:59:24 2007 : NrdsAdvertiserThread: EnumSelfDLE for file system 22 returned 0(0x0) and 1 DLEs
b58cebb0 Tue May 8 10:59:24 2007 : NrdsAdvertiserThread: Nrds Message Len : 128.
b58cebb0 Tue May 8 10:59:24 2007 : VX_RemoveDLE: DestroyDLE()
b58cebb0 Tue May 8 10:59:24 2007 : ConnectToServerEndPoint: dest=10.17.76.2, service=6101
b58cebb0 Tue May 8 10:59:24 2007 : CreateConnection type=0 on socket 5 via BESocket OK
b58cebb0 Tue May 8 10:59:24 2007 : NrdsAdvertiserThread: send of svrvpgsqlqc1.cptaq.local type 22 subtype 0 to target=10.17.76.2 port=6101 succeeded
b58cebb0 Tue May 8 10:59:24 2007 : ConnectToServerEndPoint: dest=svrbkqc.cptaq.local, service=6101
b58cebb0 Tue May 8 10:59:24 2007 : CreateConnection type=0 on socket 5 via BESocket OK
b58cebb0 Tue May 8 10:59:24 2007 : NrdsAdvertiserThread: send of svrvpgsqlqc1.cptaq.local type 22 subtype 0 to target=svrbkqc.cptaq.local port=6101 succeeded
b58cebb0 Tue May 8 10:59:24 2007 : NrdsAdvertiserThread: advertisement cycle complete. Waiting 240 minutes before advertising again.
b50cdbb0 Tue May 8 10:59:29 2007 : NrdsAdvertiserThread: negative (purge) advertisement cycle started.
b50cdbb0 Tue May 8 10:59:29 2007 : NrdsAdvertiserThread: no purge is pending.
b50cdbb0 Tue May 8 10:59:29 2007 : NrdsAdvertiserThread: negative (purge) advertisement cycle complete. Waiting 240 minutes before advertising again.
b6aa2bb0 Tue May 8 10:59:46 2007 : SocketServices::Accept(): Could not create a BETCPConnection object from the given socket handle: An error occurred during a certain socket operation: Error Code: 107, System Error Message: Transport endpoint is not connected
b6aa2bb0 Tue May 8 10:59:46 2007 : @@@@@@@MyCloseSocket called with sockfd = 5(0x5) retval = 0
b6aa2bb0 Tue May 8 10:59:46 2007 : An error occured during an accept call.This listener will no longer be able to listen on its IPv4 address. Error Code:107
b6aa2bb0 Tue May 8 10:59:46 2007 : ERROR: system call (accept): 0 - 107.
b6aa2bb0 Tue May 8 10:59:46 2007 : @@@@@@@MyCloseSocket called with sockfd = 3(0x3) retval = 0
b6aa2bb0 Tue May 8 10:59:46 2007 : ndmpRun: exiting
Here is the part of the log where the error occurs:
b6aa2bb0 Tue May 8 10:59:46 2007 : SocketServices::Accept(): Could not create a BETCPConnection object from the given socket handle: An error occurred during a certain socket operation: Error Code: 107, System Error Message: Transport endpoint is not connected
b6aa2bb0 Tue May 8 10:59:46 2007 : @@@@@@@MyCloseSocket called with sockfd = 5(0x5) retval = 0
b6aa2bb0 Tue May 8 10:59:46 2007 : An error occured during an accept call.This listener will no longer be able to listen on its IPv4 address. Error Code:107
b6aa2bb0 Tue May 8 10:59:46 2007 : ERROR: system call (accept): 0 - 107.
b6aa2bb0 Tue May 8 10:59:46 2007 : @@@@@@@MyCloseSocket called with sockfd = 3(0x3) retval = 0
b6aa2bb0 Tue May 8 10:59:46 2007 : ndmpRun: exiting
The process is still running after the bug occurs, but the "netstat -lntp" command no longer shows port 10000/tcp as listening (so the port is closed).
Any ideas?
Thanks,
Jean-Luc Henry
I am running 11d clients on Solaris, and one server has intermittent connection problems.
Sometimes the job succeeds and sometimes it fails with a communication error. If I restart the client and then retry the job, the backup succeeds. I haven't been running the client in debug mode, but I will, to see if I find anything useful. netstat usually shows two established connections to the Backup Exec server.
Hello,
If you would like to try to reproduce it on Solaris, you could use the nmap tool. I use it on Linux, but you could also use it on Windows or Solaris.
1) Start RALUS (via script or via /opt/VRTSralus/bin/beremote --log-console)
2) Verify that RALUS opened the socket:
3) Ask your port scanner to do a simple connect port scan on the RALUS port:
4) Verify the status of your network socket:
No result
5) You could also try to reproduce the port scan:
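For anyone without a port scanner at hand, here is a minimal Python sketch of the same idea: open a TCP connection to the agent port and drop it straight away, then try again to see whether the listener survived. The names connect_probe and check_listener are made up for this example. Closing with SO_LINGER set to zero (which aborts the connection with a reset) is my assumption about how a connect scan tears the connection down, so treat it as an approximation, not an exact replica of what nmap does.

```python
import socket
import struct
import time


def connect_probe(host, port, timeout=3.0, reset=True):
    """Open a TCP connection and drop it immediately.

    Returns True if the connection was established, False otherwise.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except OSError:
        # Refused, timed out, unreachable: nothing is accepting connections.
        sock.close()
        return False
    if reset:
        # Assumption: l_onoff=1, l_linger=0 makes close() abort the
        # connection with a RST instead of a normal FIN handshake.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack("ii", 1, 0))
    sock.close()
    return True


def check_listener(host, port=10000, pause=1.0):
    """Probe twice; (True, False) suggests the first probe killed the listener."""
    first = connect_probe(host, port)
    time.sleep(pause)
    second = connect_probe(host, port)
    return first, second
```

Calling check_listener("your-ralus-host") against an affected agent would be expected to return (True, False) if the listener dies the way the debug log above shows; (True, True) means the port is still listening after the probe.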
Try this: https://forums.symantec.com/syment/board/message?b...
On Solaris you can solve the IPv6 problem with the "ipaddrsel" command.
Here is a short how-to.
Edit "/etc/inet/ipaddrsel.conf"; you will see something like this:
# Prefix Precedence Label
::1/128 50 Loopback
::/0 40 Default
2002::/16 30 6to4
fec0::/10 27 Site-Local
fe80::/10 23 Link-Local
::/96 20 IPv4_Compatible
::ffff:0.0.0.0/96 10 IPv4
Here is the problem: the IPv4 entry has a precedence of only 10.
::ffff:0.0.0.0/96 10 IPv4
Change it to something higher than anything else in the file. I used 60:
::ffff:0.0.0.0/96 60 IPv4
Then reboot your server.
The solution was found by Cyril Plisko from www.mountall.com.
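If you need to make this change on several servers, the edit can be scripted. The following is only a sketch and not part of the original fix: the function name raise_ipv4_precedence and the step of 10 are my own choices, and the script only rewrites the text of the policy table. It does not touch the file on disk, load the policy, or reboot anything.

```python
def raise_ipv4_precedence(conf_text, prefix="::ffff:0.0.0.0/96", step=10):
    """Return conf_text with the IPv4-mapped prefix given the highest precedence.

    Lines are expected to look like "<prefix> <precedence> <label>".
    Comment lines and anything that does not match are left untouched.
    """
    lines = conf_text.splitlines()
    target = None
    highest_other = 0
    for index, line in enumerate(lines):
        fields = line.split()
        if len(fields) < 3 or fields[0].startswith("#") or not fields[1].isdigit():
            continue
        if fields[0] == prefix:
            target = (index, fields)
        else:
            highest_other = max(highest_other, int(fields[1]))
    if target is not None:
        index, fields = target
        # Only rewrite the line if it is not already above every other entry.
        if int(fields[1]) <= highest_other:
            fields[1] = str(highest_other + step)
            lines[index] = " ".join(fields)
    result = "\n".join(lines)
    if conf_text.endswith("\n"):
        result += "\n"
    return result


SAMPLE = """# Prefix Precedence Label
::1/128 50 Loopback
::/0 40 Default
2002::/16 30 6to4
fec0::/10 27 Site-Local
fe80::/10 23 Link-Local
::/96 20 IPv4_Compatible
::ffff:0.0.0.0/96 10 IPv4
"""

print(raise_ipv4_precedence(SAMPLE))
```

With the sample table from this post, the highest other precedence is 50 (Loopback), so the IPv4 line comes out with 60, the same value used above. Running the function on its own output changes nothing.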
Message Edited by icarus-il on 08-02-2007 04:52 AM
How to solve similar problem
How can a similar problem be solved on RHEL Linux?
Same problem on CentOS
I am also looking for a resolution on CentOS 5.4. Has anyone found anything?