A status 58 error can occur during connections to legacy processes such as bpcd, bprd, bpdbm, bpjobd, and vmd. These connections are made:
- typically from master or media server to client, but also
- from master server to media server, or
- from media servers to master, or
- from client to master servers, or
- in very rare cases from client to media server.
The cause is that either
- a TCP SYN request did not reach the destination host, or
- the expected processes were not listening on the destination host, or
- the TCP SYN+ACK returned by the destination host did not reach the source host, or
- the destination host could not validate that the inbound connection was from a valid NetBackup (NB) host, or
- PBX or vnetd on the destination host could not transfer the connection to the desired daemon, or
- in rare instances that the destination host could not create a connection to the source host - for the same reasons as noted above.
The majority of these situations are due to the TCP port(s) not being open through the firewall, daemon processes not running and listening on the destination host, the destination host being unable to resolve the IP address of the source host to a hostname, or the destination host not being configured to recognize the resolved hostname as a valid NB host.
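The failure modes above look different from the connecting side. The following Python sketch is an illustration only, not a NetBackup tool, and `probe_tcp` is a hypothetical helper name. It assumes that a refused connection usually means nothing is listening on the port (or that something returned a TCP RESET), and that a timeout usually means the TCP SYN or SYN+ACK was dropped in transit.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Attempt one TCP connection and classify the outcome.

    Returns "connected", "refused", "timeout", or "error: <detail>".
    """
    try:
        # create_connection resolves the name and completes the three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No reply at all: the SYN or the SYN+ACK was probably dropped.
        return "timeout"
    except ConnectionRefusedError:
        # The destination (or a device in the path) answered with a TCP RESET.
        return "refused"
    except OSError as exc:
        # Name resolution failure, no route to host, and similar problems.
        return "error: %s" % exc
```

For example, `probe_tcp("myclient", 1556)` would probe the PBX port on a host named myclient. A "connected" result shows only that the port is reachable; it does not show that the destination accepts the source as a valid NetBackup host.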
Checking Connectivity from the Master Server to the Client
When troubleshooting status 58 errors to a NetBackup host, the first thing to test is whether or not you can access the client Host Properties for that host from the master server. Connecting to get the host properties uses the same ports and daemon processes as a backup or restore. If it works, the client daemon processes are running and listening on the destination host, the TCP ports are open between these two hosts, and the destination host can resolve the IP of the master and recognizes it as a valid server.
If Host Properties works and a client backup is using a storage unit on the master server, it should work without getting the status 58 error.
Checking Connectivity to the Client Daemons on the Destination Host
NB 6.x and newer servers can use the bptestbpcd command to verify connectivity to the destination host, e.g.
bptestbpcd -verbose -debug -client <policy_client_or_stu_residence>
See related articles for additional details about bptestbpcd.
Checking Name Resolution
If you are still getting status 58 errors, the problem may be with the hostname <--> IP name resolution using either DNS or the hosts files.
To test name resolution, run the following command on the NB host that is initiating the connection. Be sure to check all media servers that can contact a client:
<install path>/netbackup/bin/bpclntcmd -hn <policy_client_or_stu_residence>
Since reverse lookups are part of NB connections, make sure the reverse resolution also works and returns the same hostname:
<install path>/netbackup/bin/bpclntcmd -ip <IP_address_returned_above>
On the destination host, perform the same tests, but in the reverse order after first determining the IP addresses assigned to the source host. On clients, these commands should be run for all master and media server network interfaces that should be used to connect to the client.
<install path>/netbackup/bin/bpclntcmd -ip <server_source_IP_address>
<install path>/netbackup/bin/bpclntcmd -hn <returned_hostname>
If the forward or reverse name resolution is not correct and consistent, then correct the name resolution configuration used by the host. NetBackup will operate most efficiently when name resolution matches the NetBackup configuration so that hosts can be identified unambiguously and consistently.
Note: If the name resolution is intermittently wrong, then check for a failover DNS server with an incorrect forward or reverse lookup record. Also be aware that the incorrect information returned will be cached by NetBackup for up to an hour and may affect a subsequent job retry.
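The forward-then-reverse consistency check can also be sketched outside NetBackup. The Python below is a minimal illustration, with the hypothetical name `check_name_consistency`, that uses the operating system resolver. It is not a substitute for bpclntcmd, which exercises the resolution path that NetBackup itself uses, including the cache mentioned in the note above.

```python
import socket


def check_name_consistency(hostname,
                           forward=socket.gethostbyname,
                           reverse=socket.gethostbyaddr):
    """Resolve hostname -> IP -> hostname and report whether the names agree.

    Returns (ip, resolved_name, consistent). The resolver functions are
    parameters so that the check can be exercised without a live DNS.
    """
    ip = forward(hostname)
    # gethostbyaddr returns (primary_name, alias_list, address_list).
    resolved_name = reverse(ip)[0]
    # Hostnames are case-insensitive, so compare in lower case.
    consistent = resolved_name.lower() == hostname.lower()
    return ip, resolved_name, consistent
```

A short name that reverse-resolves to an FQDN is reported as inconsistent, which is deliberately strict: it flags the mixed short-name and FQDN usage for review.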
Pinging the Destination Host
Check if you are able to "ping" the destination host IP address from the source host.
This should be possible unless ICMP traffic is not permitted on the network. If this fails, consult with your Network Administrator and server System Administrator. If ICMP traffic is not blocked, they can resolve the layer 3 or IP network connectivity issue. Also double check the IP address and netmask assigned to the network interfaces (NICs), intended to be used for the connection, to ensure they are configured correctly.
Checking Connectivity to the Master from the Server or Client
The bpclntcmd -pn command is very useful for checking name resolution and connectivity to the master server. When executed, the command does the following.
1. Gets the first server listed in the local servers list and, knowing it is the master server, does a forward lookup of that hostname using the local name services configuration. It then also determines the TCP port number for the daemon process on the destination host: veritas_pbx (default for NB 7.1+), vnetd (default for NB 6.0-7.0), or bprd.
2. If successful you should see the message "expecting response from <first_server>".
3. If the connection to the master server is via PBX or vnetd, the inbound connection will be transferred to bprd. Then bprd will perform two functions. It first does a reverse lookup of the incoming source IP address, that is the first hostname displayed on the second line of the command output. Then it checks if the resolved hostname is in the server list (current server). If not, it queries the policy database to see if that hostname is used as a client in any policy (current client). If not it then queries the image database to see if that hostname has backup images (past client). If found, the matched client is the second hostname displayed on the second line of the command output.
In this example the client is otto.veritas.com, the master server is hal.veritas.com, and 'otto' appears as a client in a policy. Be aware that this type of configuration is discouraged because the mixed use of short names and fully qualified domain names (FQDNs) for the same host may cause some NetBackup operations to fail: a derived FQDN can safely be shortened for comparison to a configured short name, but a derived short name cannot safely be extended to an FQDN.
expecting response from server hal.veritas.com
otto.veritas.com otto 10.82.110.6 3412
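When many hosts need to be checked, the second output line can be split into its fields. This is a sketch under assumptions: the first two fields are the two hostnames described in step 3, while reading the last two fields as the source IP address and source port is an interpretation of the sample output rather than documented behavior. The names `PnResult`, `parse_pn_line`, and `names_mixed` are hypothetical.

```python
from typing import NamedTuple


class PnResult(NamedTuple):
    peer_name: str     # hostname from the reverse lookup done by bprd
    matched_name: str  # server or client name that bprd matched
    source_ip: str     # assumed: source IP address seen by the master
    source_port: int   # assumed: source TCP port seen by the master


def parse_pn_line(line):
    """Parse the second line of 'bpclntcmd -pn' output."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("expected 4 fields, got %d: %r" % (len(fields), line))
    peer, matched, ip, port = fields
    return PnResult(peer, matched, ip, int(port))


def names_mixed(result):
    """True when one name is an FQDN and the other is a short name."""
    return ("." in result.peer_name) != ("." in result.matched_name)
```

Applied to the sample line above, `names_mixed` returns True, which corresponds to the discouraged mix of FQDN and short name.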
If the connection hangs or fails, take the following steps.
- Create a debug log directory on the source host and set verbose to 5. It should show the hostname and service name resolution and the outbound connection to the master server. The debug log is named bplist prior to NB 7.6, and bpclntcmd at NB 7.6 and above.
- Create a bprd debug log on the master server and set verbose to 5. It should show the incoming connection, the reverse IP address to hostname resolution, and the check to determine if the resolved hostname is a known server or client.
Are Daemon Processes Running?
Inbound connections are normally made via a super daemon listener which listens on behalf of other processes. On NetBackup versions 7.0.1 (UNIX) or 7.1 (Windows) or newer, the connections will be made via the pbx_exchange process and then transferred to the destination process or processes. On NetBackup 6.0 - 7.0, the connection will be made via the inetd/xinetd (UNIX) or bpinetd (Windows) daemon processes, which will then start vnetd, which will then either start bpcd or transfer the connection to an already running NB server daemon process.
The destination process will typically be vnetd & bpcd or bprd (but possibly bpdbm, bpjobd, vmd, etc). At NB versions 7.0.1 (UNIX) and 7.1 (Windows) the bpcd and vnetd processes become persistent daemons like bprd, bpdbm, etc.
Use the ps command (UNIX) or Task Manager (Windows) to confirm the expected daemon processes are running on the destination host. Make a note of the process IDs (PIDs). On Windows, it may be necessary to add the Process ID column to the display.
Checking Daemon Listening Status
Ensure the destination daemons are listening on the destination host; typically PBX, vnetd, and bpcd on TCP ports 1556, 13724, and 13782 respectively. The '-o' option on Windows and the '-p' option on Linux will include the PID of the listening process. If present, it should match the PIDs noted above from the ps or taskmgr commands. If not, a third-party process is utilizing the NetBackup ports. This is not an issue if the PBX or vnetd port is open and available.
On Windows hosts:
netstat -a -o | FINDSTR LISTEN
TCP otto:1556 otto.veritas.com:0 LISTENING 2848
TCP otto:vnetd otto.veritas.com:0 LISTENING 7263
TCP otto:bpcd otto.veritas.com:0 LISTENING 3758
On UNIX hosts:
netstat -a -p | grep LISTEN
*.vnetd *.* 0 0 49152 0 LISTEN 7326/vnetd
*.1556 *.* 0 0 49152 0 LISTEN 1839/pbx_exchange
*.bpcd *.* 0 0 49152 0 LISTEN 2748/bpcd
Note: Use the 'netstat -n' option to suppress service name resolution and display the TCP port numbers instead.
Note: See the version specific NetBackup Port Usage Guide for the other daemons and associated port numbers.
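To compare listeners against the PIDs noted earlier, the netstat lines can be reduced to (port or service, PID) pairs. The Python sketch below assumes the lines resemble the samples above; netstat output differs between platforms and versions, so treat the hypothetical `parse_listen_line` as a starting point rather than a general parser.

```python
def parse_listen_line(line):
    """Extract (port_or_service, pid) from one netstat LISTEN line.

    Returns None for lines that are not in a listening state. pid is None
    when netstat did not report one.
    """
    fields = line.split()
    # The state is expected in the next-to-last column, the PID in the last.
    if len(fields) < 3 or not fields[-2].startswith("LISTEN"):
        return None
    # The local address is the first column that looks like an address.
    local = next((f for f in fields if ":" in f or "." in f), None)
    if local is None:
        return None
    # The port or service name follows the last ':' or '.' separator.
    service = local.replace(":", ".").rsplit(".", 1)[-1]
    # UNIX-style output reports "PID/program"; Windows reports the bare PID.
    pid_text = fields[-1].split("/", 1)[0]
    pid = int(pid_text) if pid_text.isdigit() else None
    return service, pid
```

The returned PID can then be compared with the PID that ps or Task Manager reported for pbx_exchange, vnetd, or bpcd.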
Checking for Firewall Issues
The netstat -a -n command can also be useful to determine if a firewall is blocking TCP connections.
First, initiate a connection from the source host to the TCP port number of the destination daemon on the destination host. E.g.
telnet myclient 1556
Then check the netstat -n -a output on both hosts immediately. If you wait more than a second or two, the connection may be torn down before you can see its status.
Ideally, the output on both hosts will show the connection as ESTABLISHED. But more likely, the source host will show a connection in SYN Sent state. That means the TCP stack on the source host sent a TCP SYN packet onto the network, but did not receive a reply.
tcp 0 0 192.168.70.205:4480 192.168.70.198:13724 SYN_SENT
Check the destination host. If it does not show the same connection, then either a firewall blocked the inbound TCP SYN or the network delivered the packet to a different host. Use the traceroute (UNIX) or tracert (Windows) commands from the source host to the destination IP address to determine if the network route is correct.
If the destination host shows the connection, but in a SYN Received state, then a firewall blocked the TCP SYN+ACK that was returned outbound by the destination host.
tcp 0 0 192.168.70.198:1556 192.168.70.205:4480 SYN_RECV
If the connection is not shown in the netstat output from either host, then typically some networking layer component rejected the connection and returned a TCP RESET. Use a TCP packet capture (e.g. snoop, tcpdump, Wireshark) to capture first the outbound TCP SYN on the source host, then the inbound TCP SYN on the destination host, and then the outbound reply.
If the connection is not shown in the netstat output from the source host but is shown in a TCP Time Wait state on the destination host, then the connection was briefly ESTABLISHED, but the NetBackup daemon process closed the connection. See the debug log for that process for additional details.
tcp 0 0 192.168.70.198:1556 192.168.70.205:7432 TIME_WAIT
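The state combinations described in this section can be summarized as a small lookup. The Python sketch below only restates the reasoning above; `diagnose` is a hypothetical helper, and `None` stands for "the connection does not appear in that host's netstat output".

```python
def diagnose(source_state, dest_state):
    """Map the netstat states seen on the source and destination hosts
    to the likely cause described in the text."""
    if source_state == "ESTABLISHED" and dest_state == "ESTABLISHED":
        return "connection established; no firewall block on this port"
    # Some platforms print SYN_RECEIVED instead of SYN_RECV.
    if dest_state in ("SYN_RECV", "SYN_RECEIVED"):
        return "a firewall blocked the SYN+ACK returned by the destination"
    if source_state == "SYN_SENT" and dest_state is None:
        return ("a firewall blocked the inbound TCP SYN, or the network "
                "delivered it to a different host; check the route")
    if source_state is None and dest_state == "TIME_WAIT":
        return ("briefly established, then closed by the NetBackup daemon; "
                "check that daemon's debug log")
    if source_state is None and dest_state is None:
        return ("probably rejected with a TCP RESET; take a packet capture "
                "on both hosts")
    return "combination not covered here; take a packet capture"
```

For example, `diagnose("SYN_SENT", "SYN_RECV")` corresponds to the two sample netstat lines shown above for the source and destination hosts.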
Note: The netstat output may show an unexpected source IP address for the connection. If the source IP address does not appear to be correct, especially if the two hosts are on the same network, then make two checks. First, check to see if the NetBackup configuration on the source host contains a setting that will override the default source binding; i.e. REQUIRED_INTERFACE, REQUIRED_NETWORK, PREFERRED_NETWORK, or CLUSTER_NAME without ANY_CLUSTER_INTERFACE=YES. Those settings force outbound TCP SYN requests to use the configured source IP address, which may differ from the source host routing table and thereby conflict with firewall rules. Either remove/correct the NetBackup configuration or extend the firewall configuration as appropriate. Second, in the absence of a NB source binding, check if there is a static host route or static network route that causes the OS to select the unexpected source IP.
Typically, any conflicting firewalls are located within the network cloud. However, they can also reside on the end-stations and cause problems.
For NB 6.x and earlier UNIX hosts, check the /etc/inetd.conf file to see if there are any TCP wrappers present. This example shows a TCP wrapper (tcpd) running on the bpcd port.
bpcd stream tcp nowait root /usr/local/bin/tcpd /usr/openv/netbackup/bin/bpcd bpcd
On UNIX, check the hosts.allow and hosts.deny files, which are typically located in the /etc directory.
For Linux hosts, check for any iptables configuration that is performing unexpected packet filtering.
On Windows, temporarily disable any firewalls (Domain, Private, or Public) or security intrusion packages that are installed. If connections are then possible, update the firewall configuration with appropriate exceptions for the needed TCP ports.
Checking for Resilient Network Conflicts
If the Resilient Network feature, introduced in NetBackup 7.5, is configured for the destination host, then all connections will use only the vnetd port, which must be open bidirectionally between the NB servers and the client. PBX will not be used. Check for this configuration on the master server. An empty value indicates that it is not configured.
Checking that the Daemon Binary is Not Corrupt
For vnetd and bpcd connection problems, on destination hosts, confirm that the executable binary file is not corrupt.
On the destination host, create the debug log directory for the process, set verbose to 5, and try to start the daemon from the command line. Be sure the program executes with appropriate administrative privileges. E.g.
For bpcd processes started from inetd/xinetd/bpinetd, that should generate the following debug log entries and prove the binary can be executed.
16:47:45.986 [5296.3376] <2> bpcd main: offset to GMT 21600
16:47:45.986 [5296.3376] <2> bpcd main: Got socket for input 3
16:47:46.017 [5296.3376] <2> logconnections: getsockname(3) failed: 10038
16:47:46.017 [5296.3376] <16> bpcd setup_sockopts: setsockopt 1 failed: h_errno 10038
16:47:46.017 [5296.3376] <2> bpcd main: setup_sockopts complete
16:47:46.158 [5296.3376] <2> vauth_acceptor: ..\libvlibs\vauth_comm.c.332: Function failed: 17 0x00000011
16:47:46.158 [5296.3376] <16> bpcd main: authentication failed: 17
For bpcd standalone, the messages are slightly different.
16:37:50.399  <2> setup_debug_log: switched debug log file for bpcd
16:37:50.399  <2> bpcd main: VERBOSE = 0
16:37:50.399  <2> logparams: /usr/openv/netbackup/bin/bpcd
16:37:50.408  <2> process_requests: offset to GMT 21600
16:37:50.480  <8> vnet_get_peer_sock_names: [vnet_nbrntd.c:263] getsockname() failed 88 0x58
16:37:50.480  <8> get_peer_or_sock_name: [vnet_nbrntd.c:710] vnet_get_peer_sock_names() failed 10 0xa
16:37:50.480  <2> logconnections: nb_getsockname(0) failed
16:37:50.480  <2> process_requests: setup_sockopts complete
16:37:50.481  <8> vnet_get_peer_sock_names: [vnet_nbrntd.c:263] getsockname() failed 88 0x58
16:37:50.481  <8> get_peer_or_sock_name: [vnet_nbrntd.c:710] vnet_get_peer_sock_names() failed 10 0xa
16:37:50.481  <16> bpcd peer_hostname: nb_getpeername: Socket operation on non-socket
16:37:50.481  <16> process_requests: Couldn't get peer hostname
It is normal for the process to end with an error as in the above examples; the program was expecting to be passed an inbound connection, but was not.
Checking Local Connectivity
If the daemon process appears to be listening, but not accepting connections from remote hosts, then test a local connection on the destination host to itself. Typically first to PBX, then to the specific port for the destination daemon. E.g.
telnet 127.0.0.1 1556
telnet 127.0.0.1 13782
The telnet to bpcd on the loopback interface should generate a debug log that looks like this:
16:49:35.352 [3336.4360] <2> bpcd main: offset to GMT 21600
16:49:35.352 [3336.4360] <2> bpcd main: Got socket for input 376
16:49:35.352 [3336.4360] <2> logconnections: BPCD ACCEPT FROM 127.0.0.1.3845 TO 127.0.0.1.13782
16:49:35.352 [3336.4360] <2> bpcd main: setup_sockopts complete
16:49:35.414 [3336.4360] <2> bpcd peer_hostname: Connection from host localhost (127.0.0.1) port 3845
16:49:35.414 [3336.4360] <2> bpcd valid_server: comparing hal.veritas.com and localhost
16:49:35.414 [3336.4360] <4> bpcd valid_server: localhost is not a master server
16:49:35.414 [3336.4360] <16> bpcd valid_server: localhost is not a media server either
16:49:39.189 [3336.4360] <16> bpcd main: read failed: The operation completed successfully.
The error above is expected because telnet did not provide the expected input per the bpcd protocol. Notice also that 'localhost' is correctly determined not to be a valid NetBackup server.
For Linux clients, if they are missing a library file required by bpcd or vnetd, you would get this type of error message:
telnet localhost 13782
Connected to clientname.domainname.com
Escape character is '^]'.
bpcd: error while loading shared libraries: libstdc++-libc6.2-2.so.3:
cannot open shared object file: No such file or directory
Contact the OS vendor to obtain the required library file.
If the telnet to 'localhost' works, try to telnet to the same TCP port from the source host.
telnet <destination_host_IP> 13782
That should generate a log entry as seen below showing the source host being recognized as a valid NB server.
16:52:46.077 [1160.5436] <2> bpcd main: offset to GMT 21600
16:52:46.077 [1160.5436] <2> bpcd main: Got socket for input 400
... The client sees the incoming connection from source IP address 10.82.105.254 ...
16:52:46.077 [1160.5436] <2> logconnections: BPCD ACCEPT FROM 10.82.105.254.44554 TO 10.82.110.6.13782
16:52:46.077 [1160.5436] <2> bpcd main: setup_sockopts complete
... Performs a successful reverse lookup of the incoming IP address and gets the hostname ...
16:52:46.092 [1160.5436] <2> bpcd peer_hostname: Connection from host hal.veritas.com (10.82.105.254) port 44554
... Then compares the resolved hostname to the server list ...
16:52:46.092 [1160.5436] <2> bpcd valid_server: comparing hal.veritas.com and hal.veritas.com
... The hostname compare succeeds ...
16:52:46.092 [1160.5436] <4> bpcd valid_server: hostname comparison succeeded
16:52:49.476 [1160.5436] <16> bpcd main: read failed: The operation completed successfully.
If any of these telnet tests fails to generate a log entry, then something outside of NetBackup is preventing access to the daemon, most likely firewall software of some sort; see the Firewall topics above for some possibilities.
Checking for Name Resolution Delays
When testing connections to bpcd and other daemons, be sure to check the time difference between the "logconnections: <process> ACCEPT FROM <source> TO <destination>" message and the next few log messages that normally would show any IP address or hostname resolution. If there are any delays of more than a second or two, that may cause the connecting process on the source host to time out. The responsiveness of the name services should be improved or appropriate host file entries created to prevent excessive latency.
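The time difference can be measured mechanically from a debug log. The Python sketch below assumes the legacy log line format shown in the bpcd excerpts above (a leading HH:MM:SS.mmm timestamp) and ignores logs that span midnight. The name `accept_delays` and the default look-ahead of three lines are illustrative choices.

```python
import re
from datetime import datetime

# Legacy debug log lines start with a time such as 16:52:46.077.
TS_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3})\s")


def parse_time(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    match = TS_RE.match(line)
    return datetime.strptime(match.group(1), "%H:%M:%S.%f") if match else None


def accept_delays(lines, lookahead=3):
    """For each 'ACCEPT FROM' line, return (line, gap_seconds), where
    gap_seconds is the largest delay to any of the next `lookahead`
    timestamped lines."""
    stamped = [(parse_time(line), line) for line in lines]
    stamped = [(ts, line) for ts, line in stamped if ts is not None]
    results = []
    for i, (t0, line) in enumerate(stamped):
        if "ACCEPT FROM" not in line:
            continue
        following = stamped[i + 1:i + 1 + lookahead]
        gap = max(((ts - t0).total_seconds() for ts, _ in following),
                  default=0.0)
        results.append((line, gap))
    return results
```

A gap of more than a second or two immediately after the ACCEPT line would point at slow reverse lookups, as described above.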
Checking for Other Unexpected Connection Delays
By default, NetBackup versions 7.0.1 (UNIX) and 7.1 (Windows) or newer will try to connect to legacy daemons using first the PBX port, then the vnetd port, and then finally the daemon port. If PBX is not reachable there will be a 10 second delay on some operating systems before vnetd is attempted. If vnetd is not reachable there will be a similar 10 second delay before the daemon port is attempted. These delays can cause other timeouts to occur. See TECH162303 for additional details.
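The fallback order can be imitated from the source host to see where time is being lost. The Python sketch below is an illustration of the sequence described above, not NetBackup's actual connection code. The port numbers are the defaults named earlier in this article, the 10 second timeout mirrors the delay mentioned above, and `first_reachable` is a hypothetical name.

```python
import socket
import time

# Default order: PBX first, then vnetd, then the daemon port (bpcd here).
FALLBACK_PORTS = (("pbx", 1556), ("vnetd", 13724), ("bpcd", 13782))


def first_reachable(host, ports=FALLBACK_PORTS, timeout=10.0,
                    connect=socket.create_connection):
    """Try each port in order and stop at the first that connects.

    Returns (name_of_first_reachable_port_or_None, attempts), where each
    attempt is (name, port, elapsed_seconds, outcome).
    """
    attempts = []
    for name, port in ports:
        started = time.monotonic()
        try:
            conn = connect((host, port), timeout)
        except OSError as exc:
            # Record the failure and the time lost before falling back.
            attempts.append((name, port, time.monotonic() - started, str(exc)))
            continue
        conn.close()
        attempts.append((name, port, time.monotonic() - started, "connected"))
        return name, attempts
    return None, attempts
```

The elapsed times in the attempts list show how much delay each unreachable port adds before the next one is tried.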
Checking for Unexpected Connect Options
If NetBackup is observed to be connecting directly to the daemon ports on a remote host, without first trying to connect via the PBX port (post-7.0.1/7.1) or vnetd port (post-5.1), then check to see if there are unexpected Connect Options configured. There are three places this setting could be configured.
First, check for Client Connect Options on the master server.
To check for client connection options from the NetBackup GUI:
1. Launch the NetBackup Administration Console, connecting to the master server
2. Expand Host Properties in the left pane
3. Select Master Server in the left pane
4. Click the name of the master server in the right pane
5. Select the Client Attributes section
6. See if the destination hostname is present. If it is, continue.
7. Click the name of the destination host.
8. In the Connect Options tab, check the 'BPCD connect back' and 'Daemon connection port' settings. If set to 'Random port' or 'Daemon port only', adjust them as appropriate.
To check the client connect options from the command line, use the following command on the master server. A zero (0) in the second field or a two (2) in the third field indicates that PBX and vnetd are not being used.
<install_path>/netbackup/bin/admincmd/bpclient -L -client myclient
Connect options: 2 2 3
Note: You DO NOT have to stop and restart NetBackup when making changes to the client attributes.
Second, check for Server Connect Options.
If someone has configured a Server Connect Option for the destination host, it will override the Client Connect Options. This will need to be checked, and potentially corrected, on each host that connects to the destination host. Again, a value of zero (0) in the second field or a two (2) in the third field will cause NetBackup to skip using PBX and vnetd.
CONNECT_OPTIONS = dest_host 0 1 0
Lastly, check for Default Connect Options.
If neither server nor client connect options are configured, then check if default connect options are configured. This will need to be checked, and potentially corrected, on each host that connects to the destination host. Again, a value of zero (0) in the second field or a two (2) in the third field will cause NetBackup to skip using PBX and vnetd.
DEFAULT_CONNECT_OPTIONS = 0 1 0
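The rule stated for all three settings (a zero in the second field or a two in the third field) can be applied mechanically to each line. The Python sketch below assumes the three numeric fields are always the last three whitespace-separated tokens, as in the examples shown in this section; `bypasses_pbx_vnetd` is a hypothetical helper name.

```python
def bypasses_pbx_vnetd(entry):
    """Return True when a connect options line skips PBX and vnetd.

    Accepts lines such as:
      Connect options: 2 2 3
      CONNECT_OPTIONS = dest_host 0 1 0
      DEFAULT_CONNECT_OPTIONS = 0 1 0
    """
    # The three numeric fields are assumed to be the last three tokens.
    numbers = entry.split()[-3:]
    if len(numbers) != 3 or not all(n.isdigit() for n in numbers):
        raise ValueError("no three numeric fields in: %r" % entry)
    second, third = int(numbers[1]), int(numbers[2])
    # Rule from the text: 0 in the second field or 2 in the third field.
    return second == 0 or third == 2
```

None of the three example lines in this section trips the rule, whereas a line such as `CONNECT_OPTIONS = dest_host 0 0 2` would.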