DOCUMENTATION: How to troubleshoot and correct problems with media servers when the status changes in the Media Servers area of the NetBackup Administration GUI.

Article:TECH69625  |  Created: 2009-01-15  |  Updated: 2009-01-15  |  Article URL http://www.symantec.com/docs/TECH69625
Article Type
Technical Solution


Environment

Issue



DOCUMENTATION: How to troubleshoot and correct problems with media servers when the status changes in the Media Servers area of the NetBackup Administration GUI.

Solution



Manual:  Veritas NetBackup (tm) 6.5 Troubleshooting Guide for UNIX, Linux and Windows

Modification Type:  Addition

Modification:  Administrators may note that NetBackup media servers change state from "Active for Disk and Tape" to "Active for Disk" or "Offline".

[Screenshot not available: the Media Servers area of the GUI with media servers showing "Offline".]

This may occur if the master server has multiple interfaces and the media server cannot resolve the name associated with one of them.  The steps to troubleshoot and correct this issue are as follows:


Example environment:
Clustered master server with the following /etc/hosts entries:
x.x.x.a    mastnode1    mastnode1.local
x.x.x.b    mastnode2    mastnode2.local
x.x.x.c    mastclust

The media server's hosts file does not have the '.local' name:
x.x.x.a    mastnode1  
x.x.x.b    mastnode2  
x.x.x.c    mastclust
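
The comparison between the two hosts files can be sketched as a small check run on the media server. This is a minimal sketch, not part of NetBackup: the function name missing_names is hypothetical, and it only inspects a hosts file, so names that resolve through DNS or another name service are still reported as missing.

```shell
# Print each name that has no entry in the given hosts file.
# Usage: missing_names HOSTS_FILE NAME...
# (missing_names is a hypothetical helper, not a NetBackup command.)
missing_names() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then compare the name with every hostname/alias field.
        if ! awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$hosts_file"; then
            printf '%s\n' "$name"
        fi
    done
}
```

For the example environment, running "missing_names /etc/hosts mastnode1 mastnode1.local mastnode2 mastnode2.local mastclust" on the media server would be expected to list the two '.local' names.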



Complete the following on the master server:

1. Increase debug and diagnostic levels of nbemm, corba and the NetBackup libraries to 6 by running the following commands:
/usr/openv/netbackup/bin/vxlogcfg -a -p 51216 -o 156 -s DiagnosticLevel=6 -s DebugLevel=6
/usr/openv/netbackup/bin/vxlogcfg -a -p 51216 -o 111 -s DiagnosticLevel=6 -s DebugLevel=6
/usr/openv/netbackup/bin/vxlogcfg -a -p 51216 -o 137 -s DiagnosticLevel=6 -s DebugLevel=6

2. Create the following directory:
/usr/openv/netbackup/logs/admin

3. In the /usr/openv/netbackup/bp.conf, add the following:
   VERBOSE = 5

4. Stop and start nbemm (this may be done while backups are running):
  • /usr/openv/netbackup/bin/nbemm -terminate
  • Make sure it is down by running bpps -a | grep nbemm.  If needed, repeat the previous nbemm -terminate command until the process is down.
  • To restart the process run:   /usr/openv/netbackup/bin/nbemm
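
Steps 1 and 2 above can be collected into one function, sketched here under a few assumptions. The names prepare_master, run_or_print, DRY_RUN and NB_ROOT are hypothetical and exist only in this sketch; the vxlogcfg arguments are copied from step 1. By default the function only prints the commands. The bp.conf edit (step 3) and the nbemm restart (step 4) are deliberately left as manual steps.

```shell
# Print a command (default) or execute it when DRY_RUN=0.
run_or_print() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Steps 1 and 2 on the master server.
# NB_ROOT is a hypothetical override; it defaults to /usr/openv.
prepare_master() {
    nb_root=${NB_ROOT:-/usr/openv}
    # Originator IDs taken from step 1: 156, 111 and 137.
    for oid in 156 111 137; do
        run_or_print "$nb_root/netbackup/bin/vxlogcfg" -a -p 51216 -o "$oid" \
            -s DiagnosticLevel=6 -s DebugLevel=6
    done
    # Step 2: create the admin log directory.
    run_or_print mkdir -p "$nb_root/netbackup/logs/admin"
}
```

Calling prepare_master with no settings prints the four commands for review; setting DRY_RUN=0 first runs them.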


Complete the following on the media server:
5. Increase debug and diagnostic levels of nbemm and corba to 6 by running the following commands:
/usr/openv/netbackup/bin/vxlogcfg -a -p 51216 -o 156 -s DiagnosticLevel=6 -s DebugLevel=6
/usr/openv/netbackup/bin/vxlogcfg -a -p 51216 -o 111 -s DiagnosticLevel=6 -s DebugLevel=6

6. Create the following directories:
/usr/openv/volmgr/debug
/usr/openv/volmgr/debug/daemon

7. Create the /usr/openv/volmgr/vm.conf file if it does not already exist, and add a line containing VERBOSE to the file.


8.  Reproduce the issue.
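
Steps 6 and 7 on the media server can be sketched as follows. The function name prepare_media_logging and the VOLMGR_DIR override are hypothetical and are used only so the sketch can be tried against a scratch directory. It appends VERBOSE only when no line consisting of exactly VERBOSE is already present.

```shell
# Create the debug directories and make sure vm.conf contains VERBOSE.
# VOLMGR_DIR is a hypothetical override; it defaults to /usr/openv/volmgr.
prepare_media_logging() {
    volmgr=${VOLMGR_DIR:-/usr/openv/volmgr}
    # Step 6: mkdir -p creates both debug and debug/daemon.
    mkdir -p "$volmgr/debug/daemon" || return 1
    # Step 7: append VERBOSE unless a line with exactly that text exists.
    if ! grep -qx 'VERBOSE' "$volmgr/vm.conf" 2>/dev/null; then
        echo 'VERBOSE' >> "$volmgr/vm.conf"
    fi
}
```

Running the function twice leaves a single VERBOSE line in vm.conf, so it is safe to repeat.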


Review of the resulting /usr/openv/volmgr/debug/daemon log should show the incoming connection from the master server on the primary interface, in this example "mastclust":
13:29:10.179 [14608.19392] <2> emmlib_initialize: (-) Connection attempt #<0>
13:29:10.179 [14608.19392] <4> ValidateConnectionID: (-) Created new Connection ID 0
13:29:10.179 [14608.19392] <2> emmlib_initializeEx: (-) Connecting to the Server <mastclust> Port <1556>,

The connect back was made to the sending IP address, which is the IP associated with "mastnode2":
13:29:10.194 [14608.19392] <2> TAO: TAO (14608|19392) - PBXIOP_Connector::make_connection, to <x.x.x.c:1556:EMM>
13:29:10.210 [14608.19392] <2> TAO: TAO (14608|19392) PBXIOP connection to peer <x.x.x.c:1556> on 612
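
To find these connect-back lines in a larger log, a filter along the following lines may help. This is a sketch that assumes the log lines have the same shape as the excerpt above; the function name connect_back_targets is hypothetical, and the pattern does not handle IPv6 addresses.

```shell
# Print the unique address:port targets of PBXIOP connect-back attempts.
# Usage: connect_back_targets LOG_FILE
connect_back_targets() {
    sed -n 's/.*PBXIOP_Connector::make_connection, to <\([^:>]*:[0-9]*\).*/\1/p' "$1" |
        sort -u
}
```

Each address printed can then be compared with the hosts file entries to see which master server name it belongs to.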

When the master server receives the connect back, it returns the "mastnode2.local" name, which is unknown to the media server:
13:29:10.210 [14608.19392] <2> TAO: TAO (%P|%t) - Transport::handle_input(): bytes read from socket - HEXDUMP 528 bytes
47 49 4f 50 01 00 00 01 00 00 02 04 00 00 00 00 GIOP............
00 00 00 01 00 00 00 03 00 00 00 1b 49 44 4c 3a ............IDL:
56 65 72 69 74 61 73 2f 45 4d 4d 2f 53 65 72 76 Veritas/EMM/Serv
65 72 3a 31 2e 30 00 20 00 00 00 01 4f 43 49 01 er:1.0. ....OCI.
00 00 01 cc 00 01 02 00 00 00 00 10 73 63 74 63 ...I........mast -----> packet information with name
6c 62 6b 30 32 2e 6c 6f 63 61 6c 00 06 14 3a e8 node2.local...:¿ -----> packet information with name continued

This name is not resolvable on the media server as an alias of the master server, so the media server does not respond to the master server's query.  As a result, the media server changes state and backups do not run to the storage units associated with the media server.


In the above example, adding the mastnode1.local and mastnode2.local names as aliases in the media server's /etc/hosts file resolved the issue and allowed the media servers to stay "Active for Disk and Tape".
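
The hosts file change can be previewed before it is made. The sketch below prints a copy of a hosts file with an alias appended to the matching entry and leaves the file itself untouched. The function name with_alias is hypothetical, and entries that carry a trailing comment are not handled.

```shell
# Print HOSTS_FILE with ALIAS appended to the entry whose first name is HOST,
# unless that entry already lists ALIAS. The file itself is not modified.
# Usage: with_alias HOSTS_FILE HOST ALIAS
with_alias() {
    awk -v h="$2" -v a="$3" '
        $2 == h && $0 !~ /^[ \t]*#/ {
            present = 0
            for (i = 2; i <= NF; i++) if ($i == a) present = 1
            # Trim trailing blanks, then append the alias.
            if (!present) { sub(/[ \t]+$/, ""); $0 = $0 "    " a }
        }
        { print }
    ' "$1"
}
```

For example, "with_alias /etc/hosts mastnode2 mastnode2.local" prints the file with the alias added to the mastnode2 entry; the output can be reviewed and then applied by hand.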






Legacy ID



323092

