
OpsCenter not connected to master server

Created: 28 Jan 2012 • Updated: 08 Feb 2012 | 8 comments
This issue has been solved. See solution.

Hi all,

I'm new to OpsCenter and I'm having difficulties making it communicate with my master server. Current setup:

- master is 6.5.3.1, running on Solaris 9, 32-bit; no media servers in my config

- I've installed OpsCenter 7.0 on a Windows 2003 machine, 32-bit

The installation went fine. I had some trouble starting Tomcat, but I managed to work around it, and now I have a fully functional server and a nice, clean GUI up and running. What I don't have is a working connection between OpsCenter and the master. Here's what I did, in chronological order:

  1. installed OpsCenter, ViewBuilder, ICS and Agent on the Windows host
  2. checked host-to-host connectivity (bpclntcmd -ip and -hn are OK, ping and telnet on port 1556 work fine)
  3. checked if the master has the windows host in the "Servers" list -> OK
  4. checked if /opt/VRTSpbx/bin/pbx-exchange is running on the master -> OK
  5. checked if nbsl daemon is running on the master -> OK
  6. "nbregopsc -add" isn't recognized on the master, I assume it only works in NBU 7.0 and above
  7. for the agent configuration, I created one from the GUI (since the master is 6.5), but since OC is not connected, nothing gets collected
  8. I've restarted the OC server multiple times
  9. checked nbproxy, which seems OK too:

bpps -x | grep nbproxy
    root  4387 26800  0   Jan 20 ?        0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib nbpem_cleanup
    root  4388  4387  0   Jan 20 ?        0:01 /usr/openv/netbackup/bin/nbproxy dblib nbpem_cleanup
    root  3920  3918  0   Jan 20 ?        0:08 /usr/openv/netbackup/bin/nbproxy dblib nbpem_email
    root 26806 26800  0   Jan 20 ?        0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib nbpem
    root 26792 26791  0   Jan 20 ?        0:11 /usr/openv/netbackup/bin/nbproxy dblib nbjm
    root  3918 26800  0   Jan 20 ?        0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib nbpem_email
    root 14478 14477  0 18:27:42 ?        0:01 /usr/openv/netbackup/bin/nbproxy libminlic -mgrIORFile -LicenseManager-3.ior.mg
    root 26791 26787  0   Jan 20 ?        0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib nbjm
    root 14477 26862  0 18:27:42 ?        0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" libminlic -mgrIORFile -LicenseManager-
    root 26807 26806  0   Jan 20 ?        4:36 /usr/openv/netbackup/bin/nbproxy dblib nbpem
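
The telnet check on port 1556 from step 2 can also be scripted. Below is a minimal Python sketch, assuming Python is available on the host you test from; the function name `port_is_open` and the host name in the usage comment are placeholders, not part of NetBackup or OpsCenter. It only confirms that a TCP connection can be opened, which is the same thing the telnet test shows, and says nothing about whether PBX or NBSL accept the session afterwards.

```python
import socket


def port_is_open(host: str, port: int = 1556, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Usage (hypothetical host name, replace with your master server):
#     port_is_open("master.example.com", 1556)
```

A True result in both directions (OpsCenter host to master, and master to OpsCenter host) matches the "telnet works fine" observation above.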

I'm guessing I'm doing something wrong here, and I suspect that the agent actually needs to be installed on the master server, NOT on the OC host. If so, please confirm; this particularly important point is not clearly stated in the OC admin guide. (I've pretty much given up hope of seeing proper English and a decent logical flow of ideas in Symantec's documentation after stumbling upon this: "Either delete the user backup schedule that is a copy from a full or incremental schedule that has the calendar base schedule and recreate the user backup schedule or modify the existing user backup schedule by changing the backup type to full backup and then de-select the calendar base and change it back to user backup type schedule." If someone can translate that, I'd be really grateful.)

If my master is running on Solaris, I suppose I'd need the Solaris agent? Is there any patch I need to apply to my version of OC (7.0)? Are there any authentication issues between the two servers I should be aware of? Below is the security.conf file from the Windows OC host:

#Last Updated on
#Fri Aug 14 15:58:26 IST 2009
vxss.portnumber=2821
vxss.pbxportnumber=1556
vxss.hostname=localhost
vxss.password=EQWWPZUwh5o=
vxss.username=admin
vxss.domainName=broker
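
For anyone who wants to compare these settings across hosts, here is a small Python sketch that reads a file in this `key=value` layout. It assumes only what is visible above (one setting per line, `#` for comments); `parse_properties` is a made-up helper name, and the sample leaves out the password line on purpose.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values that contain '=' stay intact.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Sample taken from the security.conf shown above (password line omitted).
SAMPLE = """\
#Last Updated on
#Fri Aug 14 15:58:26 IST 2009
vxss.portnumber=2821
vxss.pbxportnumber=1556
vxss.hostname=localhost
vxss.username=admin
vxss.domainName=broker
"""

conf = parse_properties(SAMPLE)
print(conf["vxss.pbxportnumber"])  # prints 1556
```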

I've already run the following command on the OC host, with the OC host itself acting as the broker, to no avail:

./vssat setuptrust --broker <RootBrokerMachine>:<port> --securitylevel high

Thank you for whatever useful advice you can give me.


Alexander Tueshaus's picture

Hello Chronos,

I've read your chronological list. Sorry to ask, but maybe you just forgot to mention this point:
Did you add the master server in OpsCenter? (OpsCenter > Settings > Configuration > NetBackup > Add)

Regards,

Alex

Chronos's picture

Obviously I forgot to mention that, sorry. The master is still shown as "Not connected", no matter how many times I delete it and re-add it from the console.

tom_sprouse's picture

1) The Windows Host (OpsCenter) can be the agent host for the Solaris Master

2) Ensure that OpsCenter / Agent / Java ViewBuilder are all running the same version. (Assume 7.0 or 7.0.1)

3) Ensure that the OpsCenter host is listed in the bp.conf file on the master as a SERVER = opscenter_hostname 

      It should be listed as SERVER = and not MEDIA_SERVER = 

4) Configure the agent in OpsCenter first

  • Log in to OpsCenter
  • Click on Settings / Configuration / Agent
  • Click on Create Agent
  • Enter the Agent Host name (OpsCenter Hostname) 
  •      - Agent OS = Windows Family
  •      - PBX Port 1556
  •      - OpsCenter Network Address : select the drop down and specify the interface that can communicate with the master
  • Click SAVE

 

5) Now configure the Master in OpsCenter

  • Click on Settings / Configuration / NetBackup
  • The master should already be added according to the information provided.
  • Check the Master Server box, and click Disable Data Collection
  • ReCheck the Master Server box, and click the Edit Button
  • In the bottom section of the screen, under Advanced Data Collection Properties set the following
  •     NetBackup Version --- 6.5.x
  •     Agent --- set to OpsCenter Agent Created Above
  •     Install Directory - path to locally installed NBU installation 
  •                      (If the NetBackup Admin Console is not installed on the OpsCenter host, please do so at this time)
  •     Volume Manager Directory --- path to locally installed NBU installation
  •     Username / Password --- set to root account on Solaris Master 
  • Click SAVE
  • Check the Master Server box, and click Enable Data Collection

 

The master should now connect and show online or partially connected.

Please let me know your results. If you still cannot connect, check the network route between both hosts, and verify port connectivity (1556) and name resolution for both hosts.

If this resolves your issue, please mark this as a solution.

-------------------------------------------------------------------------------------------------------------------------------------------------------

Secondary Issue (Translation)

 

"Either delete the user backup schedule that is a copy from a full or incremental schedule that has the calendar base schedule and recreate the user backup schedule or modify the existing user backup schedule by changing the backup type to full backup and then de-select the calendar base and change it back to user backup type schedule."

What it should reflect: 

Delete the user backup schedule and recreate it manually. 

 

Although it may be possible to modify and correct the copied User Backup Schedule, it is not recommended.

Technote:  TECH154514 has been updated and should be published shortly --- Thank you!

 

Note: If this post provides you with a solution don't forget to mark the discussion as solved.

Chronos's picture

Hi Tom, 

Thanks for your efforts, but I think I had done all this the first time... just to follow your steps, I've deleted both the master and the agent to re-create them in the order you mentioned.

1. agent, OpsCenter and JavaBuilder are all 7.0.0.0

2. bp.conf is fine:

 

root@tbackupngn:/ # cat /usr/openv/netbackup/bp.conf
SERVER = tbackupngn
SERVER = dwinipsngn
SERVER = 10.123.3.16
SERVER = tadvupmrngn
#SERVER = tbackupngn-bck
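
As a cross-check of point 3 in Tom's list (the OpsCenter host must appear as SERVER, not MEDIA_SERVER), the Python sketch below lists the active SERVER entries from bp.conf text. It is an illustration that assumes the simple `KEY = value` layout shown above; `server_entries` is a made-up helper, not a NetBackup tool.

```python
def server_entries(bp_conf_text: str) -> list:
    """Return the values of active (uncommented) 'SERVER = ...' lines, in order."""
    servers = []
    for raw in bp_conf_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # commented-out entries are ignored
        key, sep, value = line.partition("=")
        # Exact key match, so MEDIA_SERVER lines are not counted as SERVER.
        if sep and key.strip() == "SERVER":
            servers.append(value.strip())
    return servers


# The bp.conf content shown above.
BP_CONF = """\
SERVER = tbackupngn
SERVER = dwinipsngn
SERVER = 10.123.3.16
SERVER = tadvupmrngn
#SERVER = tbackupngn-bck
"""

active = server_entries(BP_CONF)
print("dwinipsngn" in active)      # prints True
print("tbackupngn-bck" in active)  # prints False (the entry is commented out)
```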
 

 

3. agent is configured as in the attached screenshot (dwinipsngn is the windows host where I've installed OC):

 

4. below you have the master server's configuration:

5. and yet it's still showing not connected, after multiple restarts of the server:

6. The weird thing is that it seems to connect for a very short period of time (you can see "Last contact") and then it seems to disconnect...

7. connectivity on port 1556 is fine between both hosts:

 

root@tbackupngn:/ # telnet dwinipsngn 1556
Trying 10.123.3.16...
Connected to dwinipsngn.
Escape character is '^]'.
^]
telnet> q
Connection to dwinipsngn closed.
 
root@tbackupngn:/ # ping dwinipsngn
dwinipsngn is alive
 
root@tbackupngn:/ # bpclntcmd -hn dwinipsngn
host dwinipsngn: dwinipsngn at 10.123.3.16 (0xa7b0310)
aliases:
root@tbackupngn:/ # bpclntcmd -ip 10.123.3.16
checkhaddr: host   : dwinipsngn: dwinipsngn at 10.123.3.16 (0xa7b0310)
checkhaddr: aliases:
 
From the client-side:
 
C:\Program Files\Symantec\OpsCenter\server\bin>ping tbackupngn
 
Pinging tbackupngn-bck.ngn-ttrg.com [10.123.3.18] with 32 bytes of data:
 
Reply from 10.123.3.18: bytes=32 time<1ms TTL=255
Reply from 10.123.3.18: bytes=32 time<1ms TTL=255
Reply from 10.123.3.18: bytes=32 time<1ms TTL=255
Reply from 10.123.3.18: bytes=32 time<1ms TTL=255
 
Ping statistics for 10.123.3.18:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms
 
 
C:\Program Files\Symantec\OpsCenter\server\bin>cd "C:\Program Files\VERITAS\NetBackup\bin"
 
C:\Program Files\VERITAS\NetBackup\bin>bpclntcmd.exe -hn tbackupngn
host tbackupngn: tbackupngn-bck.ngn-ttrg.com at 10.123.3.18 (0x12037b0a)
aliases:     tbackupngn.ngn-ttrg.com
 
 
C:\Program Files\VERITAS\NetBackup\bin>bpclntcmd.exe -ip 10.123.3.18
checkhaddr: host   : tbackupngn-bck: tbackupngn-bck at 10.123.3.18 (0x12037b0a)
checkhaddr: aliases:
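
In the output above, the forward lookup of tbackupngn from the Windows host returns the canonical name tbackupngn-bck.ngn-ttrg.com, and the reverse lookup returns tbackupngn-bck. The following Python sketch performs the same forward and reverse comparison with the standard resolver. It is only an illustration: `short_name` and `resolution_report` are made-up helpers, the "consistent" rule (all short names must match the queried one) is an assumption, and the resolver may not follow the same lookup order as bpclntcmd. Whether such a difference matters for OpsCenter is not established in this thread.

```python
import socket


def short_name(fqdn: str) -> str:
    """Return the lower-cased first label of a host name."""
    return fqdn.split(".")[0].lower()


def resolution_report(name, forward=socket.gethostbyname_ex,
                      reverse=socket.gethostbyaddr):
    """Compare forward and reverse lookups for one host name.

    forward and reverse are injectable so the logic can be tried offline.
    """
    canonical, aliases, addresses = forward(name)
    reverse_names = {}
    for addr in addresses:
        try:
            reverse_names[addr] = reverse(addr)[0]
        except OSError:
            reverse_names[addr] = None  # no reverse record found
    expected = short_name(name)
    seen = [canonical] + list(reverse_names.values())
    consistent = all(n is not None and short_name(n) == expected for n in seen)
    return {
        "queried": name,
        "canonical": canonical,
        "aliases": list(aliases),
        "addresses": list(addresses),
        "reverse": reverse_names,
        "consistent": consistent,
    }


# Usage against the live resolver:
#     resolution_report("tbackupngn")
```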
Chronos's picture

I can see the images have been truncated; you can just copy the image URL to see the whole thing.

Chronos's picture

I telnet-ed from my windows host to the solaris master on 1556 and it's OK:

root@tbackupngn:/ # netstat -an | grep 10.123.3.16
10.123.3.18.1556     10.123.3.16.1919     65535      0 49640      0 ESTABLISHED
 
I have no firewalls in between the hosts, they connect directly:
 
root@tbackupngn:/ # traceroute dwinipsngn
traceroute: Warning: Multiple interfaces found; using 10.123.3.18 @ ce0
traceroute to dwinipsngn (10.123.3.16), 30 hops max, 40 byte packets
1  dwinipsngn (10.123.3.16)  0.277 ms  0.156 ms  0.144 ms
 
 
C:\Documents and Settings\ctamon>tracert tbackupngn
 
Tracing route to tbackupngn-bck.ngn-ttrg.com [10.123.3.18]
over a maximum of 30 hops:
 
  1    <1 ms    <1 ms    <1 ms  tbackupngn-bck [10.123.3.18]
 
Trace complete.
Amaan's picture

I am not sure this will help, since it is much the same as what you already did, but you can try it as an option.

Delete your master server from the OpsCenter web console.

Then, on the master server, run this command from the bin\admincmd directory:

nbregopsc -add <OpsCenter server name>

You can also try disabling and re-enabling data collection.

Chronos's picture

I really am both pissed off and flabbergasted... who would've thought that the culprit for this whole mess is actually ... Symantec? After two agonizing weeks of searching and digging through documentation, knowing that I had done everything by the book, it turns out that the Agent executable file was blocked by Symantec's own AV!

This is what I saw on my Windows host, this time after doing an install in the production environment and actually getting about 10 minutes of connectivity (the same setup there too: Windows agent host + Solaris master):

 

 

SYMANTEC TAMPER PROTECTION ALERT

 

Target:  C:\Program Files\Symantec\OpsCenter\Agent\bin\OpsCenterAgentService.exe

Event Info:  Suspend Thread

Action Taken:  Blocked

Actor Process:  C:\WINDOWS\System32\svchost.exe (PID 1572)

Time:  07 February 2012  21:35:31

 

You might wonder if the AV itself is up to date... it is. You guys might wanna look into this, it's completely and utterly retarded to spend so much time on a simple task that would've taken just half an hour to complete...

SOLUTION