How to add a master node in clustered NBU 7.5

Created: 23 Jan 2013 • Updated: 24 Jan 2013 | 44 comments
This issue has been solved. See solution.

I have a new two-node master server on which I installed NBU 7.5. We use VCS 5.1 for clustering, and the OS is RHEL 6. Four other media servers run the same OS.

cluster name is: nbu

node1: master01

node2: master02

[root@master02]# cat bp.conf
SERVER = nbu.domain.com
SERVER = media01.domain.com
SERVER = media02.domain.com
SERVER = media03.domain.com
SERVER = media04.domain.com
SERVER = master01.domain.com
SERVER = master02.domain.com
CLIENT_NAME = master02.domain.com
CLUSTER_NAME = nbu.domain.com
CONNECT_OPTIONS = localhost 1 0 2
USE_VXSS = PROHIBITED
VXSS_SERVICE_TYPE = INTEGRITYANDCONFIDENTIALITY
EMMSERVER = nbu.domain.com
HOST_CACHE_TTL = 3600
VXDBMS_NB_DATA = /opt/VRTSnbu/db/data
KMS_DIR = /opt/VRTSnbu/kms
TELEMETRY_UPLOAD = NO

The problem is that I do not see node 2 of the master listed in the nbemmcmd output.

[root@master02]# ./nbemmcmd -listhosts
NBEMMCMD, Version: 7.5
The following hosts were found:
server           nbu.domain.com
cluster          nbu.domain.com
master           master01.domain.com
Command completed successfully.

The other thing is that I installed the media servers first. I thought adding them as media servers via nbemmcmd would be enough. Or do I need to reinstall them now that the new master server is ready? As of now I do not see any media servers in the nbemmcmd output on any server.

[root@master02 bin]# ./hastatus -summary

-- SYSTEM STATE
-- System               State                Frozen

A  master01           RUNNING              0
A  master02           RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  master01           Y          N               OFFLINE
B  ClusterService  master02           Y          N               ONLINE
B  nbu_group       master01           Y          N               ONLINE
B  nbu_group       master02           Y          N               OFFLINE

Moreover, how do I verify that this clustered environment is all set and ready to work? I was thinking of checking failover: initiate a backup, shut down the services on one node, and see whether the backup keeps running. Is this the only check?

Do you need more information about anything? This system hasn't gone into production yet; it is all being set up new, and we're going to migrate the policies after we are done.


mph999's picture

The nbemmcmd -listhosts should look something like this

 

server          rdgv21-22     <<<< virtual name / cluster name
app_cluster     app-cluster-test
ndmp            rdgv21-22
ndmp            rdgf270c-01
cluster         rdgv21-22          <<<<< virtual name / cluster name
media           qtpdmedia
master          rdgv240sol22    <<<<< node 2 of cluster
master          rdgv240sol21    <<<<< node 1 of cluster
 
As you say, the other node is not apparently configured.
 
What guide did you use, and did you follow the steps exactly?  For example, you have to enable rsh between the two nodes when installing (it can be disabled afterwards).
 
Martin

 

Regards,  Martin
 
Setting Logs in NetBackup:
http://www.symantec.com/docs/TECH75805
 
16ris10's picture

Yes, I did. I followed the exact procedure in the guide. The only thing missing from the NetBackup guide was the VCS part of the installation; everything else is covered step by step. I did feel the installation on the second node was rather quick.

So what's the suggestion now? How about adding node 2 as a master server with nbemmcmd and then updating it with -updatehost in some way that I am not fully aware of?

16ris10's picture

If node 2 of the clustered master nbu.domain.com is to be added via nbemmcmd:

cluster name: nbu.domain.com

node2: master02.domain.com

would the command be:

nbemmcmd -addhost -clustername nbu.domain.com -machinename master02.domain.com -operatingsystem linux

Or are more parameters needed?

And since the media servers were installed first, would updating bp.conf and running nbemmcmd like this be OK?

nbemmcmd -addhost -machinename media01.domain.com -machinetype media -masterserver nbu.domain.com -operatingsystem linux

16ris10's picture

ok, i added the media servers already. now please tell me about adding master server's node2.

16ris10's picture

ok, i added the master too in the following way:

[root@master01]# ./nbemmcmd -addhost -clustername nbu.domain.com -machinename master02.domain.com -machinetype master -netbackupversion 7.5 -operatingsystem linux
NBEMMCMD, Version: 7.5
Command completed successfully.

and the output came out to be:

[root@master01]# ./nbemmcmd -listhosts
NBEMMCMD, Version: 7.5
The following hosts were found:
server           nbu.domain.com
cluster          nbu.domain.com
master           master01.domain.com
media            media01.domain.com
media            media02.domain.com
media            media04.domain.com
media            media03.domain.com
master           master02.domain.com
Command completed successfully.

am i all good now?

 

mph999's picture

Just typed a long reply that didn't save.  Here is the slightly shorter version.

In a word, no I do not think so.

To add a node to a cluster you should use the script:

 /usr/openv/netbackup/bin/cluster/util/cluster_add_node

But this post from Marianne shows it is not that simple:

https://www-secure.symantec.com/connect/forums/add...

I checked with a very senior BL colleague, and they have the same concern.

We could be wrong, and if we are (or rather, if I am), then sorry, it means you lose a little time.  If I am right, you risk unseen issues on a live server.  I had a case a while ago with a misconfigured cluster: it looked perfect until it was upgraded, then the upgrade failed and caused all sorts of issues.  The cause was traced to the fact that it was built incorrectly.

The cluster install should have worked. It didn't, which shows something is wrong, and until that something is found I cannot say yes (or no).

Martin

 

 

 
16ris10's picture

Sir, thanks for your time, I really appreciate it. How do we know there is a problem with the node not being in the cluster? This was a new install and I think the node is in the cluster; it's just the NetBackup configuration I'm having trouble with. I updated EMM and it came out fine. Do you know how to check whether the node is in the cluster? Is there a command for this? I'm very positive that the node is in the cluster and that there is no need to follow the procedure to add it back again.

mph999's picture

No problem, happy to help.

It depends on:

1.  If there really is a problem (as I said, I could be wrong on this)

2. If I am right and there is an issue, the problem could be anywhere, and so virtually impossible to find, which is why I can only recommend getting the thing reinstalled and working as per the install guide (which may require a call to support).

The last case I saw like this, as mentioned, the cluster looked right and seemed to run OK for ages, but when upgraded it broke and data was lost.

Edited to add ...

For the kind of possible issue I'm thinking of, you're not going to find it by running commands.  The example I mentioned was found by 'manually' looking through the NBDB db unload, but this was after seeing the symptoms of the failure.  Without a known problem, you are looking for something that may not be there; the trouble is, by the time you hit the problem, it could be too late.

I'm not overly concerned about adding a new node - that can be done, it is just a matter of confirming the exact procedure.  I'm more concerned about why it didn't work. Something somewhere is wrong, and it is this 'unknown' that could cause you major issues.

It is late now, I need to get some sleep so will have to go.  Please take my advice: bin this and reinstall, with support's help if necessary.  I see too many cases like this where something is wrong, and what was done with all good intentions turns out to be an incorrect workaround.

My way, you will end up with a correct, supported system.

Any other way, you might not.  I cannot recommend you take that risk with a production system.  If it was only a pure test system, then sure, do whatever you like, it doesn't matter if a test system breaks, but live production is another matter.

Martin

 

 
Yasuhisa Ishikawa's picture

There are several points to check to see whether the clustered master is configured correctly.

1. Nodes are added in the service group named nbu_group.

2. In bp.conf, CLUSTER_NAME and EMMSERVER are set to VIP hostname, and node names are listed as SERVER.

SERVER = nbu.domain.com
SERVER = master01.domain.com
SERVER = master02.domain.com
CLIENT_NAME = master01.domain.com
CLUSTER_NAME = nbu.domain.com
EMMSERVER = nbu.domain.com

3. Cluster name and node names are listed in output of nbemmcmd.

# nbemmcmd -listhosts
NBEMMCMD, Version: 7.5.0.3
The following hosts were found:
server           nbu.domain.com
cluster          nbu.domain.com
master           master01.domain.com
master           master02.domain.com
Command completed successfully
# nbemmcmd -getemmserver
NBEMMCMD, Version: 7.5.0.3
These hosts were found in this domain: nbu.domain.com, master01.domain.com, master02.domain.com

Checking with the host "nbu.domain.com"...
Checking with the host "master01.domain.com"...
Checking with the host "master02.domain.com"...

Server Type    Host Version        Host Name                     EMM Server     
MASTER         7.5                 nbu.domain.com                nbu.domain.com
MASTER         7.5                 master01.domain.com           nbu.domain.com
MEDIA          7.5                 master02.domain.com           nbu.domain.com
Command completed successfully.

4. nodes are listed in /usr/openv/netbackup/bin/cluster/NBU_RSP

#DO NOT DELETE OR EDIT THIS FILE!!!
NBU_GROUP=nbu_group
SHARED_DISK=/opt/VRTSnbu
NODES=master01.domain.com  master02.domain.com
VNAME=nbu.domain.com
VIRTUAL_IP=xxx.xxx.xxx.xxx
SUBNET=xxx.xxx.xxx.xxx
CLUTYPE=VCS
START_PROCS=NB_dbsrv nbevtmgr nbemm nbrb ltid vmd bpcompatd nbjm nbpem nbstserv nbrmms nbsl nbvault nbsvcmon bpdbm bprd bptm bpbrmds bpsched bpcd bpversion bpjobd nbproxy vltcore acsd tl8cd odld tldcd tl4d tlmd tshd rsmd tlhcd pbx_exchange nbkms nbaudit nbatd nbazd nbim

PRODUCT_CODE=NBU
DIR=netbackup mkdir
DIR=netbackup/db mv
DIR=var mkdir
DIR=var/global mv
DIR=volmgr/mkdir
DIR=volmgr/misc mkdir
DIR=volmgr/misc/robotic_db mv
DIR=kms mv
DIR=netbackup/vault mkdir
DIR=netbackup/vault/sessions mv

LINK=volmgr/misc/robotic_db
LINK=netbackup/db
LINK=netbackup/vault/sessions
LINK=var/global
PROBE_PROCS=nbevtmgr nbstserv vmd bprd bpdbm nbpem nbjm nbaudit nbsl nbrmms nbemm nbrb NB_dbsrv

### ADDED ###

5. "tpconfig -emm_dev_list" shows cluster name and node names

# tpconfig -emm_dev_list
     :
==============================================================================
NBU Cluster:                    nbu.domain.com
==============================================================================
Master Server:                  master01.domain.com
NetBackup Version:              7.5.0.3(750300)
Host OperatingSystem:           16
MachineState:                   ACTIVE
==============================================================================
Master Server:                  master02.domain.com
NetBackup Version:              7.5.0.3(750300)
Host OperatingSystem:           16
MachineState:                   OFFLINE
==============================================================================
EMM Server:                     nbu.domain.com
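
Point 3 of the checklist above can be checked mechanically. The following is only a minimal sketch: check_listhosts is a hypothetical helper (not part of NetBackup) that reads "nbemmcmd -listhosts" output on stdin and reports whether the cluster name and each expected master node appear. It assumes the two-column "type hostname" layout shown in this thread.

```sh
#!/bin/sh
# Hypothetical helper, not a NetBackup command.
# Usage: nbemmcmd -listhosts | check_listhosts CLUSTER NODE...
check_listhosts() {
    cluster=$1; shift
    out=$(cat)          # capture the listing once so it can be scanned repeatedly
    rc=0
    # The virtual name must be present with machine type "cluster".
    if ! printf '%s\n' "$out" |
        awk -v h="$cluster" '$1 == "cluster" && $2 == h { found = 1 } END { exit !found }'; then
        echo "MISSING cluster $cluster"
        rc=1
    fi
    # Every node must be present with machine type "master".
    for node in "$@"; do
        if ! printf '%s\n' "$out" |
            awk -v h="$node" '$1 == "master" && $2 == h { found = 1 } END { exit !found }'; then
            echo "MISSING master $node"
            rc=1
        fi
    done
    [ "$rc" -eq 0 ] && echo "OK"
    return "$rc"
}
```

For example: /usr/openv/netbackup/bin/admincmd/nbemmcmd -listhosts | check_listhosts nbu.domain.com master01.domain.com master02.domain.com. A passing check only confirms the listing; it says nothing about the rest of the EMM database.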

Authorized Symantec Consultant(ASC) Data Protection in Tokyo, Japan

16ris10's picture

OK, here are the answers to your 4-point checklist.

1. I need to check; I'm not sure how to check this, so I will get back to you in a few minutes.

2. Yes they are.

[root@master01 netbackup]# cat bp.conf
SERVER = nbu.domain.com
SERVER = media01.domain.com
SERVER = media02.domain.com
SERVER = media03.domain.com
SERVER = media04.domain.com
SERVER = master01.domain.com
SERVER = master02.domain.com
CLUSTER_NAME = nbu.domain.com
CLIENT_NAME = master01.domain.com
CONNECT_OPTIONS = localhost 1 0 2
USE_VXSS = PROHIBITED
VXSS_SERVICE_TYPE = INTEGRITYANDCONFIDENTIALITY
EMMSERVER = nbu.domain.com
HOST_CACHE_TTL = 3600
VXDBMS_NB_DATA = /opt/VRTSnbu/db/data
KMS_DIR = /opt/VRTSnbu/kms
LIST_FS_IMAGE_HEADERS = NO
TELEMETRY_UPLOAD = NO
[root@master02 netbackup]# cat bp.conf
SERVER = nbu.domain.com
SERVER = media01.domain.com
SERVER = media02.domain.com
SERVER = media03.domain.com
SERVER = media04.domain.com
SERVER = master01.domain.com
SERVER = master02.domain.com
CLUSTER_NAME = nbu.domain.com
CLIENT_NAME = master02.domain.com
CONNECT_OPTIONS = localhost 1 0 2
USE_VXSS = PROHIBITED
VXSS_SERVICE_TYPE = INTEGRITYANDCONFIDENTIALITY
EMMSERVER = nbu.domain.com
HOST_CACHE_TTL = 3600
VXDBMS_NB_DATA = /opt/VRTSnbu/db/data
KMS_DIR = /opt/VRTSnbu/kms
LIST_FS_IMAGE_HEADERS = NO
TELEMETRY_UPLOAD = NO

3. Yes they are:

[root@master01]# ./nbemmcmd -listhosts
NBEMMCMD, Version: 7.5
The following hosts were found:
server           nbu.domain.com
cluster          nbu.domain.com
master           master01.domain.com
media            media01.domain.com
media            media02.domain.com
media            media04.domain.com
media            media03.domain.com
master           master02.domain.com
Command completed successfully.

Check this emm server output:

 

[root@master01 admincmd]# ./nbemmcmd -getemmserver
NBEMMCMD, Version: 7.5
These hosts were found in this domain: media01.domain.com, media02.domain.com, media03.domain.com, media04.domain.com, master01.domain.com, master02.domain.com, nbu.domain.com

Checking with the host "media01.domain.com"...
Checking with the host "media02.domain.com"...
Checking with the host "media03.domain.com"...
Checking with the host "media04.domain.com"...
Checking with the host "master01.domain.com"...
Checking with the host "master02.domain.com"...
Checking with the host "nbu.domain.com"...

Server Type    Host Version        Host Name                     EMM Server
MEDIA          7.5                 media01.domain.com            nbu.domain.com
MEDIA          7.5                 media02.domain.com            nbu.domain.com
MEDIA          7.5                 media03.domain.com            nbu.domain.com
MEDIA          7.5                 media04.domain.com            nbu.domain.com
MASTER         7.5                 master01.domain.com           nbu.domain.com
MEDIA          7.5                 master02.domain.com           nbu.domain.com
MASTER         7.5                 nbu.domain.com                nbu.domain.com

Command completed successfully.
16ris10's picture

4. Yes, it's the same as what you've posted, but not with the FQDN.

[root@master01 cluster]# cat NBU_RSP
#DO NOT DELETE OR EDIT THIS FILE!!!
NBU_GROUP=nbu_group
SHARED_DISK=/opt/VRTSnbu
NODES=master01  master02
VNAME=nbu.domain.com
VIRTUAL_IP=x.x.x.x
SUBNET=x.x.x.x
CLUTYPE=VCS
START_PROCS=NB_dbsrv nbevtmgr nbemm nbrb ltid vmd bpcompatd nbjm nbpem nbstserv nbrmms nbsl nbvault nbsvcmon bpdbm bprd bptm bpbrmds bpsched bpcd bpversion bpjobd nbproxy vltcore acsd tl8cd odld tldcd tl4d tlmd tshd rsmd tlhcd pbx_exchange nbkms nbaudit nbatd nbazd nbim

PRODUCT_CODE=NBU
DIR=netbackup mkdir
DIR=netbackup/db mv
DIR=var mkdir
DIR=var/global mv
DIR=volmgr/mkdir
DIR=volmgr/misc mkdir
DIR=volmgr/misc/robotic_db mv
DIR=kms mv
DIR=netbackup/vault mkdir
DIR=netbackup/vault/sessions mv

LINK=volmgr/misc/robotic_db
LINK=netbackup/db
LINK=netbackup/vault/sessions
LINK=var/global
PROBE_PROCS=nbevtmgr nbstserv vmd bprd bpdbm nbpem nbjm nbaudit nbsl nbrmms nbemm nbrb NB_dbsrv
16ris10's picture

5. Yes, same output:

 

NBU Cluster:                    nbu.domain.com
==============================================================================
Master Server:                  master01.domain.com
NetBackup Version:              7.5.0(750000)
Host OperatingSystem:           16
MachineState:                   ACTIVE
==============================================================================
Master Server:                  master02.domain.com
NetBackup Version:              7.5.0(750000)
Host OperatingSystem:           16
MachineState:                   OFFLINE
==============================================================================
Media Server:                   media01.domain.com
NetBackup Version:              7.5.0(750000)
Host OperatingSystem:           16
MachineState:                   ACTIVE-DI
==============================================================================
Media Server:                   media02.domain.com
NetBackup Version:              7.5.0(750000)
Host OperatingSystem:           16
MachineState:                   ACTIVE-DI
==============================================================================
Media Server:                   media03.domain.com
NetBackup Version:              7.5.0(750000)
Host OperatingSystem:           16
MachineState:                   ACTIVE-DI
==============================================================================
Media Server:                   media04.domain.com
NetBackup Version:              7.5.0(750000)
Host OperatingSystem:           16
MachineState:                   ACTIVE-DI
==============================================================================
EMM Server:                     nbu.domain.com
mph999's picture

Yasuhisa makes a good post: it checks that the basic config is correct, and I agree 100%, but it does not confirm that all the details are correct in NBDB, for example.  There is no command to check this; it takes someone who is very, very knowledgeable about clusters and the NetBackup NBDB.

OK, my final advice on this.

Log a call, explain what has happened, show them this post and ask for BL or Engineering to confirm that this method is safe.

In a nice way, I don't care if i am wrong, I DO care that you have an installation that is confirmed as correct.

Martin

 

 
16ris10's picture

Thanks for your concern, Martin. But all my output confirms what the cycle guy posted, except that in the cluster directory the nodes are not listed by FQDN as he posted. I understand. I will see if I can log a case.

Yasuhisa Ishikawa's picture

All configuration seems OK, except for master02 not being listed in "nbemmcmd -listhosts" in your first post.

I made a mistake while editing the NBU_RSP example. Nodes are listed by VCS node name, so they should be listed by short name, not by FQDN.


16ris10's picture

Yes, I actually added nbmaster02 manually by issuing the nbemmcmd command and adding it as a media server. I hope this doesn't cause any problem. Thanks a ton for the confirmation.

Marianne's picture

I am late to this party, but I have learned that if installation is done correctly, there is no need to manually add anything via cmd.

You say that the NBU guide does not cover the VCS part.
That is 100% correct - VCS manuals cover the VCS part.

So, if the 2 nodes exist as a cluster and VCS commands like 'hastatus -sum' and 'gabconfig -a' show both nodes in the cluster, then NBU can be installed.

One thing that can cause incorrect cluster install/config is when rsh is not configured between cluster nodes. Although VCS can install and config via ssh, NBU cannot. It needs rsh.
There is a TN that explains a workaround: http://www.symantec.com/docs/TECH160242

If your NBU installation log on node 2 does not report successfully joining the cluster and 'hastatus -sum' does not show both nodes in nbu_group, rather start from scratch so that you know you have a cluster working 100% correctly from day 1.
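
A rough way to check the 'hastatus -sum' part of this is to filter the GROUP STATE rows for one service group. This is a sketch only: group_systems is a hypothetical helper, and it assumes the column layout of the hastatus output pasted earlier in this thread (rows starting with "B", group in column 2, system in column 3, state last).

```sh
#!/bin/sh
# Hypothetical helper: print "system state" for every GROUP STATE row
# of `hastatus -summary` (read from stdin) that belongs to group $1.
group_systems() {
    awk -v g="$1" '$1 == "B" && $2 == g { print $3, $NF }'
}
```

For example, hastatus -summary | group_systems nbu_group should print one line per cluster node. If a node is missing from that list, the group was not configured on it.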

One more thing - even if VCS and NBU were installed 100% correctly, nbemmcmd does not show correct info right away.

I noticed this the last time I installed clustered master server in our lab.
VCS shows correct output, but initially NBU showed this:

 

Both nodes in cluster, active on node1. But look at this:

# /usr/openv/netbackup/bin/admincmd/nbemmcmd -listhosts 
NBEMMCMD, Version:7.1
The following hosts were found:
server             nbumas
cluster            nbumas
master             mvdb-lnx1
Command completed successfully.

Even offline and online of the service group on node 1 did not fix it.
Only after I failed over to node 2 did nbemmcmd show correct info:

 

# /usr/openv/netbackup/bin/admincmd/nbemmcmd -listhosts

NBEMMCMD, Version:7.1 
The following hosts were found: 
server             nbumas
cluster            nbumas
master             mvdb-lnx1
master             mvdb-lnx2 
Command completed successfully.

 

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

16ris10's picture

Marianne, you're actually not late; the issue is still around. But this other guy is spot on with the mount point issue. He asked me to mount and tail the log, so let me post that here. You're right: nbemmcmd wasn't showing node 2 in the beginning, but hastatus has shown it since the beginning. I started to worry when I did not see the other master in the nbemmcmd output, so I added it manually.

Now let me post the error that has been identified; that guy is spot on.

[root@master01 bin]# ./hastatus
attempting to connect....
attempting to connect....connected

group           resource             system               message
--------------- -------------------- -------------------- --------------------
                                     master01           RUNNING
                                     master02           RUNNING
ClusterService                       master01           OFFLINE
ClusterService                       master02           ONLINE
-------------------------------------------------------------------------
nbu_group                            master01           ONLINE
nbu_group                            master02           *FAULTED* OFFLINE
                webip                master01           OFFLINE
                webip                master02           ONLINE
                csgnic               master01           ONLINE
-------------------------------------------------------------------------
                csgnic               master02           ONLINE
                nbu_nic              master01           ONLINE
                nbu_nic              master02           ONLINE
                nbu_ip               master01           ONLINE
                nbu_ip               master02           OFFLINE
-------------------------------------------------------------------------
                nbu_mount            master01           ONLINE
                nbu_mount            master02           *FAULTED*
                nbu_server           master01           ONLINE
                nbu_server           master02           OFFLINE


[root@master01 bin]# ./hamsg Mount_A
Wed 23 Jan 2013 12:17:36 AM UTC VCS INFO V-16-10031-20507 Mount:Mount:imf_init:successfully initialized the VxAMF Mount Module
Wed 23 Jan 2013 12:17:36 AM UTC VCS INFO V-16-2-13805 (imf_init) entry point completed with return status (0)
Thu 24 Jan 2013 03:10:05 AM UTC VCS NOTICE V-16-10031-20704 Mount:Mount:imf_getnotification:Received notification for vxamf-group nbu_mount

[root@master01 bin]# tail -20 /var/VRTSvcs/log/engine_A.log
2013/01/24 03:12:08 VCS ERROR V-16-2-13066 (master02) Agent is calling clean for resource(nbu_mount) because the resource is not up even after online completed.
2013/01/24 03:12:09 VCS INFO V-16-2-13068 (master02) Resource(nbu_mount) - clean completed successfully.
2013/01/24 03:12:09 VCS INFO V-16-2-13071 (master02) Resource(nbu_mount): reached OnlineRetryLimit(0).
2013/01/24 03:12:09 VCS ERROR V-16-1-54031 Resource nbu_mount (Owner: Unspecified, Group: nbu_group) is FAULTED on sys master02
2013/01/24 03:12:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource nbu_ip (Owner: Unspecified, Group: nbu_group) on System master02
2013/01/24 03:12:09 VCS INFO V-16-6-15015 (master02) hatrigger:/opt/VRTSvcs/bin/triggers/resfault is not a trigger scripts directory or can not be executed
2013/01/24 03:12:10 VCS INFO V-16-1-10305 Resource nbu_ip (Owner: Unspecified, Group: nbu_group) is offline on master02 (VCS initiated)
2013/01/24 03:12:10 VCS ERROR V-16-1-10205 Group nbu_group is faulted on system master02
2013/01/24 03:12:10 VCS NOTICE V-16-1-10446 Group nbu_group is offline on system master02
2013/01/24 03:12:10 VCS INFO V-16-1-10493 Evaluating master01 as potential target node for group nbu_group
2013/01/24 03:12:10 VCS INFO V-16-1-10493 Evaluating master02 as potential target node for group nbu_group
2013/01/24 03:12:10 VCS INFO V-16-1-50010 Group nbu_group is online or faulted on system master02
2013/01/24 03:12:10 VCS NOTICE V-16-1-10301 Initiating Online of Resource nbu_ip (Owner: Unspecified, Group: nbu_group) on System master01
2013/01/24 03:12:10 VCS NOTICE V-16-1-10301 Initiating Online of Resource nbu_mount (Owner: Unspecified, Group: nbu_group) on System master01
2013/01/24 03:12:13 VCS INFO V-16-1-10298 Resource nbu_mount (Owner: Unspecified, Group: nbu_group) is online on master01 (VCS initiated)
2013/01/24 03:12:22 VCS INFO V-16-1-10298 Resource nbu_ip (Owner: Unspecified, Group: nbu_group) is online on master01 (VCS initiated)
2013/01/24 03:12:22 VCS NOTICE V-16-1-10301 Initiating Online of Resource nbu_server (Owner: unknown, Group: nbu_group) on System master01
2013/01/24 03:12:42 VCS INFO V-16-1-10298 Resource nbu_server (Owner: unknown, Group: nbu_group) is online on master01 (VCS initiated)
2013/01/24 03:12:42 VCS NOTICE V-16-1-10447 Group nbu_group is online on system master01
2013/01/24 03:12:42 VCS NOTICE V-16-1-10448 Group nbu_group failed over to system master01

 

Yasuhisa Ishikawa's picture

Unfortunately we need the messages in engine_A.log from several lines before what you pasted.

BTW, does the /opt/VRTSnbu directory exist on master02? If not, create it and retry. Before retrying, you need to clear the FAULTED flag of the nbu_group service group with "hagrp -clear nbu_group".
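
The clear-and-retry sequence could be sketched as below. The run and retry_failover functions are hypothetical wrappers; the hagrp subcommands (-clear, -switch, -state) are standard VCS commands. It assumes the retry is a switch of the group to the node that faulted, and by default it only prints the commands instead of running them.

```sh
#!/bin/sh
# Print each command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: retry_failover GROUP TARGET_SYSTEM
retry_failover() {
    group=$1
    target=$2
    run hagrp -clear "$group" -sys "$target"    # clear the FAULTED flag on the target
    run hagrp -switch "$group" -to "$target"    # retry the failover to that node
    run hagrp -state "$group"                   # show the resulting per-system state
}
```

Running retry_failover nbu_group master02 prints the three commands for review; DRY_RUN=0 retry_failover nbu_group master02 would execute them on a cluster node.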


16ris10's picture

ok, give me 5 mins. i'll post the whole log. and remove the faulted thing too.

16ris10's picture

here you go with the complete engine log.

Attachment: engine_log_A.txt (51.56 KB)
16ris10's picture

yes the directory does exist. here's the output:

 

[root@master02 VRTSnbu]# pwd
/opt/VRTSnbu
[root@master02 VRTSnbu]# ls -l
total 8
drwxr-xr-x 3 root bin 4096 Jan 23 00:45 db

[root@master01 VRTSnbu]# pwd
/opt/VRTSnbu
[root@master01 VRTSnbu]# ls -l
total 0
drwxr-xr-x 4 root bin  96 Jan 23 00:21 db
drwxr-xr-x 2 root root 96 Jan 23 00:19 kms
drwxr-xr-x 2 root root 96 Jan 21 23:32 lost+found
drwxr-xr-x 4 root root 96 Jan 23 00:19 netbackup
drwxr-xr-x 3 root root 96 Jan 23 00:19 var
drwxr-xr-x 3 root root 96 Jan 23 00:19 volmgr
 

Yasuhisa Ishikawa's picture
2013/01/24 03:10:06 VCS NOTICE V-16-1-10446 Group nbu_group is offline on system master01
2013/01/24 03:10:06 VCS NOTICE V-16-1-10301 Initiating Online of Resource nbu_ip (Owner: Unspecified, Group: nbu_group) on System master02
2013/01/24 03:10:06 VCS NOTICE V-16-1-10301 Initiating Online of Resource nbu_mount (Owner: Unspecified, Group: nbu_group) on System master02
2013/01/24 03:10:06 VCS ERROR V-16-10031-5517 (master02) Mount:nbu_mount:online:Block device /dev/vx/dsk/netbackup_dg/netbackup-dbvol does not exist
2013/01/24 03:10:18 VCS INFO V-16-1-10298 Resource nbu_ip (Owner: Unspecified, Group: nbu_group) is online on master02 (VCS initiated)
2013/01/24 03:12:08 VCS ERROR V-16-2-13066 (master02) Agent is calling clean for resource(nbu_mount) because the resource is not up even after online completed.
2013/01/24 03:12:09 VCS INFO V-16-2-13068 (master02) Resource(nbu_mount) - clean completed successfully.
2013/01/24 03:12:09 VCS INFO V-16-2-13071 (master02) Resource(nbu_mount): reached OnlineRetryLimit(0).
2013/01/24 03:12:09 VCS ERROR V-16-1-54031 Resource nbu_mount (Owner: Unspecified, Group: nbu_group) is FAULTED on sys master02

Although you have configured the shared disk with VxVM, no DiskGroup or Volume resource exists in nbu_group.
You need to add DiskGroup and Volume resources. Please give me 10 minutes.
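
The telling line in the log is the Mount agent's online error. A minimal sketch for pulling such lines out of a long engine log follows; mount_online_errors is a hypothetical helper, and the pattern is based only on the message format seen in this thread.

```sh
#!/bin/sh
# Hypothetical helper: keep only Mount agent errors raised during online,
# e.g. "VCS ERROR ... Mount:nbu_mount:online:Block device ... does not exist".
mount_online_errors() {
    grep -E 'VCS ERROR .*Mount:[^:]+:online:'
}
```

For example: mount_online_errors < /var/VRTSvcs/log/engine_A.log. A "Block device ... does not exist" hit on the failover node suggests the disk group was never imported there, which "vxdg list" on that node can confirm.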


16ris10's picture

Thank you, right on. And just for your information:

[root@master01 opt]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg00-rootvol
                       49G  7.8G   39G  17% /
/dev/mapper/vg00-tmpvol
                      496M  263M  208M  56% /tmp
/dev/mapper/vg00-homevol
                      248M   11M  226M   5% /home
/dev/mapper/vg00-varvol
                       50G  7.9G   39G  17% /var
/dev/mapper/vg00-crashvol
                      2.7G   69M  2.5G   3% /var/crash
/dev/cciss/c0d0p1     251M   38M  201M  16% /boot
tmpfs                  16G     0   16G   0% /dev/shm
tmpfs                 4.0K     0  4.0K   0% /dev/vx
/dev/vx/dsk/netbackup_dg/netbackup-dbvol
                      500G  430M  469G   1% /opt/VRTSnbu
[root@master02 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg00-rootvol
                       54G  7.7G   43G  16% /
/dev/mapper/vg00-varvol
                       50G  3.6G   44G   8% /var
/dev/mapper/vg00-crashvol
                      2.7G   69M  2.5G   3% /var/crash
/dev/mapper/vg00-tmpvol
                      496M   22M  449M   5% /tmp
/dev/mapper/vg00-homevol
                      248M   11M  226M   5% /home
/dev/cciss/c0d0p1     251M   35M  204M  15% /boot
tmpfs                  16G     0   16G   0% /dev/shm

Yasuhisa Ishikawa's picture

Add the DiskGroup and Volume resources as below, then retry.

# haconf -makerw
# hares -add nbu_dg DiskGroup nbu_group
# hares -modify nbu_dg DiskGroup netbackup_dg
# hares -modify nbu_dg StartVolumes 0
# hares -modify nbu_dg StopVolumes 0
# hares -modify nbu_dg Enabled 1
# hares -add nbu_vol Volume nbu_vol
# hares -modify nbu_vol DiskGroup netbackup_dg
# hares -modify nbu_vol Volume netbackup-dbvol
# hares -modify nbu_vol Enabled 1
# hares -link nbu_mount nbu_vol
# hares -link nbu_vol nbu_dg
# haconf -dump -makero

SOLUTION
16ris10's picture

error:

[root@master01 bin]# ./hares -add nbu_dg DiskGroup nbu_group
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
Yasuhisa Ishikawa's picture

This is only a notice, not an error. Proceed!

16ris10's picture

is this one not an error either?

[root@master01 bin]# ./hares -add nbu_vol Volume nbu_vol
VCS WARNING V-16-1-10133 Group does not exist: nbu_vol
Yasuhisa Ishikawa's picture

It's my mistake. The last argument of "hares -add" must be the service group, not the resource name.

Run "hares -add nbu_vol Volume nbu_group" instead.

16ris10's picture

just this line, i am assuming. if i need to do the whole thing over, please let me know. for now i am just doing the above part..

16ris10's picture

i think the whole procedure had this crying baby nbu_vol

[root@master01 bin]# ./haconf -makerw
[root@master01 bin]# ./hares -add nbu_dg DiskGroup nbu_group
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@master01 bin]# ./hares -modify nbu_dg DiskGroup netbackup_dg
[root@master01 bin]# ./hares -modify nbu_dg StartVolumes 0
[root@master01 bin]# ./hares -modify nbu_dg StopVolumes 0
[root@master01 bin]# ./hares -modify nbu_dg Enabled 1
[root@master01 bin]# ./hares -add nbu_vol Volume nbu_vol
VCS WARNING V-16-1-10133 Group does not exist: nbu_vol
[root@master01 bin]# ./hares -modify nbu_vol DiskGroup netbackup_dg
VCS WARNING V-16-1-10260 Resource does not exist: nbu_vol
[root@master01 bin]# ./hares -modify nbu_vol Volume netbackup-dbvol
VCS WARNING V-16-1-10260 Resource does not exist: nbu_vol
[root@nbmaster01 bin]# ./hares -modify nbu_vol Enabled 1
VCS WARNING V-16-1-10260 Resource does not exist: nbu_vol
[root@master01 bin]# ./hares -link nbu_mount nbu_vol
VCS WARNING V-16-1-10249 Child resource does not exist: nbu_vol
[root@master01 bin]# ./hares -link nbu_vol nbu_dg
VCS WARNING V-16-1-10260 Resource does not exist: nbu_vol
[root@master01 bin]# ./haconf -dump -makero

did this, now proceeding to that deletion part..

Yasuhisa Ishikawa's picture

So run these lines.

# haconf -makerw
# hares -add nbu_vol Volume nbu_group
# hares -modify nbu_vol DiskGroup netbackup_dg
# hares -modify nbu_vol Volume netbackup-dbvol
# hares -modify nbu_vol Enabled 1
# hares -link nbu_mount nbu_vol
# hares -link nbu_vol nbu_dg
# haconf -dump -makero
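
The whole sequence, with the Volume resource added to the service group, can be sketched as a dry-run script that only prints the commands for review. The names (nbu_group, netbackup_dg, netbackup-dbvol, nbu_dg, nbu_vol, nbu_mount) are the ones used in this thread; the function name is made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the haconf/hares commands instead of running them.
# GROUP, DG and VOL are the values from this thread; change them for your setup.
GROUP=nbu_group
DG=netbackup_dg
VOL=netbackup-dbvol

print_vcs_storage_commands() {
    cat <<EOF
haconf -makerw
hares -add nbu_dg DiskGroup $GROUP
hares -modify nbu_dg DiskGroup $DG
hares -modify nbu_dg StartVolumes 0
hares -modify nbu_dg StopVolumes 0
hares -modify nbu_dg Enabled 1
hares -add nbu_vol Volume $GROUP
hares -modify nbu_vol DiskGroup $DG
hares -modify nbu_vol Volume $VOL
hares -modify nbu_vol Enabled 1
hares -link nbu_mount nbu_vol
hares -link nbu_vol nbu_dg
haconf -dump -makero
EOF
}

print_vcs_storage_commands
```

If nbu_dg was already created in an earlier attempt, skip the nbu_dg lines and run only the nbu_vol and link steps.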

16ris10's picture

now that went smoothly. all steps followed. also, we don't have to remove anything from the fstab, right? i had cleared the mount from hastatus..

Yasuhisa Ishikawa's picture

In addition, remove the shared disk entry from /etc/vfstab on each node. The shared disk must not be mounted at system startup, because VCS has to control where it is mounted.

16ris10's picture

ok, i can remove it from the other nodes. but which entry exactly? :(.

Yasuhisa Ishikawa's picture

Sorry, /etc/vfstab does not exist in Linux. I meant /etc/fstab.

If a line like the one below exists in /etc/fstab, remove it.

/dev/vx/dsk/netbackup_dg/netbackup-dbvol           /opt/VRTSnbu      vxfs     defaults  0  0
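
A quick way to check each node is sketched below. The helper name `fstab_entries_for` is made up for illustration; it only reads the file and prints any active (non-comment) entry for the given mount point.

```shell
#!/bin/sh
# Hypothetical helper: list non-comment fstab entries for one mount point.
#   $1 = path to an fstab-format file
#   $2 = mount point to look for (second field of each entry)
fstab_entries_for() {
    awk -v mp="$2" '$1 !~ /^#/ && $2 == mp' "$1"
}

# Usage on each node; no output means there is nothing to remove:
#   fstab_entries_for /etc/fstab /opt/VRTSnbu
```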

16ris10's picture

no, that entry does not exist. pasting the contents of that file below.

also please see above: the procedure you posted to add the resources gave several warnings about nbu_vol.

[root@master01 etc]# cat fstab
/dev/vg00/rootvol       /                       ext3    defaults        1 1
/dev/vg00/tmpvol        /tmp                    ext3    defaults        1 2
/dev/vg00/homevol       /home                   ext3    defaults        1 2
/dev/vg00/varvol        /var                    ext3    defaults        1 2
/dev/vg00/crashvol      /var/crash              ext3    defaults        1 2
LABEL=/boot             /boot                   ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
/dev/vg00/swapvol       swap                    swap    defaults        0 0

# NOTE: When adding or modifying VxFS or VxVM entries, add '_netdev'
# to the mount options to ensure the filesystems are mounted after VxVM and
# VxFS have started.
16ris10's picture

sir, i have done everything that has been recommended so far. is there anything more to do, or can i test the failover and see what errors it gives me now?

Yasuhisa Ishikawa's picture

Yes, you can try to switch nbu_group to master02.
If it fails again, please post:

  • Output of "vxdisk -o alldgs list" on both master01 and master02
  • Output of "lsmod | grep vxfs" on both master01 and master02
  • engine_A.log
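
The three items can be gathered in one go with something like the sketch below, run on each node. The function name and output file names are made up for illustration, and the engine log path is assumed to be the usual default; set ENGINE_LOG if yours is elsewhere.

```shell
#!/bin/sh
# Sketch: collect the requested diagnostics into one directory.
#   $1 = output directory (created if missing)
collect_vcs_diag() {
    out=$1
    mkdir -p "$out" || return 1

    # Disks and disk groups as seen by VxVM; noted as missing if the CLI is absent.
    if command -v vxdisk >/dev/null 2>&1; then
        vxdisk -o alldgs list > "$out/vxdisk_alldgs.txt" 2>&1
    else
        echo "vxdisk not found" > "$out/vxdisk_alldgs.txt"
    fi

    # Is the VxFS kernel module loaded?
    lsmod 2>/dev/null | grep vxfs > "$out/lsmod_vxfs.txt" \
        || echo "vxfs module not listed" > "$out/lsmod_vxfs.txt"

    # Copy the engine log if it exists (assumed default path).
    log=${ENGINE_LOG:-/var/VRTSvcs/log/engine_A.log}
    if [ -f "$log" ]; then
        cp "$log" "$out/engine_A.log"
    fi
    return 0
}

# Usage:
#   collect_vcs_diag "/tmp/vcs_diag_$(hostname)"
```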

16ris10's picture

doing that now. let's see. will be back here in 5 mins..

16ris10's picture

OMG dude. that worked. am so so thankful to you. really. :).

i do see the tape drive configuration failing, but i'll create a new topic for that. those two totally go out to you. even in the console i can see master02. thanks a lot..

mph999's picture

Just checking in ...  seems I missed all the fun.

So, a small config step was found that was fixable, good.  I did a quick look around and found what Marianne had posted: the full details do not appear until the cluster is first failed over (seems I had forgotten that minor point ...)

So, seems all is good - excellent, I am pleased.

M

Regards,  Martin
 
Setting Logs in NetBackup:
http://www.symantec.com/docs/TECH75805