
resource

Created: 30 Sep 2013 • Updated: 03 Oct 2013 | 3 comments
This issue has been solved. See solution.

Hi,

I would like to verify the following.

Suppose a resource cannot be brought offline while its group is trying to fail over. If we then bring the resource offline manually, will the group start automatically on the other node, or do I have to start it manually?

I think the answer is yes, because the other nodes are notified that those resources are offline on that node, so the group is brought online on the next node in the SystemList. Right?

Please also comment on the following.

When a resource is faulted (from the OS point of view) and the corresponding agent has tried unsuccessfully to bring it offline, the agent uses the clean entry point. If the resource is then brought offline, the agent tells had that the resource is offline. Next, had tells the had instances on the other VCS nodes about it. Right?

Thanks a lot,
marius


mikebounds

If the offline fails to offline the resource, then clean will be called. If clean succeeds in offlining the resource, the group will continue to offline, but if clean fails too, the resource will be in an "unable to offline" state with administrative intervention required. At this point VCS will continue to periodically probe the resource, and the group will still be in a "Waiting to offline" state.

If you then manually offline the resource from the O/S, VCS will see that it is offline when the agent next does a probe (runs monitor), and the group will then continue to offline. Once the group has offlined, VCS will look for a failover target, so yes, it will start automatically on the other node in both cases: where clean worked, or where clean failed and you had to manually offline the resource.
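As a rough illustration of that sequence (offline, then clean, then periodic probes, then failover), here is a small Python model. It is only a sketch and not VCS code: the function name `simulate_group_offline`, its parameters, and the event strings are all made up for this example, and real probe intervals and failover policy are not modeled.

```python
def simulate_group_offline(offline_ok, clean_ok,
                           admin_fixes_before_probe=None, max_probes=5):
    """Return the ordered events for one group going offline on a node.

    Hypothetical model of the flow described above, not a VCS API.
    admin_fixes_before_probe: probe number by which an administrator has
    stopped the resource at the OS level (None means never).
    """
    events = ["offline called"]
    resource_offline = False

    if offline_ok:
        resource_offline = True
    else:
        # Offline failed, so clean is tried next.
        events.append("clean called")
        if clean_ok:
            resource_offline = True
        else:
            events.append("resource unable to offline; group waiting to offline")

    probe = 0
    while not resource_offline and probe < max_probes:
        probe += 1
        # A manual offline is only noticed when the next probe runs.
        if admin_fixes_before_probe is not None and probe >= admin_fixes_before_probe:
            resource_offline = True
            events.append(f"probe {probe}: resource reported offline")
        else:
            events.append(f"probe {probe}: resource still online")

    if resource_offline:
        # The group finishes offlining, then a failover target is looked for.
        events.append("group offline")
        events.append("failover target selected")
    return events


if __name__ == "__main__":
    for line in simulate_group_offline(False, False, admin_fixes_before_probe=2):
        print(line)
```

In this model the failover step is reached whether clean succeeds or an administrator stops the resource by hand. The only difference is that the manual case has to wait for a probe to notice the change.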

Mike

UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows


tanislavm

Hi Mike,

Just following up on the discussion.

Fine.

When a service group or a resource goes offline on a VCS node, had on that node tells the had instances on the other VCS nodes about it. Right?

Thanks a lot,

marius

g_lee

Marius,

Again, covered in the documentation (relevant sections listed, things that you might want to pay attention to for your query in bold):

VCS 6.0.1 (Solaris) Administrator's Guide
Section I. Clustering concepts and terminology -> Introducing Veritas Cluster Server -> Logical components of VCS -> About cluster control, communications, and membership

About the high availability daemon (HAD)
https://sort.symantec.com/public/documents/sfha/6....

--------------------
The VCS high availability daemon (HAD) runs on each system.

Also known as the VCS engine, HAD is responsible for the following functions:

• Builds the running cluster configuration from the configuration files
• Distributes the information when new nodes join the cluster
• Responds to operator input
• Takes corrective action when something fails.

The engine uses agents to monitor and manage resources. It collects information about resource states from the agents on the local system and forwards it to all cluster members.

The local engine also receives information from the other cluster members to update its view of the cluster. HAD operates as a replicated state machine (RSM). The engine that runs on each node has a completely synchronized view of the resource status on each node. Each instance of HAD follows the same code path for corrective action, as required.

The RSM is maintained through the use of a purpose-built communications package. The communications package consists of the protocols Low Latency Transport (LLT) and Group Membership Services and Atomic Broadcast (GAB).

The hashadow process monitors HAD and restarts it when required.

More Information
About inter-system cluster communications
--------------------

About inter-system cluster communications
https://sort.symantec.com/public/documents/sfha/6....

--------------------
About inter-system cluster communications

VCS uses the cluster interconnect for network communications between cluster systems. Each system runs as an independent unit and shares information at the cluster level. On each system the VCS High Availability Daemon (HAD), which has the decision logic for the cluster, maintains a view of the cluster configuration. This daemon operates as a replicated state machine, which means all systems in the cluster have a synchronized state of the cluster configuration. This is accomplished by the following:

• All systems run an identical version of HAD.
• HAD on each system maintains the state of its own resources, and sends all cluster information about the local system to all other machines in the cluster.
• HAD on each system receives information from the other cluster systems to update its own view of the cluster.
• Each system follows the same code path for actions on the cluster.

The replicated state machine communicates over a purpose-built communications package consisting of two components, Group Membership Services/Atomic Broadcast (GAB) and Low Latency Transport (LLT).
--------------------
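To make the replicated state machine idea from the quoted sections a bit more concrete, here is a toy Python model. It is a sketch under loud assumptions: `Engine` and `Cluster` are hypothetical names, the "broadcast" is just an in-process loop standing in for what GAB and LLT do, and nothing here reflects HAD's actual implementation. It only shows that if every engine applies the same updates in the same order, every node ends up with the same view.

```python
class Engine:
    """Stand-in for the engine on one node (hypothetical, not the real HAD)."""

    def __init__(self, name):
        self.name = name
        self.view = {}  # (node, resource) -> last reported state

    def apply(self, update):
        node, resource, state = update
        self.view[(node, resource)] = state


class Cluster:
    """Delivers every update to every engine in the same order."""

    def __init__(self, names):
        self.engines = {name: Engine(name) for name in names}

    def report(self, node, resource, state):
        # The local engine learns the state from its agent; the update is
        # then delivered to all members, including the sender.
        update = (node, resource, state)
        for engine in self.engines.values():
            engine.apply(update)

    def views_synchronized(self):
        views = [engine.view for engine in self.engines.values()]
        return all(view == views[0] for view in views)


if __name__ == "__main__":
    cluster = Cluster(["node1", "node2", "node3"])
    cluster.report("node1", "app_ip", "ONLINE")
    cluster.report("node1", "app_ip", "OFFLINE")
    print(cluster.views_synchronized())
    print(cluster.engines["node2"].view)
```

So when a resource goes offline on node1, node2 and node3 hold the same state for it as node1 does. That shared view is what lets each node reach the same decision about where the group should come online next.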
