Cluster behavior needed: which config attributes to modify?
Hello,
I would like the following behavior from a Veritas cluster that monitors a resource (an application):
when the resource fails, first attempt to restart it on the same node; if that does not succeed, migrate it to the second node.
Also, is there another mechanism that forces the resource to migrate directly if it fails too many times within a given timeframe, instead of starting it again on the same node?
When testing, I see different behavior depending on how long I wait between manual kills of the app, and I do not know exactly which attributes I have to edit. Basically, the question is: how much time must pass between manual failures of the resource for the cluster to restart it again on the _same_ node?
Configuration so far: ToleranceLimit = 0, RestartLimit = 1, OnlineTimeout = 300.
The attribute you are missing is ConfInterval.
So with the default ConfInterval of 600 seconds (10 minutes):
RestartLimit=1: the resource is restarted once. If it fails again within 10 minutes, it fails over; if it fails after 10 minutes, it is restarted again.
ToleranceLimit=1: the first failure is ignored. If it fails again within 10 minutes, it fails over; if it fails after 10 minutes, the failure is ignored again.
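
To make the timing concrete, here is a rough Python sketch of the rule described above. It is only a simplified model for reasoning about test results, not VCS code: the function name decide_actions and the "absorb"/"failover" labels are made up for illustration, the restart is assumed to be instantaneous (so the window is measured from the previous failure), and a gap of exactly the interval is assumed to reset the counter. "absorb" stands for a local restart (RestartLimit) or an ignored failure (ToleranceLimit).

```python
def decide_actions(failure_times, limit=1, conf_interval=600):
    """Return one action per failure time (in seconds): 'absorb' or 'failover'.

    limit         -- hypothetical stand-in for RestartLimit or ToleranceLimit
    conf_interval -- quiet period after which the counter is reset
    """
    actions = []
    used = 0             # how many failures have been absorbed in this window
    last_failure = None  # time of the previous absorbed failure
    for t in failure_times:
        # Assumption: a quiet period of at least conf_interval resets the counter.
        if last_failure is not None and t - last_failure >= conf_interval:
            used = 0
        if used < limit:
            used += 1
            actions.append("absorb")
        else:
            # Limit exhausted inside the window: the resource leaves this node.
            actions.append("failover")
            break
        last_failure = t
    return actions


# Second kill 2 minutes after the first: restart, then failover.
print(decide_actions([0, 120]))
# Second kill more than 10 minutes after the first: restarted both times.
print(decide_actions([0, 700]))
```

Under these assumptions the first call prints ['absorb', 'failover'] and the second prints ['absorb', 'absorb'], which matches the different results you get depending on how long you wait between kills.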
Mike
UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows
If this post has helped you, please vote or mark as solution