Node hangs after heartbeat link failure
Created: 21 Dec 2012 | Updated: 15 Jan 2013 | 3 comments
This issue has been solved. See solution.
Hi
I have a strange situation in my lab. I'm trying to test a failure scenario for SFHA. I'm running SFHA 5.1 SP1 RP3 on RHEL 5.5. When I disconnect all heartbeat links, one node loses the race and hangs. I've read in the admin guide that the panicked node restarts and tries to rejoin the cluster, but my node just hangs.
Regards.
Pawel
You need to double-check the I/O fencing configuration.
Look at the error message:
Could not eject node 0 from disk xxxxxx since keys of node 1 are not registered with it.
Hi,
I tried to investigate this problem. When I gracefully shut down the cluster and vxfen, all keys are removed.
When I start vxfen and the cluster service, all nodes register their keys.
But when I pull the LLT cable, I get the message: Could not eject node 0 from disk xxxxxx since keys of node 1 are not registered with it.
Veritas only triggers the panic itself; how the system handles it is Linux/OS specific. Check the kernel setting:
# sysctl kernel.panic
If the value is 0, the system will NOT reboot on panic. If the value is > 0, the system waits that many seconds after a panic and then reboots.
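To make that check a bit more concrete, here is a rough sketch of how you could read the setting and interpret it. The helper name describe_panic and the 10-second delay in the comments are my own assumptions for illustration, not something from the Veritas documentation. It only covers the two cases described above (0 and a positive number) and reports anything else as unrecognized.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): explain what a given
# kernel.panic value means, following the 0 / >0 rule described above.
describe_panic() {
    case "$1" in
        # Empty input or anything containing a non-digit is not interpreted.
        ''|*[!0-9]*) echo "unrecognized value: $1" ;;
        # 0 means the node stays down (appears hung) after a panic.
        0) echo "no automatic reboot after panic" ;;
        # Any positive number is the delay in seconds before the reboot.
        *) echo "reboot $1 seconds after panic" ;;
    esac
}

# Read the live setting when available (Linux exposes it under /proc/sys).
if [ -r /proc/sys/kernel/panic ]; then
    describe_panic "$(cat /proc/sys/kernel/panic)"
fi

# To change it (as root), assuming a 10-second delay suits your lab:
#   sysctl -w kernel.panic=10                      # running kernel only
#   echo "kernel.panic = 10" >> /etc/sysctl.conf   # persist across reboots
```

If the script reports no automatic reboot on the node that lost the race, that would match the hang you are seeing. Setting a positive value should let the node come back up and try to rejoin the cluster.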