
Cluster service hangs on starting, 2nd attempt succeeds

Created: 21 Mar 2010 • Updated: 21 May 2010 | 4 comments
ODT
This issue has been solved. See solution.

Situation
Windows 2003 R2 SP2 x64 on a DL580 G5 using SFW 5.1 with cluster extension. MSCS was installed following the SFW manual. The EVA is connected through HP-branded Brocade switches.
SFW Cluster validation passed with no errors
MSCS Cluster validation passed with no errors

Whenever I use LDM to create a Quorum, the cluster boots and fails over with no problem.

I created a dynamic quorum following the manual. The cluster service hangs on starting; it succeeds when started manually or through the Recovery tab in the Services console.
As a test I then created a simple quorum, with no difference. When I create a quorum through LDM it works.
This is the relevant excerpt of my cluster.log:

00000e58.00000e70::2010/03/19-13:14:07.644 INFO Physical Disk <quorumtemp>: [DiskArb] DisksOpenResourceFileHandle: Attaching to disk with signature c74e7e9e
00000e58.00000e70::2010/03/19-13:14:07.644 INFO Physical Disk <quorumtemp>: [DiskArb] DisksOpenResourceFileHandle: Disk unique id present trying new attach
00000e58.00000e70::2010/03/19-13:14:07.927 INFO Physical Disk <quorumtemp>: [DiskArb] DisksOpenResourceFileHandle: Retrieving disk number from ClusDisk registry key
00000e58.00000e70::2010/03/19-13:14:07.927 ERR  Physical Disk <quorumtemp>: Arbitrate: Unable to open ClusDisk signature key c74e7e9e. Error: 2.

Any pointers?
The registry key does exist under (off the top of my head, not sure, but you know what I mean) HKLM\SYSTEM\CurrentControlSet\Services\ClusDisk\Signatures\%disksignature%
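
For anyone wanting to double-check the key rather than rely on memory: "Error: 2" in the log is the Win32 code ERROR_FILE_NOT_FOUND, so the resource did not find the key where it looked. Below is a minimal Python sketch that tests for the key with the standard `winreg` module. The subkey path is taken as quoted in the post, which is explicitly from memory, so treat it as an assumption and adjust it to whatever regedit shows on the node. The helper names `signature_key_path` and `key_exists` are made up for illustration.

```python
import sys

# ASSUMPTION: path as quoted from memory in the post; verify it in regedit.
CLUSDISK_SIGNATURES = r"SYSTEM\CurrentControlSet\Services\ClusDisk\Signatures"


def signature_key_path(signature):
    """Build the HKLM subkey path for one disk signature (hypothetical helper)."""
    return CLUSDISK_SIGNATURES + "\\" + signature


def key_exists(subkey):
    """Return True if HKLM\\<subkey> can be opened, False if it is missing."""
    if sys.platform != "win32":
        raise OSError("registry lookups only work on Windows")
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey):
            return True
    except FileNotFoundError:
        # Same condition the cluster log reports as "Error: 2"
        return False


if __name__ == "__main__" and sys.platform == "win32":
    path = signature_key_path("c74e7e9e")
    print(path, "exists" if key_exists(path) else "is missing")
```

Run it on the cluster node itself with an account that can read HKLM. Note that it checks from your session, which may see a different state than the cluster service did at boot time.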

Comments

ODT

Just to be complete: the event ID logged is 1009.
"Cluster service could not join an existing server cluster and could not form a new server cluster. Cluster service has terminated."

Also, some more of cluster.log:

00000e54.00000e6c::2010/03/21-16:51:29.953 INFO Volume Manager Disk Group <DynamicQuorum>: LDM_RESArbitrate: >>> Entering LDM_RESArbitrate
00000e54.00000e6c::2010/03/21-16:51:29.953 ERR  Volume Manager Disk Group <DynamicQuorum>: The vxvm service is running.
00000e54.00000e74::2010/03/21-16:51:29.953 INFO Volume Manager Disk Group <DynamicQuorum>: LDM_RESArbitrateThread: >>> Entering LDM_RESArbitrateThread
00000e54.00000e74::2010/03/21-16:51:29.953 ERR  Volume Manager Disk Group <DynamicQuorum>: LDM_RESArbitrateThread: Calling  ArbitrateDg API
00000e54.00000e74::2010/03/21-16:55:02.156 ERR  Volume Manager Disk Group <DynamicQuorum>: LDM_RESArbitrateThread - *** ArbitrateDg API failed - resource is not available - error: No disk in the disk group
00000e54.00000e74::2010/03/21-16:55:02.156 INFO Volume Manager Disk Group <DynamicQuorum>: LDM_RESArbitrateThread: <<< Exiting LDM_RESArbitrateThread
00000e54.00000e6c::2010/03/21-16:55:02.156 INFO Volume Manager Disk Group <DynamicQuorum>: LDM_RESArbitrate: <<< Exiting LDM_RESArbitrate
00000bc8.00000bd4::2010/03/21-16:55:02.156 INFO [MM] MmSetQuorumOwner(0,0), old owner 1.
00000bc8.00000bd4::2010/03/21-16:55:02.156 ERR  [FM] FmGetQuorumResource failed, error 5006.
00000bc8.00000bd4::2010/03/21-16:55:02.156 ERR  [INIT] ClusterForm: Could not get quorum resource. No fixup attempted. Status = 5086
00000bc8.00000bd4::2010/03/21-16:55:02.156 INFO [INIT] Cleaning up failed form attempt.
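
To see where the startup actually stalls, it helps to look at the timestamps rather than the ERR/INFO labels (several of the ERR lines above are plainly informational). The following Python sketch parses lines in the format shown in this excerpt and reports the largest gap between consecutive entries. The regular expression is inferred from these few lines only, so it is an assumption about the cluster.log layout in general, and `parse_line` and `largest_gap` are illustrative names.

```python
import re
from datetime import datetime

# Layout inferred from the excerpt: pid.tid::YYYY/MM/DD-HH:MM:SS.mmm LEVEL message
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]+)\.(?P<tid>[0-9a-fA-F]+)::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    entry["time"] = datetime.strptime(entry["ts"], "%Y/%m/%d-%H:%M:%S.%f")
    return entry


def largest_gap(lines):
    """Return (seconds, previous_entry, next_entry) for the longest pause."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    best = None
    for prev, cur in zip(entries, entries[1:]):
        gap = (cur["time"] - prev["time"]).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, prev, cur)
    return best


# Two lines copied from the excerpt above
SAMPLE = [
    "00000e54.00000e74::2010/03/21-16:51:29.953 ERR  Volume Manager Disk Group "
    "<DynamicQuorum>: LDM_RESArbitrateThread: Calling  ArbitrateDg API",
    "00000e54.00000e74::2010/03/21-16:55:02.156 ERR  Volume Manager Disk Group "
    "<DynamicQuorum>: LDM_RESArbitrateThread - *** ArbitrateDg API failed - "
    "resource is not available - error: No disk in the disk group",
]

if __name__ == "__main__":
    seconds, before, after = largest_gap(SAMPLE)
    print(f"{seconds:.3f}s between:\n  {before['msg']}\n  {after['msg']}")
```

On the two sample lines the gap is about 212 seconds: the ArbitrateDg call is entered at 16:51:29.953 and only fails at 16:55:02.156, which matches the observed "hang" before the service gives up.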

ODT

Disabling the Windows Firewall/Windows ICS service fixed this. It seems to have been a connectivity issue.

SOLUTION
Wally_Heim

Hi ODT,

Good to hear that you have resolved your issue.

Since you are using SFW with MSCS, I would recommend that you go to 5.1 SP1. However, there is a problem with dynamic quorum/witness disks in SP1. It has been resolved with a private patch, which you will need to open a support case to get. We are working on making the patch public, but that has not happened yet.

Here is a link with more details of the 5.1 SP1 dynamic quorum/witness disk issue:

http://seer.entsupport.symantec.com/docs/346617.htm

Thanks,
Wally

ODT

I wasn't complete after all. SP1 was already installed.

I have now run into a new problem. I have my disks mirrored across two EVAs (I selected the option "mirrored across enclosures").

When I unpresent disks they keep on functioning, but when I shut down the EVA the disks get deported. Am I missing something here?