VCS ERROR V-16-1-10303 Resource XXX (Owner: Unspecified, Group: xxx-sg) is FAULTED (timed out) on sys SEC-XXX
Environment
Solaris 9
Two Node Cluster
SFHA installed
Engine_A.Log
2012/08/11 07:25:59 VCS ERROR V-16-1-10303 Resource XXX (Owner: Unspecified, Group: xx-sg) is FAULTED (timed out) on sys SEC-XXX
Dmesg
Aug 11 07:24:56 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:25:00 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:25:23 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:25:28 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:25:59 SEC-XXX Had[3797]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource XXX (Owner: Unspecified, Group: xxx-sg) is FAULTED (timed out) on sys SEC-XXX
Aug 11 07:26:38 SEC-XXX Had[3797]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group xxx-sg is faulted on system SEC-XXX
Aug 11 07:26:55 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:27:00 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:28:00 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:28:05 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:28:11 SEC-XXX Had[3797]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource XXX (Owner: Unspecified, Group: xxx-sg) is FAULTED (timed out) on sys SEC-XXX
Aug 11 07:29:11 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:29:15 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:29:17 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Judging by the dmesg logs, it seems that bge3 faulted, and that is why the service group failed over to the partner node. bge3 is not a public NIC. It is the NIC through which a hardware device that verifies the application's queries is connected. This hardware device is connected to a switch, and from the switch an Ethernet cable runs to bge3 on each node, so both nodes can see the device via bge3. However, the technote below for the error code "VCS ERROR V-16-1-10303" seems to describe something different. Comments on the above logs would be appreciated.
https://sort.symantec.com/ecls/umi/V-16-1-10303
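To make the timing in the dmesg excerpt easier to see, here is a minimal Python sketch that counts how many bge3 "link down" events occurred shortly before each VCS error. The function names, the 120-second window, and the year 2012 (syslog lines carry no year; it is taken from the engine log entry) are assumptions made for illustration, not part of any VCS tooling.

```python
import re
from datetime import datetime

# A few lines copied from the dmesg excerpt above.
SAMPLE = """\
Aug 11 07:24:56 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:25:00 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:25:23 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:25:28 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:25:59 SEC-XXX Had[3797]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource XXX (Owner: Unspecified, Group: xxx-sg) is FAULTED (timed out) on sys SEC-XXX
"""

# Kernel link-state notices, e.g. "... NOTICE: bge3: link down"
LINK_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ \w+: .*NOTICE: (\w+): link (down|up)"
)
# VCS engine errors relayed to syslog, e.g. "... VCS ERROR V-16-1-10303 ..."
VCS_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ Had\[\d+\]: .*VCS ERROR (V-\d+-\d+-\d+)"
)


def parse_time(stamp, year=2012):
    # Assumption: the year is supplied by the caller because syslog omits it.
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def parse_events(text):
    """Return (timestamp, kind, detail) tuples for link and VCS error lines."""
    events = []
    for line in text.splitlines():
        m = LINK_RE.match(line)
        if m:
            events.append((parse_time(m.group(1)), "link_" + m.group(3), m.group(2)))
            continue
        m = VCS_RE.match(line)
        if m:
            events.append((parse_time(m.group(1)), "vcs_error", m.group(2)))
    return events


def link_downs_before_errors(events, window_seconds=120):
    """For each VCS error, count link-down events in the preceding window."""
    results = []
    for when, kind, detail in events:
        if kind != "vcs_error":
            continue
        downs = [
            t for t, k, _ in events
            if k == "link_down" and 0 <= (when - t).total_seconds() <= window_seconds
        ]
        results.append((detail, len(downs)))
    return results


if __name__ == "__main__":
    for code, count in link_downs_before_errors(parse_events(SAMPLE)):
        print(code, "preceded by", count, "link-down event(s)")
```

Running it against the full dmesg output instead of `SAMPLE` would show whether every V-16-1-10303 entry follows a burst of link flaps on bge3.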
V-16-1-10303 is a generic message logged by the VCS engine (had) for any resource that faults due to an entry point timeout. In the technote, the V-16-1-10303 ERROR message happens to be logged for the cvm_clus resource.
Hope this helps!
Regards,
Venkat
Venkata Reddy Chappavarapu, Manager, Storage & Availability Management Group, Symantec Corporation
That VCS log entry is generic and can apply to different resource types.
In this instance, I assume that the problem resource was dependent on the network.
cheers
tony
The resource that faulted is actually a NIC, but it is neither public nor private.
Any comment will be appreciated. Mark as Solution if your query is resolved
__________________
Thanks in Advance
Zahid Haseeb
zahidhaseeb.wordpress.com
Did you have more messages before this one? I would guess that after bge3 failed, the VCS monitor timed out for the application which uses bge3. If the monitor times out 4 times in a row (determined by the resource type attribute FaultOnMonitorTimeouts), then the resource faults. If this was the case, then you should have seen messages in the VCS engine log to this effect.
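The "times out N times in a row" rule described above can be illustrated with a small Python sketch. This is only a model of the behaviour as described in this post, not VCS code; the function name, the result strings, and the treatment of a non-positive threshold as "never fault" are assumptions for illustration.

```python
def faults_after(monitor_results, fault_on_monitor_timeouts=4):
    """Return the index of the monitor cycle at which the resource would be
    marked faulted, or None if the threshold is never reached.

    monitor_results is a sequence of "ok" / "timeout" strings (hypothetical
    labels), one per monitor cycle.
    """
    # Assumption: a non-positive threshold means timeouts never cause a fault.
    if fault_on_monitor_timeouts <= 0:
        return None

    consecutive = 0
    for cycle, result in enumerate(monitor_results):
        if result == "timeout":
            consecutive += 1
            if consecutive >= fault_on_monitor_timeouts:
                return cycle
        else:
            # Any completed monitor breaks the run of consecutive timeouts.
            consecutive = 0
    return None


if __name__ == "__main__":
    history = ["ok", "timeout", "timeout", "timeout", "timeout"]
    print(faults_after(history))  # index of the fourth consecutive timeout
```

Passing a different `fault_on_monitor_timeouts` value shows how the same monitor history behaves if the type attribute has been changed from the value assumed here.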
Mike
UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows
If this post has helped you, please vote or mark as solution
Did you have more messages before this one -
- if the monitor times out 4 times in a row
Logs are attached for reference:
Yes, I see more messages, and the timeout occurred three times in a row.
Thanks all for kind words
bge3 is configured as a resource in the service group.
If the bge3 resource is marked as Critical (the default), its failure will cause a failover.
Please review these topics in VCS Admin Guide (see https://sort.symantec.com/documentation )
Controlling VCS behavior
VCS behavior on resource faults
Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links
The log shows that the NIC went offline on SEC-XXX, which put the resource into the Faulted state.
VCS then offlined the rest of the SG and failed over to PRI-XXX.
So, VCS did what it was supposed to do.
You need to troubleshoot bge3 on SEC-XXX.