System panic with "create_rule_from_local_file" and "VCS CRITICAL V-16-1-1072" in /var/adm/messages

Article:TECH146911  |  Created: 2010-12-23  |  Updated: 2012-07-31  |  Article URL http://www.symantec.com/docs/TECH146911
Article Type
Technical Solution


Issue



The system panics, and /var/adm/messages contains the error messages "create_rule_from_local_file" and "VCS CRITICAL V-16-1-1072".


Error



[ ERROR  MESSAGES ]

Dec 20 02:11:39 Symc-solaris vasd[8709]: [ID 396597 daemon.error] _create_rule_from_local_file: Access control update failed. Cannot resolve access control group VAS-STAGE-Access. Error 2. Group lookup will be attempted again in 30 seconds.
Dec 20 02:11:39 Symc-solaris Had[12443]: [ID 702911 daemon.notice] VCS CRITICAL V-16-1-1072 (Symc-solaris) DiskGroup:panic_diskgroup:clean:Monitor hung for diskgroup (panic_diskgroup) on system Symc-solaris. System will panic to migrate all service groups to another VCS node in system list
Dec 20 02:11:39 Symc-solaris unix: [ID 836849 kern.notice]
Dec 20 02:11:39 Symc-solaris ^Mpanic[cpu198]/thread=3018bdab4e0:
Dec 20 02:11:39 Symc-solaris unix: [ID 156897 kern.notice] forced crash dump initiated at user request
Dec 20 02:11:39 Symc-solaris unix: [ID 100000 kern.notice] sdconthp004 genunix: [ID 723222 kern.notice] 000002a11c98b960 genunix:kadmin+4ac (b4, 0, 0, 127b000, 5, 0)
Dec 20 02:11:39 Symc-solaris genunix: [ID 179002 kern.notice] %l0-3: 000000000182b400 00000000011e5c00 0000000000000004 0000000000000004
Dec 20 02:11:39 Symc-solaris %l4-7: 0000000000000440 0000000000000010 0000000000000004 0000000000000000
Dec 20 02:11:39 Symc-solaris genunix: [ID 723222 kern.notice] 000002a11c98ba20 genunix:uadmin+11c (3009d6290a8, 0, 0, ff390000, 0, 0)
Dec 20 02:11:39 Symc-solaris genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000000000 0000000089430000 0000000000008943
Dec 20 02:11:39 Symc-solaris %l4-7: 0000000000000001 0000000000000000 0000000000000005 000003018bdab4e0
Dec 20 02:11:39 Symc-solaris unix: [ID 100000 kern.notice]
Dec 20 02:11:39 Symc-solaris genunix: [ID 672855 kern.notice] syncing file systems...
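To check whether a system log contains the same two signatures, a small search helper can be used. This is a minimal sketch: the function name find_dg_panic_msgs is hypothetical, and it assumes a POSIX shell and a POSIX grep that accepts multiple -e options (on Solaris 10, /usr/xpg4/bin/grep).

```shell
# Hypothetical helper: print the log lines that carry either signature
# described in this article. Defaults to /var/adm/messages.
find_dg_panic_msgs() {
    logfile=${1:-/var/adm/messages}
    grep -e '_create_rule_from_local_file' \
         -e 'VCS CRITICAL V-16-1-1072' "$logfile"
}
```

Running find_dg_panic_msgs with no argument searches /var/adm/messages; pass another path to search a rotated or copied log.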


Environment



- Solaris 10
- SFHA 5.0, 5.0 MP3


Cause



[ EXPLANATION ]
If the value of the PanicSystemOnDGLoss attribute of the DiskGroup resource type is 1 and the disk group becomes disabled, the node panics.

1. To prevent VCS from panicking systems because of DiskGroup monitor timeouts, use the PanicSystemOnDGLoss attribute.
2. This attribute controls not only whether a system is panicked when a disk group becomes disabled, but also whether it is panicked when the DiskGroup clean script is called because of a monitor timeout.
3. However, it is extremely useful to have the crash dump that is created when VCS panics a system because of a DiskGroup monitor timeout, as such timeouts are likely to occur when vxconfigd is very busy.

 

 


Solution



[ RESOLUTION ]
1. First, check for underlying system problems, such as storage or disk issues.
2. Then, to prevent the VCS engine from panicking the system, change the attribute by following the steps below.

How to check the default value of PanicSystemOnDGLoss:
For example, the following output shows that the value is set to 1 (enabled) by default.
# grep "boolean Panic" /etc/VRTSvcs/conf/config/types.cf
       boolean PanicSystemOnDGLoss = 1
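
The same check can be wrapped in a helper that prints only the value, which is convenient in scripts. This is a sketch under assumptions: the function name dg_panic_default is hypothetical, and it assumes the definition in types.cf has the form "boolean PanicSystemOnDGLoss = <value>" as shown above, and that a POSIX awk is available (on Solaris 10, nawk or /usr/xpg4/bin/awk).

```shell
# Hypothetical helper: print the default value of PanicSystemOnDGLoss
# from a types.cf-style file. Prints nothing if no definition is found.
dg_panic_default() {
    typesfile=${1:-/etc/VRTSvcs/conf/config/types.cf}
    awk '$1 == "boolean" && $2 == "PanicSystemOnDGLoss" && $3 == "=" {
             print $4   # the value after the equals sign
             exit
         }' "$typesfile"
}
```

Matching on whole fields skips the ArgList line, which also mentions PanicSystemOnDGLoss but is not the definition.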

How to change the default value for PanicSystemOnDGLoss:
# haconf -makerw
# haattr -default DiskGroup PanicSystemOnDGLoss 0
# haconf -dump -makero
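
The three commands above can also be run from a small wrapper that stops on the first failure. This is a sketch, not a Symantec-supplied script: the function names run and disable_dg_panic_default and the DRY_RUN variable are hypothetical. With DRY_RUN=1 the commands are only printed, so the sequence can be reviewed before it is applied.

```shell
# Hypothetical wrapper. With DRY_RUN=1 each command is printed
# instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set the DiskGroup type default of PanicSystemOnDGLoss to 0.
disable_dg_panic_default() {
    # Open the configuration for writing; give up if that fails.
    run haconf -makerw || return 1
    if ! run haattr -default DiskGroup PanicSystemOnDGLoss 0; then
        # Save and close the configuration even if the change failed.
        run haconf -dump -makero
        return 1
    fi
    # Save the configuration and make it read-only again.
    run haconf -dump -makero
}
```

For a preview, set DRY_RUN=1 and call disable_dg_panic_default. Leave DRY_RUN unset to execute the commands.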

Note: The haattr command preserves the existing resource values by writing them into main.cf. Check the resource values and change them with "hares -modify" as required.

Example:

# grep -i panicsystemondgloss /etc/VRTSvcs/conf/config/main.cf
# grep -i panicsystemondgloss /etc/VRTSvcs/conf/config/types.cf
        static str ArgList[] = { DiskGroup, StartVolumes, StopVolumes, MonitorOnly, MonitorReservation, tempUseFence, PanicSystemOnDGLoss, DiskGroupType, UmountVolumes }
        boolean PanicSystemOnDGLoss = 1
# haconf -makerw
# haattr -default DiskGroup PanicSystemOnDGLoss 0
# haconf -dump -makero
# grep -i panicsystemondgloss /etc/VRTSvcs/conf/config/main.cf     --> main.cf was modified to preserve the existing resource values.
                PanicSystemOnDGLoss = 1
                PanicSystemOnDGLoss = 1
# grep -i panicsystemondgloss /etc/VRTSvcs/conf/config/types.cf
        static str ArgList[] = { DiskGroup, StartVolumes, StopVolumes, MonitorOnly, MonitorReservation, tempUseFence, PanicSystemOnDGLoss, DiskGroupType, UmountVolumes }
        boolean PanicSystemOnDGLoss = 0
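
Because the existing resources keep the value 1 in main.cf, each one must be changed separately with "hares -modify" if the new default should apply to it. The sketch below only prints candidate commands; it does not run them. The function name suggest_dg_overrides is hypothetical, and the sketch assumes that each DiskGroup resource in main.cf starts with a line of the form "DiskGroup <name> (" and ends with a line containing only ")", and that a POSIX awk is available.

```shell
# Hypothetical helper: print (not run) one "hares -modify" command for
# every DiskGroup resource that still has PanicSystemOnDGLoss = 1.
suggest_dg_overrides() {
    maincf=${1:-/etc/VRTSvcs/conf/config/main.cf}
    awk '
        # Start of a DiskGroup resource definition: remember its name.
        $1 == "DiskGroup" && $3 == "(" { res = $2; next }
        # End of the current definition: forget the name.
        $1 == ")" { res = "" }
        # Inside a DiskGroup resource with the attribute still set to 1.
        res != "" && $1 == "PanicSystemOnDGLoss" && $2 == "=" && $3 == "1" {
            print "hares -modify " res " PanicSystemOnDGLoss 0"
        }
    ' "$maincf"
}
```

Review the printed commands, then run the ones you want between "haconf -makerw" and "haconf -dump -makero", since the configuration must be writable for "hares -modify" to succeed.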

 

Confirm that both the running (in-memory) and persistent (types.cf) definitions have been set to 0:
# haattr -display DiskGroup |grep Panic
PanicSystemOnDGLoss [boolean/scalar]     =      0

# grep "boolean Panic" /etc/VRTSvcs/conf/config/types.cf
       boolean PanicSystemOnDGLoss = 0


