crs unable to start on reboot: vcsmm: VCS RAC ERROR V-10-1-15013 vcsmm_ioctl: driver not configured

Article:TECH67749  |  Created: 2009-01-11  |  Updated: 2009-01-02  |  Article URL http://www.symantec.com/docs/TECH67749
Article Type
Technical Solution

Issue



crs unable to start on reboot: vcsmm: VCS RAC ERROR V-10-1-15013 vcsmm_ioctl: driver not configured

Solution




Errors seen in the engine_A.log:
-----------------------------------------------
Feb 11 01:26:59 iss2420qbrwdb vcsmm: VCS RAC ERROR V-10-1-15013 vcsmm_ioctl: driver not configured
Feb 11 01:27:01 iss2420qbrwdb last message repeated 8 times
Feb 11 01:27:01 iss2420qbrwdb root: Oracle CSS daemon failed to start up. Check CRS logs for diagnostics.
Feb 11 01:27:01 iss2420qbrwdb vcsmm: VCS RAC ERROR V-10-1-15013 vcsmm_ioctl: driver not configured
Feb 11 01:27:01 iss2420qbrwdb root: Oracle CSS family monitor shutting down.
----------------------------------------------

The error above gives little information; it only complains that the driver is not configured. Check the GAB port memberships:

bash-3.00#  gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   71b40a membership 012
Port b gen   71b40d membership 012
Port d gen   71b409 membership 012
Port f gen   71b415 membership 012
Port h gen   71b40e membership 012
Port v gen   71b411 membership 012
Port w gen   71b412 membership 012


Port 'o', the vcsmm port, is missing from the membership list because vcsmm was unable to start.
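
The check for a missing port can be scripted. The following is a minimal sketch, not a Symantec-supplied tool: the helper name has_gab_port is hypothetical, and it assumes the membership lines have the "Port <letter> gen ..." layout shown in the output above.

```shell
#!/bin/sh
# has_gab_port is a hypothetical helper (not part of VCS).
# It reads `gabconfig -a` style output on standard input and returns 0
# if a membership line exists for the given port letter, non-zero otherwise.
has_gab_port() {
    # Membership lines look like: "Port a gen   71b40a membership 012"
    grep "^Port $1 gen" >/dev/null
}

# Intended usage on a cluster node (needs the real gabconfig command):
#   if gabconfig -a | has_gab_port o; then
#       echo "vcsmm port o is configured"
#   else
#       echo "vcsmm port o is missing"
#   fi
```

A non-zero return for port o corresponds to the situation described here, where vcsmm has not registered with GAB.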

We therefore start vcsmm manually, which produces more informative messages:

bash-3.00# /etc/init.d/vcsmm start

From the engine_A.log:
----------------------------------------------
Feb 11 02:41:01 iss2420qbrwdb vcsmm: VCS RAC ERROR V-10-1-15203 Maximum number of slaves
Feb 11 02:41:01 iss2420qbrwdb   allowed on this node is 2048,
Feb 11 02:41:01 iss2420qbrwdb   whereas the corresponding value on node 2 is 32768.
Feb 11 02:41:01 iss2420qbrwdb   Dropping out of cluster membership
Feb 11 02:41:01 iss2420qbrwdb vcsmm: NOTICE: VCS RAC Notice V-10-1-32779 Message discarded: exiting - not stop: tp-1 nd-0 tk-5 st-4
Feb 11 02:41:01 iss2420qbrwdb gab: GAB INFO V-15-1-20032 Port o closed
VCS RAC vcsmmconfig ERROR V-10-2-5 Configuration failed - version mismatch or max slave parameter value mismatch
----------------------------------------------

These messages identify the exact problem.

The message explains that on node 2 (a running node) the maximum number of slaves is 32768, whereas on the node that has rebooted and is trying to join the cluster, slave_members in /kernel/drv/vcsmm.conf is set to 2048. The file holds the same value, 2048, on all three nodes of the cluster, including the nodes that are still running in the cluster configuration.

bash-3.00# cat /kernel/drv/vcsmm.conf
# VCSMM Configuration file
#
name="vcsmm" parent="pseudo" slave_members=2048 instance=0;

We verify the setting on the running system:

bash-3.00#  echo mm_slave_max/D|mdb -k
mm_slave_max:
mm_slave_max:   32768

We can also use the vcsmmdebug utility:

bash-3.00# /opt/VRTSvcs/rac/bin/vcsmmdebug -t
Tunables Information
       mm_deblog_sz: 65536
       mm_msglog_sz: 128
       mm_slave_max: 32768
       timeout     : 5
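
The comparison between the value in the configuration file and the value loaded in the kernel can be sketched as a small script. The function names conf_slave_members and compare_slave_max are hypothetical, and the sketch assumes the file has the single-line layout shown above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of VCS) for comparing the slave_members
# setting in a vcsmm.conf style file with the value loaded in the kernel.

# Print the numeric slave_members value found in the file given as $1.
conf_slave_members() {
    # Drop comment lines, then print only the digits after "slave_members=".
    sed -n '/^#/d; s/.*slave_members=\([0-9][0-9]*\).*/\1/p' "$1"
}

# Compare a configured value ($1) with a live value ($2).
# Returns 0 on a match and 1 on a mismatch.
compare_slave_max() {
    if [ "$1" -eq "$2" ]; then
        echo "match: $1"
    else
        echo "mismatch: configured=$1 live=$2"
        return 1
    fi
}

# Intended usage on a node; the live value is taken from the mdb output
# format shown above (this pipeline is an assumption, adjust as needed):
#   configured=$(conf_slave_members /kernel/drv/vcsmm.conf)
#   live=$(echo mm_slave_max/D | mdb -k | awk 'NF == 2 { print $2 }')
#   compare_slave_max "$configured" "$live"
```

With the values from this article (2048 in the file, 32768 in the running kernel), compare_slave_max reports a mismatch.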

To resolve the situation, edit /kernel/drv/vcsmm.conf on all the nodes so that the entry reads:

name="vcsmm" parent="pseudo" slave_members=32768 instance=0;
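
The edit can also be made with sed instead of a text editor. This is a minimal sketch under stated assumptions: the function name set_slave_members and the ".bak" backup suffix are illustrative choices, not something the article prescribes.

```shell
#!/bin/sh
# set_slave_members is a hypothetical helper (not part of VCS).
# $1 = path to a vcsmm.conf style file, $2 = new slave_members value.
# It keeps a copy of the original file as "$1.bak" before rewriting it.
set_slave_members() {
    conf="$1"
    value="$2"
    cp "$conf" "$conf.bak" || return 1
    # Rewrite only the number that follows "slave_members=".
    sed "s/slave_members=[0-9][0-9]*/slave_members=$value/" "$conf.bak" > "$conf"
}

# Intended usage, run as root on every node of the cluster:
#   set_slave_members /kernel/drv/vcsmm.conf 32768
```

The new value takes effect only when the driver next reads the file.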

The recommended next step is to reboot the nodes that are unable to join the cluster. If a reboot is not possible, unload and reload the vcsmm module:

bash-3.00#  haconf -dump -makero
bash-3.00#  hastop -local -force
bash-3.00#  /etc/init.d/vcsmm start
bash-3.00#  gabconfig -a
bash-3.00# hastart

------------------------------------------------------------------------------------------------------------------------------------------
The kernel tunable value can also be set on the live kernel using the mdb debugger. (Please note that this is not recommended on production systems.)
------------------------------------------------------------------------------------------------------------------------------------------

bash-3.00# echo mm_slave_max/D |mdb -k
mm_slave_max:
mm_slave_max:   32768

To set mm_slave_max to 2048, write the value to the kernel in hexadecimal:
Dec 2048 = Hex 800

bash-3.00# echo "mm_slave_max/W 800" |mdb -kw
mm_slave_max:   0x8000          =       0x800

Verify:

bash-3.00# echo mm_slave_max/D |mdb -k
mm_slave_max:
mm_slave_max:   2048

bash-3.00# /opt/VRTSvcs/rac/bin/vcsmmdebug -t
Tunables Information
       mm_deblog_sz: 65536
       mm_msglog_sz: 128
       mm_slave_max: 2048
       timeout     : 5
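
The decimal and hexadecimal values used with mdb can be cross-checked with the shell's printf before anything is written to the kernel. The helper names to_hex and to_dec are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for checking the values passed to "mdb -kw".

# Print a decimal number in hexadecimal, without a 0x prefix.
to_hex() {
    printf '%x\n' "$1"
}

# Print a hexadecimal number (given without a 0x prefix) in decimal.
to_dec() {
    printf '%d\n' "0x$1"
}

to_hex 2048     # prints 800
to_hex 32768    # prints 8000
to_dec 8000     # prints 32768
```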

To set mm_slave_max back to 32768, again write the value in hexadecimal:
Dec 32768 = Hex 8000

bash-3.00# echo "mm_slave_max/W 8000" |mdb -kw
mm_slave_max:   0x800           =       0x8000

Verify:

bash-3.00# echo mm_slave_max/D |mdb -k
mm_slave_max:
mm_slave_max:   32768

bash-3.00# /opt/VRTSvcs/rac/bin/vcsmmdebug -t
Tunables Information
       mm_deblog_sz: 65536
       mm_msglog_sz: 128
       mm_slave_max: 32768
       timeout     : 5


Legacy ID



319344

