
CVM slave node join takes much longer than usual on M9000

Created: 05 Aug 2013 | 2 comments

Hello there,

We have recently set up 2 M9000 nodes with the following configuration:
Sun Cluster 3.2 Update 3 + Infiniband heartbeat
Solaris 10 Update 11 + MPxIO
Veritas Volume Manager CVM 5.1 SP1 RP3 P2
VSP storage; data disk groups copied with HDS TrueCopy

We have a total of 9 disk groups. Disk groups 1-7 are relatively "small", with several hundred volumes each; disk group 8 has 1300 volumes, and disk group 9 has more than 5000 volumes.

This software configuration worked normally on the old E25K servers (we did not install RP3 there, and the OS was Solaris 10 Update 8 + EIS 2010/10). When we rebooted either node, the CVM slave node join took less than 10 minutes, which is acceptable, although it generated some of the same messages as the new environment. After we copied the data from the old storage to the new storage with HDS TrueCopy, the slave node join on the M9000 takes far longer: almost 4 hours.

If we import only disk groups 1-8, the CVM_Step4 phase of the slave join takes only 1 minute and generates few error messages. However, when we imported disk group 9 on the master and rebooted the slave node, the slave join took 4 hours again. The problem therefore seems to be tied to the large disk group.

typical error messages:
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 943215 kern.notice] NOTICE: VxVM vxio V-5-3-1540 cfg: object vol4339 already exists in new object list. nothing to do.
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_141-257 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_146-267 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv0_18-136 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_141-258 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_173-73 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_147-248 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_175-66 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv0_18-159 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_181-59 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv0_14-149 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_176-75 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv0_21-141 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_144-263 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_180-78 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_180-62 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv0_16-153 missing, drop
Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv0_16-167 missing, drop
Aug 5 11:24:54 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_177-60 missing, drop
Aug 5 11:24:54 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv0_14-150 missing, drop
Aug 5 11:24:54 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_177-76 missing, drop
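To see which notices dominate the slave's log during the join, the lines can be tallied by VxVM message ID. This is only a minimal Python sketch: the function name `count_vxvm_ids` and the `SAMPLE` list are illustrative, and the sample lines are copied from the excerpt above rather than from the full log.

```python
import re
from collections import Counter

# Matches VxVM message IDs such as "V-5-3-1435" inside a syslog line.
VXVM_ID = re.compile(r"\bV-\d+-\d+-\d+\b")

# A few lines copied from the excerpt above (illustrative sample only).
SAMPLE = [
    "Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 943215 kern.notice] NOTICE: VxVM vxio V-5-3-1540 cfg: object vol4339 already exists in new object list. nothing to do.",
    "Aug 5 11:24:52 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_141-257 missing, drop",
    "Aug 5 11:24:54 hq-boss-sdb-3 vxio: [ID 712535 kern.notice] NOTICE: VxVM vxio V-5-3-1435 msg_cfg_object: parent of diskv1_177-76 missing, drop",
]


def count_vxvm_ids(lines):
    """Return a Counter mapping each VxVM message ID to its number of occurrences."""
    counts = Counter()
    for line in lines:
        match = VXVM_ID.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


if __name__ == "__main__":
    for msg_id, n in count_vxvm_ids(SAMPLE).most_common():
        print(msg_id, n)
```

Run against the whole messages file, the same tally would show how many objects the slave dropped with V-5-3-1435 during the join.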

master node when 2 nodes boot together:
Aug 5 10:15:09 hq-boss-sdb-4 ID[vxclust]: [ID 595215 local0.error] vol_set_membership:Reconf Seqno:1
Aug 5 10:15:09 hq-boss-sdb-4 ID[vxclust]: [ID 701594 local0.error] VOLCVM_CONFIG Successful
Aug 5 10:15:09 hq-boss-sdb-4 ID[vxclust]: [ID 246066 local0.error] New Master:1
Aug 5 10:15:09 hq-boss-sdb-4 ID[vxclust]: [ID 587906 local0.error] ending step step2 time: 08/05 10:15:09.283:
Aug 5 10:15:10 hq-boss-sdb-4 ID[vxclust]: [ID 514886 local0.error] starting step3 time: 08/05 10:15:10.311: seq # 1
Aug 5 10:15:36 hq-boss-sdb-4 ID[vxclust]: [ID 986289 local0.error] ending step step3 time: 08/05 10:15:36.671:
Aug 5 10:15:36 hq-boss-sdb-4 ID[vxclust]: [ID 770722 local0.error] starting step4 time: 08/05 10:15:36.875: seq # 1
Aug 5 11:26:00 hq-boss-sdb-4 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-14597 join completed for node 1 reconfig 1
Aug 5 11:26:00 hq-boss-sdb-4 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-4123 cluster established successfully
Aug 5 11:26:03 hq-boss-sdb-4 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-5967 req_config: clearing R_CACHE on record 63ba58 type 6
Aug 5 11:26:03 hq-boss-sdb-4 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-5967 req_config: clearing R_CACHE on record 63c138 type 6
Aug 5 11:26:03 hq-boss-sdb-4 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-5967 req_config: clearing R_CACHE on record 69f8c0 type 6
Aug 5 11:26:03 hq-boss-sdb-4 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-5967 req_config: clearing R_CACHE on record 69ffa0 type 6
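The time spent in step4 can be read off the master log by subtracting the "starting step4" timestamp from the "join completed" timestamp. The Python sketch below does that under a few assumptions: the helper names are hypothetical, the year 2013 is taken from the date of this post because syslog omits it, and month abbreviations are parsed in an English locale.

```python
from datetime import datetime

# Two lines copied from the master log excerpt above (illustrative sample only).
MASTER_LOG = [
    "Aug 5 10:15:36 hq-boss-sdb-4 ID[vxclust]: [ID 770722 local0.error] starting step4 time: 08/05 10:15:36.875: seq # 1",
    "Aug 5 11:26:00 hq-boss-sdb-4 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-14597 join completed for node 1 reconfig 1",
]


def syslog_time(line, year=2013):
    """Parse the leading 'Mon D HH:MM:SS' syslog timestamp (year is an assumption)."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def step4_duration(lines):
    """Return the time between 'starting step4' and the next 'join completed', or None."""
    start = None
    for line in lines:
        if start is None and "starting step4" in line:
            start = syslog_time(line)
        elif start is not None and "join completed" in line:
            return syslog_time(line) - start
    return None


if __name__ == "__main__":
    print(step4_duration(MASTER_LOG))  # prints 1:10:24
```

For the excerpt above, where both nodes booted together, the gap between the two timestamps is 1 hour 10 minutes 24 seconds.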

The vxclust step4 process hung for 4 hours, and rac-framework-rs waited the same length of time for the error messages to end. All of the msg_cfg_object messages appear only on the slave node. After that, both nodes return to normal and all data can be mounted by Oracle RAC.

Does anyone know how to resolve this issue?

Regards
Edwin


Daniel Matheus

Hi Edwin,

Can you please check that the CVM protocol version is the same on all nodes?

The procedure in this article will recreate the volboot file:

http://www.symantec.com/business/support/index?pag...

You might also need to upgrade the protocol version and the diskgroup version:

http://sfdoccentral.symantec.com/sf/5.0/solaris64/...

 

Thanks,
Dan

Edwin Ye

Hi Dan,

Thank you for your reply.

The protocol version is the same on both nodes:

root@hq-boss-sdb-3 # vxdctl list
Volboot file
version: 3/1
seqno:   0.2
cluster protocol version: 100
hostid:  hq-boss-sdb-3
hostguid:  {ae98d11a-f692-11e2-a57e-0010e00ae9a0}

root@hq-boss-sdb-3 # more /etc/vx/volboot
volboot 3.1 0.2 100
hostid hq-boss-sdb-3
hostguid {ae98d11a-f692-11e2-a57e-0010e00ae9a0}
request_threads 2
Command_Shipping 1
end

root@hq-boss-sdb-4 # vxdctl list
Volboot file
version: 3/1
seqno:   0.1
cluster protocol version: 100
hostid:  hq-boss-sdb-4
hostguid:  {b068cbd0-f692-11e2-a09b-0010e00aec20}
root@hq-boss-sdb-4 # more /etc/init/volboot
/etc/init/volboot: Not a directory
root@hq-boss-sdb-4 # more /etc/vx/volboot
volboot 3.1 0.1 100
hostid hq-boss-sdb-4
hostguid {b068cbd0-f692-11e2-a09b-0010e00aec20}
request_threads 2
Command_Shipping 1
end
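The two `vxdctl list` outputs can also be compared field by field instead of by eye. The following Python sketch assumes the command output has been pasted into strings; the function names are hypothetical, and `hostid` and `hostguid` are ignored because they are expected to differ between nodes.

```python
# `vxdctl list` output copied from the two nodes above.
NODE3 = """Volboot file
version: 3/1
seqno:   0.2
cluster protocol version: 100
hostid:  hq-boss-sdb-3
hostguid:  {ae98d11a-f692-11e2-a57e-0010e00ae9a0}
"""

NODE4 = """Volboot file
version: 3/1
seqno:   0.1
cluster protocol version: 100
hostid:  hq-boss-sdb-4
hostguid:  {b068cbd0-f692-11e2-a09b-0010e00aec20}
"""


def parse_vxdctl_list(text):
    """Turn 'key: value' lines into a dict; lines without a colon are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def differing_fields(a, b, ignore=("hostid", "hostguid")):
    """Return {field: (value_a, value_b)} for fields whose values differ."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if key not in ignore and a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    n3, n4 = parse_vxdctl_list(NODE3), parse_vxdctl_list(NODE4)
    print("protocol match:", n3["cluster protocol version"] == n4["cluster protocol version"])
    print("differences:", differing_fields(n3, n4))
```

On the output above, the cluster protocol version matches (100 on both nodes) and the only remaining difference is the volboot `seqno` (0.2 versus 0.1).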

 

The disk group version is the same as on the old system: version 160.

root@hq-boss-sdb-4 # vxdctl support
Support information:
  vxconfigd_vrsn:   32
  dg_minimum:       20
  dg_maximum:       160
  kernel:           32
  protocol_minimum: 90
  protocol_maximum: 100
  protocol_current: 100
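As a quick sanity check on the `vxdctl support` values, the current protocol version can be compared against the supported range. This is a small Python sketch with hypothetical helper names, working on the output pasted as a string.

```python
# `vxdctl support` output copied from above.
SUPPORT = """Support information:
  vxconfigd_vrsn:   32
  dg_minimum:       20
  dg_maximum:       160
  kernel:           32
  protocol_minimum: 90
  protocol_maximum: 100
  protocol_current: 100
"""


def parse_support(text):
    """Collect the numeric 'key: value' pairs from `vxdctl support` output."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[key.strip()] = int(value.strip())
    return values


def protocol_in_range(values):
    """True if protocol_current lies within [protocol_minimum, protocol_maximum]."""
    return values["protocol_minimum"] <= values["protocol_current"] <= values["protocol_maximum"]


if __name__ == "__main__":
    v = parse_support(SUPPORT)
    print("in range:", protocol_in_range(v))
    print("at maximum:", v["protocol_current"] == v["protocol_maximum"])
```

Here the current protocol version (100) equals the reported maximum, and the disk group version of 160 equals `dg_maximum`.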

 

Regards
Edwin