Volumes need fsck to run in HA / DR environment during failover

Created: 22 May 2009 • Updated: 21 May 2010 | 7 comments
This issue has been solved. See solution.

Hi All,
I have configured the cluster service group via the cluster configuration wizard, followed by the App and App rep service groups for my application and replication resources. I have configured a resource called Mount under the App service group.

Sometimes when a failover occurs, I see problems with the Mount resources, and fsck has to be run on the secondary server. The same problem occurs when I reboot my servers. While searching the web, I read that this is a known issue with VCS 5.0. Is that so?

If so, how can I fix this problem permanently? Running fsck sometimes removes my files from their original location and places inode entries in the lost+found directory. Because of this I have to reinstall my application, which should not be necessary.

Error snippet from Engine_A.log:

2009/03/25 14:54:11 VCS INFO V-16-2-13001 (bundle-sunfirev490-2) Resource(cscopx_Mount): Output of the completed operation (online)
UX:vxfs mount: ERROR: V-3-21252: not super user
/opt/CSCOpx_LMS32:
UX:vxfs mount: ERROR: V-3-21252: not super user
UX:vxfs fsck: ERROR: V-3-20003: Cannot open /dev/vx/rdsk/datadg/cscopx: Not owner
file system check failure, aborting ...
UX:vxfs mount: ERROR: V-3-21252: not super user
UX:vxfs fsck: ERROR: V-3-20003: Cannot open /dev/vx/rdsk/datadg/cscopx: Not owner
file system check failure, aborting ...
UX:vxfs mount: ERROR: V-3-21252: not super user
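
To see at a glance which VxFS messages dominate an excerpt like the one above, the message IDs can be tallied with a small shell function. This is only a sketch: count_vxfs_ids is a made-up helper name, not a VCS or VxFS command, and it assumes the excerpt has been saved to a text file and that awk supports match() (on Solaris that would be nawk or /usr/xpg4/bin/awk).

```shell
# Hypothetical helper (not part of VCS or VxFS): tally the VxFS message
# IDs (tokens of the form V-3-NNNNN) found in a saved log excerpt.
# Assumes an awk that supports match(), RSTART and RLENGTH.
count_vxfs_ids() {
    # $1 = path to a text file holding the log excerpt
    awk '
        # Only the first V-3-NNNNN token on each line is counted.
        match($0, /V-3-[0-9]+/) {
            seen[substr($0, RSTART, RLENGTH)]++
        }
        END {
            # Print each ID with the number of lines it appeared on.
            for (id in seen) print id, seen[id]
        }
    ' "$1" | sort
}
```

Run against the excerpt above, this should list V-3-21252 (the mount "not super user" message) and V-3-20003 (the fsck "Not owner" message), each with the number of lines it appears on.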

When this happens, I stop the cluster, manually run the following fsck commands on the cscopx and varcscopx volumes, and then start the cluster again so that it mounts the resource. After such a manual start, only my application is sometimes affected.

fsck -o full -F vxfs -y /dev/vx/dsk/datadg/varcscopx
fsck -o full -F vxfs -y /dev/vx/dsk/datadg/cscopx

Can anyone help me solve this problem?

Note: The logged-in user is a fully privileged user on that server.

With Regards,
Sri.

Gaurav Sangamnerkar

Hello Sri,

Is your replication consistent? Have you ever verified the data between primary and secondary?

The vradmin utility provides features to verify whether your data is consistent across primary and secondary.

Gaurav

PS: If you are happy with the answer provided, please mark the post as solution. You can do so by clicking link "Mark as Solution" below the answer provided.
 

Sridhar_sri

Yes Gaurav, the replication status is consistent and up to date.

I confirmed this with vradmin -g <disk group name> repstatus <RVG name>

I hope this is the right command to check.

One more question: if my server goes down for some reason while replication is in progress, my volume gets corrupted. Is there no way to use the volume on the secondary server in that case without running fsck?

With Regards,
Sri

Gaurav Sangamnerkar

Hello,

That command might show you that the link is up to date, but that doesn't confirm that the data on primary and secondary is consistent!

There should be options like "verify" to check the consistency of the data.

As for the next condition you mention, I believe fsck would be the right action: if the server goes down while writes are taking place, the file system will definitely be flagged as dirty, and hence an fsck will be needed.

Gaurav

Sridhar_sri

Hi Gaurav,
As you suggested, I tried the verify command in vradmin, and this is what I observed.

bash-3.00# vradmin verifydata LMS_HA_RVG 10.77.213.169 cache=cacheobj
Message from Primary:
VxVM VVR vradmin ERROR V-5-4-2411 Volumes under RVG LMS_HA_RVG are not prepared for the instant snapshot.
Message from Host 10.77.213.169:
VxVM VVR vradmin ERROR V-5-4-2411 Volumes under RVG LMS_HA_RVG are not prepared for the instant snapshot.

I think this is for taking a snapshot; I hope I am not wrong. There is also a verify option available in the vxrlink command, and the result is as follows.

bash-3.00# vxrlink -g datadg verify rlk_10.77.213.169_LMS_HA_RVG
RLINK                          REMOTE HOST      LOCAL HOST       STATUS   STATE
rlk_10.77.213.169_LMS_HA_RVG   10.77.213.169    10.77.213.173    OK       ACTIVE

This command seems to check the status of the RLINK.

Can you please tell me how these commands help me avoid running fsck on the secondary server?

Thanks,
Sri

Sridhar_sri

Hi,
Can anyone guide me to a solution for this?

With Regards,
Sri

Eric.Hennessey

Hi Sri,

The nature of those error messages suggests that you should open a support case. VCS always runs as root, so the message "not super user" points to a deeper problem than is likely to be resolved in this forum.

Eric

Business Continuity Solutions Evangelist

SOLUTION
Sridhar_sri

Hi Eric,
Thanks for your reply.

Thanks,

Sri.