
Volumes in status of vxrecover after every restart

Created: 02 Apr 2013 • Updated: 29 Apr 2013 | 5 comments
yairz
This issue has been solved. See solution.

Hi all,

 

I'm in need of some advice, if anyone can help:

I have a few environments of SUN Solaris servers connected to SUN Storage in a clustered environment.

The version is Storage Foundation HA 5.1 SP1.

The environment contains three volumes, each of which is mirrored across disks from two storage arrays.

It appears that every time I restart the environment, or even just run hastop and hastart, the following happens:

[root@VN-HAN01-PTXASDB-02: /]# vxtask list
TASKID  PTID TYPE/STATE    PCT   PROGRESS
   165           PARENT/R  0.00% 1/0(1) VXRECOVER
   166   165     ATCOPY/R 00.72% 0/419346432/3010560 PLXATT account_vol account_vol-02 dg_acct

This causes the DG resources in the cluster to not complete the online procedure and eventually fail due to a timeout, even though I can manually start the volumes and mount them.

To work around this, I freeze the SG and manually start the system until the task finishes, and then start the resources in the cluster.
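
The workaround described above could be scripted roughly as follows. This is only a sketch, assuming a bash or ksh shell; the function names count_vx_tasks and wait_for_vx_tasks and the default polling interval are made up for illustration, and it treats every line of "vxtask list" output that starts with a numeric task ID as a task still in progress.

```shell
# Count the tasks in "vxtask list" output read from stdin.
# Task lines start with a numeric TASKID; the header line does not.
count_vx_tasks() {
    awk '$1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }'
}

# Poll "vxtask list" until no tasks remain.
# $1 is the polling interval in seconds (hypothetical default: 30).
wait_for_vx_tasks() {
    interval=${1:-30}
    while [ "$(vxtask list | count_vx_tasks)" -gt 0 ]; do
        sleep "$interval"
    done
}

# Possible usage around the manual start (group name taken from main.cf):
#   hagrp -freeze ACCT-SG
#   ... start the volumes and mount them by hand ...
#   wait_for_vx_tasks 60
#   hagrp -unfreeze ACCT-SG
```

Note that the loop waits for all VxVM tasks on the host, not only the VXRECOVER ones, so it may wait longer than strictly needed if other tasks are running.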

My question is: what can cause this issue?

 

Any help would be much appreciated.

Thanks,

Yair

 



mikebounds

Can you provide your main.cf and "vxprint -th" output?

Mike

UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows


stinsong

This means that plex account_vol-02 of volume account_vol is not completely synchronized with its mirror. If you see exactly the same thing every time you run hastart, I think it is related to the way you stop VCS or shut down.

Usually it means that the data was not in sync between the two plexes of a mirrored volume when the volume was stopped or the DG was deported. Considering that your mirror is set up across two disk arrays, I/O performance between them may be poor, so the mirrored volume cannot sync its data in time before the DG is deported or stopped.

You must wait and confirm that the task for volume account_vol completes. You can check this with "vxprint -ht": the volume's KSTATE and STATE should be ENABLED and ACTIVE. Then check whether the event repeats next time.
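
As a rough illustration of that check, the volume records can be pulled out of the "vxprint -ht" output with awk. This is a sketch only; the function names vx_volume_states and vx_volumes_healthy are hypothetical, and it assumes the column order shown in the "V  NAME  RVG/VSET/CO  KSTATE  STATE" header line, where volume records start with a lowercase "v".

```shell
# Print "name KSTATE STATE" for every volume record read from stdin.
# Volume records begin with a lowercase "v"; the legend line uses "V".
vx_volume_states() {
    awk '$1 == "v" { print $2, $4, $5 }'
}

# Exit status 0 only if every volume on stdin is ENABLED and ACTIVE.
vx_volumes_healthy() {
    vx_volume_states |
        awk '$2 != "ENABLED" || $3 != "ACTIVE" { bad = 1 } END { exit bad + 0 }'
}

# Possible usage:
#   vxprint -g dg_acct -ht | vx_volume_states
#   vxprint -g dg_acct -ht | vx_volumes_healthy && echo "all volumes in sync"
```

A volume that is still resynchronizing shows a STATE such as SYNC instead of ACTIVE, so the second function returns a non-zero status until the task is done.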

yairz

Hi Mike and Stinsong,

Thanks for your replies.

 

The system was on its way to its target datacenter, so I did not have a connection to the servers until today.

 

I performed the same test and had the same result. The output Mike requested is detailed below.

Usually I either run the standard shutdown command to reboot the servers, or run "hastop -all" and then hastart on each system. This is no different from any other setup I have, and I haven't seen this behaviour elsewhere.

I do take Stinsong's remark regarding I/O between storage arrays under consideration and I will check it.

 

Below is an excerpt of main.cf with the Service Groups containing the DGs with this issue:

 

group ACCT-SG (
        SystemList = { PreProd-PTXDB-01 = 0, PreProd-PTXDB-02 = 1 }
        AutoStartList = { PreProd-PTXDB-01, PreProd-PTXDB-02 }
        )

        Application acctd_app (
                User = iprs
                StartProgram = "/usr/local/iprs/bin/acctd_runner.sh start"
                StopProgram = "/usr/local/iprs/bin/acctd_runner.sh stop"
                PidFiles = { "/var/iprs/acctd.pid" }
                RestartLimit = 3
                )

        DiskGroup acct-dg (
                DiskGroup = dg_acct
                )

        IPMultiNIC acct-vip (
                Address = "172.19.41.71"
                NetMask = "255.255.255.240"
                MultiNICResName = IPRS_MNIC_res
                IfconfigTwice = 1
                )

        Mount acct-mnt (
                MountPoint = "/export/account"
                BlockDevice = "/dev/vx/dsk/dg_acct/account_vol"
                FSType = vxfs
                FsckOpt = "-y"
                )

        Proxy acct-nic (
                TargetResName = IPRS_MNIC_res
                )

        acct-mnt requires acct-dg
        acct-vip requires acct-nic
        acctd_app requires acct-mnt
        acctd_app requires acct-vip

 

.

.

.

.

group IPRS-DB (
        SystemList = { PreProd-PTXDB-01 = 0, PreProd-PTXDB-02 = 1 }
        AutoStartList = { PreProd-PTXDB-01, PreProd-PTXDB-02 }
        )

        DiskGroup backup-dg (
                Critical = 0
                DiskGroup = dg_backup
                )

        DiskGroup ora-dg (
                DiskGroup = dg_data
                )

        IPMultiNIC ora-vip (
                Address = "172.19.41.70"
                NetMask = "255.255.255.240"
                MultiNICResName = IPRS_MNIC_res
                IfconfigTwice = 1
                )

        Mount backup-mnt (
                Critical = 0
                MountPoint = "/export/backup"
                BlockDevice = "/dev/vx/dsk/dg_backup/backup_vol"
                FSType = vxfs
                FsckOpt = "-y"
                )

        Mount ora-mnt (
                MountPoint = "/data1"
                BlockDevice = "/dev/vx/dsk/dg_data/data_vol"
                FSType = vxfs
                FsckOpt = "-y"
                )

        Netlsnr ora-lsnr (
                Owner = oracle
                Home = "/export/oracle/product/11.2.0/dbhome_1"
                TnsAdmin = "/export/oracle/product/11.2.0/dbhome_1/network/admin/"
                Listener = LISTENER
                )

        Oracle ora-iprsdb (
                Sid = IPRSDB
                Owner = oracle
                Home = "/export/oracle/product/11.2.0/dbhome_1"
                Pfile = "/export/oracle/product/11.2.0/dbhome_1/dbs/initIPRSDB.ora"
                )

        Proxy ora-nic (
                TargetResName = IPRS_MNIC_res
                )

        backup-mnt requires backup-dg
        ora-iprsdb requires ora-mnt
        ora-lsnr requires ora-iprsdb
        ora-lsnr requires ora-vip
        ora-mnt requires ora-dg
        ora-vip requires ora-nic

The following is the output of "vxdg list", "vxtask list" and "vxprint -th", as requested by Mike:

 

[root@PreProd-PTXDB-01: /]# vxdg list
NAME         STATE           ID
dg_backup    enabled,cds          1356387368.26.PreProd-PTXDB-01
[root@PreProd-PTXDB-01: /]#
[root@PreProd-PTXDB-01: /]# vxtask list
TASKID  PTID TYPE/STATE    PCT   PROGRESS
   161           PARENT/R  0.00% 1/0(1) VXRECOVER
   162   161   RDWRBACK/R 05.26% 0/733902848/38580224 RESYNC backup_vol dg_backup
[root@PreProd-PTXDB-01: /]# vxprint -th
Disk group: dg_backup

DG NAME         NCONFIG      NLOG     MINORS   GROUP-ID
ST NAME         STATE        DM_CNT   SPARE_CNT         APPVOL_CNT
DM NAME         DEVICE       TYPE     PRIVLEN  PUBLEN   STATE
RV NAME         RLINK_CNT    KSTATE   STATE    PRIMARY  DATAVOLS  SRL
RL NAME         RVG          KSTATE   STATE    REM_HOST REM_DG    REM_RLNK
CO NAME         CACHEVOL     KSTATE   STATE
VT NAME         RVG          KSTATE   STATE    NVOLUME
V  NAME         RVG/VSET/CO  KSTATE   STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT    NCOL/WID MODE
SD NAME         PLEX         DISK     DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
SV NAME         PLEX         VOLNAME  NVOLLAYR LENGTH   [COL/]OFF AM/NM    MODE
SC NAME         PLEX         CACHE    DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
DC NAME         PARENTVOL    LOGVOL
SP NAME         SNAPVOL      DCO
EX NAME         ASSOC        VC                       PERMS    MODE     STATE
SR NAME         KSTATE

dg dg_backup    default      default  4000     1356387368.26.PreProd-PTXDB-01

dm dg_backup01  st2540-3_1   auto     65536    733904640 -
dm st2540-2_1   st2540-2_1   auto     65536    733904640 -

v  backup_vol   -            ENABLED  SYNC     733902848 SELECT   -        fsgen
pl backup_vol-01 backup_vol  ENABLED  ACTIVE   733902848 CONCAT   -        RW
sd st2540-2_1-01 backup_vol-01 st2540-2_1 0    733902848 0        st2540-2_1 ENA
pl backup_vol-02 backup_vol  ENABLED  ACTIVE   733902848 CONCAT   -        RW
sd dg_backup01-01 backup_vol-02 dg_backup01 0  733902848 0        st2540-3_1 ENA

 

I only listed the parts related to dg_backup; I can provide similar information on the other DGs if required.

I'll be happy to hear your thoughts.

Thanks,

Yair

stinsong

Yes, yairz. I believe that once the sync task completes, restarting VCS (deporting and importing the DG) will not start the data sync again. :)

Yasuhisa Ishikawa

Please try adding a Volume resource between the Mount and DiskGroup resources. It must be added as a child of the Mount resource and a parent of the DiskGroup resource. Also set the StartVolumes and StopVolumes attributes of the DiskGroup resource to 0.
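
For the ACCT-SG group from the main.cf above, that change might look something like the following from the command line. This is a hedged sketch, not a tested procedure: the helper names run and add_volume_resource and the resource name acct-vol are made up for illustration, and the script only prints the haconf/hares commands unless APPLY=1 is set, so they can be reviewed first.

```shell
# Echo a command; execute it only when APPLY=1 (dry run by default).
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

# Insert a Volume resource between an existing Mount and DiskGroup resource.
# Args: group, DiskGroup resource, Mount resource, new Volume resource name
#       (hypothetical), disk group name, volume name.
add_volume_resource() {
    group=$1; dg_res=$2; mnt_res=$3; vol_res=$4; dg_name=$5; vol_name=$6
    run haconf -makerw
    run hares -add "$vol_res" Volume "$group"
    run hares -modify "$vol_res" DiskGroup "$dg_name"
    run hares -modify "$vol_res" Volume "$vol_name"
    run hares -modify "$vol_res" Enabled 1
    # Let the Volume resource, not the DiskGroup agent, start and stop volumes.
    run hares -modify "$dg_res" StartVolumes 0
    run hares -modify "$dg_res" StopVolumes 0
    # Rewire the dependency: Mount -> Volume -> DiskGroup.
    run hares -unlink "$mnt_res" "$dg_res"
    run hares -link "$vol_res" "$dg_res"
    run hares -link "$mnt_res" "$vol_res"
    run haconf -dump -makero
}

# Dry run for the account volume:
#   add_volume_resource ACCT-SG acct-dg acct-mnt acct-vol dg_acct account_vol
```

The same pattern would apply to backup_vol and data_vol in the IPRS-DB group. The sketch does not stop on errors, so with APPLY=1 each step should be checked, for example with "hares -display" on the new resource.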

Authorized Symantec Consultant(ASC) Data Protection in Tokyo, Japan

SOLUTION