VVR for Solaris : Logs synchronization issue on failback to original primary

Created: 20 Feb 2012 • Updated: 09 Apr 2012 | 7 comments
data_guy's picture
This issue has been solved. See solution.

Hi,

I am facing a problem while failing back to the original primary. I set up replication of Oracle data (binaries and database) between 2 nodes running Solaris 10. I tested replication and brought the 2nd node up after failover (reversing the roles of the primary and secondary nodes). Then I added a table in Oracle and allowed time for the changes to replicate back to the original primary. But when I fail back to the original primary with the same steps that I used for failover, Oracle throws the error "ora-00322  control file is not a current copy". Kindly advise whether I am missing something, or whether the failback process needs to differ from the actual failover.

Thanks in advance for any expert opinions.

mikebounds's picture

Can you provide the following additional details:

  1. How did you initially synchronise - did you use autosync?
  2. When you added the table in Oracle, did you have to add space to the diskgroup first, or was it added to existing tablespace or diskgroup space?
  3. You say you "allowed time to replicate", so can you confirm this means VVR is in asynchronous mode and that you waited and checked that VVR was up to date?
  4. When you are reversing replication, are you doing a migrate (graceful reversal of roles while both nodes are up) or a takeover (while one node is down) followed by a fast failback resync (change old primary to secondary and resync changes)?

If you are using the CLI, can you give the commands for the above? (I only need the Veritas commands, so if step 2 is only Oracle steps then I don't need those.)

Can you also provide output of:

 vxprint -VPl 

Mike

UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows

If this post has answered your question then please click on "Mark as solution" link below

g_lee's picture

What version of VVR are you using?

What commands did you use to failover from primary to secondary (and to fail back again)? What command (if any) did you use to verify the failback/migration back to the primary was complete / sync was up to date before attempting to start Oracle?

If this post has helped you, please vote or mark as solution

Gaurav Sangamnerkar's picture

The VVR version and the information requested above are needed to give constructive feedback.

I believe the important part is how you are doing the failover: with migrate or with takeover?

I have seen several times that an incorrect failback sync can cause inconsistency.

I hope the procedures below have been read and followed:

Failing back using fast failback synchronization

https://sort.symantec.com/public/documents/sfha/5....

or

Failing back using difference-based synchronization

https://sort.symantec.com/public/documents/sfha/5....

Gaurav

PS: If you are happy with the answer provided, please mark the post as solution. You can do so by clicking link "Mark as Solution" below the answer provided.
 

data_guy's picture

Thanks Mike, g_lee and Gaurav for replies !

@ Mike:

1. Yes I used autosync.

2. Table was added in existing space. No additional space added for it.

3. Yes, replication is async and I verified that the status is up to date.

4. No, I didn't use vradmin migrate or takeover. (I did the failback with traditional commands to detach, disassociate, change roles, re-connect and verify replication. It worked fine for simple file replication.)

(I'll share the vxprint output as soon as I get a chance to access the system.)

@ Gaurav:

No, I didn't use takeover/migrate. I just used the procedure mentioned above. Please see the commands below.

@ g_lee:

To verify replication completion I used "vxrlink -g dgY -i 2 status rlinkY" and it showed up to date. These were my steps for failover/failback:

# Shut down Oracle on original primary node-A

# Unmount data volumes on primary node
umount /u01
umount /u02

# Stop primary RVG (to stop replication)
vxrvg -g dgX stop rvgX

# Detach primary rlink on original primary node-A
vxrlink -g dgX det rlinkX

# Dissociate the SRL from the RVG
vxvol -g dgX dis srlX

# Change the RVG role to secondary
vxedit -g dgX set primary=false rvgX

# Re-associate the SRL, start the RVG and force-attach the rlink
vxvol -g dgX aslog rvgX srlX

vxrvg -g dgX start rvgX

vxrlink -g dgX -f att rlinkX

# On secondary node-B (to become the new primary)
vxrvg -g dgY stop rvgY

vxrlink -g dgY det rlinkY

vxvol -g dgY dis srlY

vxedit -g dgY set primary=true rvgY

# Update the rlink attributes for the reversed direction
vxedit -g dgY set remote_host=node-A local_host=node-B remote_dg=dgX remote_rlink=rlinkX rlinkY

vxedit -g dgY set primary_datavol=u03 u03
vxedit -g dgY set primary_datavol=u04 u04

vxvol -g dgY aslog rvgY srlY

vxrlink -g dgY -f att rlinkY

vxrvg -g dgY start rvgY

mount -o rw /dev/vx/dsk/dgY/u03 /u01
mount -o rw /dev/vx/dsk/dgY/u04 /u02

# Start Oracle
# Start the listener
# Change the host name in the tnsnames.ora and listener.ora files

mikebounds's picture

Why are you using such low-level commands?  I have been using VVR since it was introduced 10 years ago, when you HAD to use vxmake to create an RDS because "vradmin createpri" did not exist, but even then you still used vradmin to do a takeover or migrate.

Occasionally VVR gets into a mess and you can't use vradmin. In those cases I have used the lower-level commands vxrvg and vxrlink to sort things out, but I have not had to use the even lower-level vxedit.  I have never used vxedit on an RVG or on data volumes (for replication purposes), only on an rlink, because vradmin commands do not work when the rlink is not connected, so you have to use vxedit.

Hence I don't know what you mean by "traditional commands": I have never seen this procedure used before.  I often do things the non-standard way, where it may be more complicated but can be quicker if you know what you are doing. Using low-level commands instead of vradmin is not one of those cases: vradmin is quicker, and I still know what is going on "under the bonnet".

If you corrupt your data, as you have, using low level commands then Symantec Support probably won't help you, so I would recommend using supported procedures:

To gracefully failover use:

 vradmin migrate

To failover when primary is down use:

 vradmin takeover

and if old primary comes back, then resync changes using:

 vradmin fbsync 

You can only use fbsync if you have DCMs (recommended). If you don't, or if you get into problems, then use:

 vxrvg makeprimary

or

 vxrvg makesecondary

rather than using vxedit.
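
As a rough illustration of the supported sequences above, here is a minimal sketch that only prints the vradmin commands rather than running them. The function names plan_migrate and plan_takeover_failback are hypothetical helpers, and the disk group, RVG and host names are the ones used earlier in this thread; check the exact syntax against the VVR Administrator's Guide for your version before running anything.

```shell
#!/bin/sh
# Hypothetical dry-run helpers: they echo the commands, they do not execute them.

# Graceful role reversal while both nodes are up.
# $1 = disk group, $2 = local RVG name, $3 = host that becomes the new primary
plan_migrate() {
    dg=$1
    rvg=$2
    new_primary=$3
    # Check that the secondary is up to date before reversing roles.
    echo "vradmin -g $dg repstatus $rvg"
    # Run on the current primary, with Oracle stopped and file systems unmounted.
    echo "vradmin -g $dg migrate $rvg $new_primary"
}

# Failover when the primary is down, then resync once the old primary returns.
# $1 = disk group, $2 = local RVG name (both on the node taking over)
plan_takeover_failback() {
    dg=$1
    rvg=$2
    # Run on the secondary to make it the new primary.
    echo "vradmin -g $dg takeover $rvg"
    # Run on the new primary after the old primary is back (needs DCMs).
    echo "vradmin -g $dg fbsync $rvg"
}

plan_migrate dgX rvgX node-B
plan_takeover_failback dgY rvgY
```

Printing the plan first makes it easy to review which node each command belongs on before anything touches the RVG.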

Mike

UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows

If this post has answered your question then please click on "Mark as solution" link below

SOLUTION
data_guy's picture

Thanks Mike.

Let me do it the standard way as advised (through vradmin) and share the result, so that I can mark it as the solution.

sky410's picture

I agree with Mike Bounds' opinion.

You need to check two items.

First, the mount points should be unmounted.

Second, the replication status should be up to date.

Finally, you had better use the command "vradmin migrate" instead of all the other commands.
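
The two checks above could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, and it assumes that the text printed by "vxrlink status" contains the phrase "is up to date" when the rlink has caught up, which should be confirmed on your VVR version. The sample strings at the end stand in for real command output.

```shell
#!/bin/sh
# Hypothetical pre-migrate checks that work on captured command output.

# $1 = text printed by "vxrlink -g <dg> status <rlink>"
# Assumption: an up-to-date rlink is reported with the words "is up to date".
rlink_up_to_date() {
    case $1 in
        *"is up to date"*) return 0 ;;
        *) return 1 ;;
    esac
}

# $1 = mount point, $2 = text printed by "mount"
# Succeeds if any whitespace-separated field equals the mount point.
is_mounted() {
    mp=$1
    for field in $2; do
        [ "$field" = "$mp" ] && return 0
    done
    return 1
}

# Sample data in place of live output (illustrative only).
sample_status="Rlink rlinkY is up to date"
sample_mounts="/ on /dev/dsk/c0t0d0s0 read/write"

if rlink_up_to_date "$sample_status" && ! is_mounted /u01 "$sample_mounts"; then
    echo "pre-checks passed"
fi

# On a real node the inputs would come from the commands themselves, e.g.:
#   status=$(vxrlink -g dgY status rlinkY 2>&1)
#   mounts=$(mount)
```

Note that the status command is called without "-i 2" here, since the interval option keeps repeating the status and would never return inside a command substitution.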

Thank you.