
One RVG stays constantly behind and does not use maximum bandwidth to replicate data 100%

Created: 09 Jul 2012 | 8 comments
Zahid.Haseeb

Environment

Operating system version = Linux Redhat 6.2

Storage Foundation version = 6.0 with RP1

LinuxBox1 with two Ethernet interfaces (primary site)

LinuxBox2 with two Ethernet interfaces (DR site)

One NIC (eth0) is for user access and the second NIC (eth1) is dedicated to replication

Diskgroup = one

volumes = two

replicated volume groups = two (one replicated volume group per volume)

bandwidth link = 25Mbps

Bandwidth limit set = 10Mbps

protocol = UDP

packet size = 1500

Problem

One of the volumes in the disk group replicates properly: it can use the whole bandwidth for replication when required, and its timestamp under repstatus shows behind by 0 hours, 0 minutes and 0 seconds. The other volume, which is in the second replicated volume group, also uses as much bandwidth as it can when I start replication. Once it has fully synchronized, however, a few kilobytes always remain in the SRL; at that point it is using very little bandwidth, yet the SRL constantly shows a few kilobytes behind. The repstatus and vrstat output for that RVG is shown below. My question is: why does the SRL always hold a few KB even though no one is accessing the machine, and why does repstatus not show behind by 0 hours, 0 minutes and 0 seconds?

 

vradmin -g database-DG repstatus database-logs-rvg Mon Jul 9 18:29:28 2012

Replicated Data Set: database-logs-rvg
Primary:
  Host name:                  172.16.25.200
  RVG name:                   database-logs-rvg
  DG name:                    database-DG
  RVG state:                  enabled for I/O
  Data volumes:               1
  VSets:                      0
  SRL name:                   database-logs-srl
  SRL size:                   139.00 G
  Total secondaries:          1
Secondary:
  Host name:                  172.17.27.200
  RVG name:                   database-logs-rvg
  DG name:                    database-DG
  Data status:                consistent, behind
  Replication status:         replicating (connected)
  Current mode:               asynchronous
  Logging to:                 SRL ( 1738 Kbytes behind, 0 % full
  Timestamp Information:      behind by  1h 34m 6s

vrstat result shows the bandwidth utilization:

Bandwidth Utilization 32.00 Kbps.
Bandwidth Utilization 176.00 Kbps.
Bandwidth Utilization 40.00 Kbps.
Bandwidth Utilization 00.00 Kbps.
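
For a sense of scale, the backlog reported above is tiny relative to the link. Below is a minimal Python sketch, assuming the repstatus text has the layout pasted in this post. The regular expressions and function names are illustrative only and are not part of any Veritas tooling; the unit conversions (1 Kbyte = 1024 bytes, 1 Mbps = 1,000,000 bits/s) are also assumptions. It extracts the SRL backlog and the timestamp lag, then works out how long the backlog would take to drain at the configured 10 Mbps limit.

```python
import re

# Excerpt of the repstatus output pasted in this post.
REPSTATUS = """\
  Logging to:                 SRL ( 1738 Kbytes behind, 0 % full
  Timestamp Information:      behind by  1h 34m 6s
"""

def srl_backlog_kbytes(text):
    """Return the 'Kbytes behind' figure from repstatus text, or None."""
    m = re.search(r"SRL\s*\(\s*(\d+)\s*Kbytes behind", text)
    return int(m.group(1)) if m else None

def timestamp_lag_seconds(text):
    """Return the 'behind by Xh Ym Zs' lag in seconds, or None."""
    m = re.search(r"behind by\s+(\d+)h\s+(\d+)m\s+(\d+)s", text)
    if not m:
        return None
    hours, minutes, seconds = (int(g) for g in m.groups())
    return hours * 3600 + minutes * 60 + seconds

def drain_seconds(kbytes, limit_mbps):
    """Time to send the backlog at the given limit, ignoring protocol overhead.

    Assumes 1 Kbyte = 1024 bytes and 1 Mbps = 1,000,000 bits/s.
    """
    return kbytes * 1024 * 8 / (limit_mbps * 1_000_000)

backlog = srl_backlog_kbytes(REPSTATUS)
lag = timestamp_lag_seconds(REPSTATUS)
print(f"backlog: {backlog} Kbytes, timestamp lag: {lag} s")
print(f"drain time at 10 Mbps: {drain_seconds(backlog, 10):.2f} s")
```

Under those assumptions the backlog amounts to about a second and a half of transfer time, so the persistent lag is not explained by a shortage of bandwidth.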


mikebounds

There will always be a trickle of traffic for information like timestamps, even when there are no writes, but this should catch up, so I don't know why the rlink is not up to date. This suggests a bug, as it should not normally happen when there is sufficient bandwidth, which there obviously is. I have seen this sort of thing before on 5.x on Solaris, where vradmin showed a status of "replicating" but no data was being transferred, and an MP fixed it. Your issue looks a bit different, though, as some data is being transferred: vrstat shows some bandwidth use.

From your description and naming, it sounds like you MAY have created 2 RVGs in the same diskgroup, one for database logs and one for the datafiles. If this is the case, you should not do this as you must put the datafiles and logs in the same RVG for the database to stay consistent at the secondary.

Rebooting may resolve your problem, but I would log a call with Symantec as what you are seeing should not happen.

Mike

UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows

If this post has helped you, please vote or mark as solution

Gaurav Sangamnerkar

What about the writes arriving on the primary site volumes? Could writes be arriving constantly at a rate slightly higher than what is needed to achieve an up-to-date rlink? The chances are very remote, but it would be worth knowing.

It is only a minimal amount of data, and I believe it should not cause any issues; we are just trying to work out how to reach the ideal state.

Did you try doing an iostat analysis or vxdmpstat analysis for the volumes to see if there is any I/O going to the volume?

 

G

PS: If you are happy with the answer provided, please mark the post as solution. You can do so by clicking link "Mark as Solution" below the answer provided.
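
One way to check for background writes without extra tools is to compare two snapshots of /proc/diskstats, which is where iostat gets its counters on Linux. The following is a rough sketch only: the device names sdb and sdc and all counter values are made up for illustration, and the helper names are hypothetical. It relies on the standard /proc/diskstats layout, in which the third field is the device name, the eighth is writes completed, and the tenth is sectors written in 512-byte units.

```python
def parse_diskstats(text):
    """Map device name -> (writes_completed, sectors_written)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        stats[fields[2]] = (int(fields[7]), int(fields[9]))
    return stats

def write_delta(before, after):
    """Per-device (writes, bytes written) between two snapshots.

    Only devices that received at least one write are reported.
    """
    delta = {}
    for dev, (writes_after, sectors_after) in after.items():
        writes_before, sectors_before = before.get(dev, (writes_after, sectors_after))
        if writes_after != writes_before:
            delta[dev] = (writes_after - writes_before,
                          (sectors_after - sectors_before) * 512)
    return delta

def snapshot(path="/proc/diskstats"):
    """Read a live snapshot on a Linux host (not called in this demo)."""
    with open(path) as fh:
        return parse_diskstats(fh.read())

# Made-up snapshots taken some seconds apart; sdb and sdc are hypothetical.
BEFORE = """\
   8      16 sdb 1200 30 96000 800 500 10 40000 600 0 900 1400
   8      32 sdc 300 5 24000 200 900 20 72000 700 0 500 900
"""
AFTER = """\
   8      16 sdb 1200 30 96000 800 512 10 40096 610 0 905 1410
   8      32 sdc 300 5 24000 200 900 20 72000 700 0 500 900
"""

for dev, (writes, nbytes) in sorted(write_delta(parse_diskstats(BEFORE),
                                                parse_diskstats(AFTER)).items()):
    print(f"{dev}: {writes} writes, {nbytes} bytes")
```

On a real host, calling snapshot() twice with a pause in between would replace the sample strings. A disk that shows steady small writes while the application is idle would point to a source of the trickle.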
 

Zahid.Haseeb

Thanks, Mike, for your kind words.

you MAY have created 2 RVGs in the same diskgroup

Yes you are right

one for database logs and one for the datafiles

Yes 

if this is the case, you should not do this as you must put the datafiles and logs in the same RVG

One thing I would like to add to the information I provided: this is not MS SQL Server or Oracle style data and log files. It is an in-house application that has exe files and other files related to the application architecture in one volume, and history-related log files in another volume. Sometimes clients do not want to replicate the log files, so we have to stop replicating them, and the application can run without these log/history files. If we kept both volumes in the same RVG, we could not stop replication of the log volume alone.

you should not do this as you must put the datafiles and logs in the same RVG

Why are you emphasising that both volumes must be in the same RVG? I have done the same thing for many other clients on 5.x versions and things are going well.

Any comment will be appreciated. Mark as Solution if your query is resolved
__________________
Thanks in Advance
Zahid Haseeb

zahidhaseeb.wordpress.com

Zahid.Haseeb

@ G

Did you try doing an iostat analysis or vxdmpstat analysis for the volumes to see if there is any I/O going to the volume?

Thanks for your kind comments, G.

vxdmpstat? There are no SAN disks here. I actually have three local drives attached to both systems (the primary and the secondary site systems).

The first disk has the Linux 6.2 operating system and is outside Veritas Storage Foundation control.

Second disk = 300 GB

Third disk = 300 GB

(I created a disk group on the second and third disks. I made the two volumes to be replicated on the second disk and created the SRL volumes for both of them on the third disk.)

 

doing an iostat analysis

No, I did not try iostat.


mikebounds

Volumes within an RVG are kept consistent with each other. Suppose that in async mode you lose the primary at 12:05 and replication is one minute behind: the database data files will be as they were at 12:04, and the logs will be at exactly the same point in time, 12:04. If the data files and log files are in separate RVGs, the data files may be 2 minutes behind and the logs 1 minute behind; you could not recover your database, and your data is in effect corrupt because the volumes are not consistent. If, for your in-house application, the 2 volumes do not need to be at the same point in time, then it is OK to put them in separate RVGs.

Mike
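
The 12:05 example above can be illustrated with a toy model. This is only a sketch of the reasoning, not how VVR is implemented: the write list, the volume names "data" and "logs", and the secondary_state helper are all hypothetical. Each volume's lag stands in for how far behind its RVG was when the primary was lost.

```python
# Each write is (minutes after 12:00, volume, value written).
WRITES = [
    (1, "data", "txn-1"), (1, "logs", "txn-1"),
    (2, "data", "txn-2"), (2, "logs", "txn-2"),
    (3, "data", "txn-3"), (3, "logs", "txn-3"),
    (4, "data", "txn-4"), (4, "logs", "txn-4"),
]

def secondary_state(writes, failure_time, lag_by_volume):
    """Last write per volume that reached the secondary before the failure.

    lag_by_volume models how many minutes behind each volume's RVG was
    at the moment the primary was lost.
    """
    state = {}
    for when, volume, value in writes:
        if when <= failure_time - lag_by_volume[volume]:
            state[volume] = value
    return state

# One RVG: both volumes share a single lag of 1 minute.
same_rvg = secondary_state(WRITES, 5, {"data": 1, "logs": 1})
# Two RVGs: each lags independently (2 minutes and 1 minute).
split_rvg = secondary_state(WRITES, 5, {"data": 2, "logs": 1})

print("same RVG: ", same_rvg)
print("split RVGs:", split_rvg)
```

With a shared lag, both volumes end at the same transaction. With independent lags, the logs are one transaction ahead of the data files, which is the inconsistency described above.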


Zahid.Haseeb

Thanks, Mike, for your kind input. Yes, it is an in-house app, and the log files cannot adversely affect the data on the other volume.


Zahid.Haseeb

I restarted the secondary site node and it seems fixed. I don't know why or how...


mikebounds

As I said in my previous post, I think this is a bug, and I said rebooting may resolve your problem. Unfortunately, it is now hard for Symantec Support to see what the issue was. This will probably occur again, so next time you should log a call before rebooting so Symantec can investigate.

Mike
