SRL overflow.

Created: 24 Feb 2014 • Updated: 20 Mar 2014 | 3 comments
This issue has been solved. See solution.

Hello,

We have a disaster recovery setup with a 2-node (Solaris) cluster at prod and 1 node at DR. The setup uses VVR for data replication and VCS Geo-cluster for resource group failover (SFHA 6.0).

My questions are about VVR:

1.  In case of async replication, what is the frequency of data replication, and can this frequency be tuned?

2.   If we lose network connectivity between the prod and DR sites, the SRL fills up, and then the DCM also fills up, what is the effect on the application service group whose data is being replicated? Will writes to the application stop, will the application service group be taken offline, or will application performance be affected while the application keeps running?

3.  Once connectivity between the prod and DR sites is restored, would a full resync of the RVG be required to bring the data back in sync?


Gaurav Sangamnerkar

My responses below

1.  In case of async replication, what is the frequency of data replication, and can this frequency be tuned?

>> Replication happens continuously as data is written to the volumes. Whenever the application writes to the data volumes, a copy of each write goes to the SRL volume and is then replicated to the DR site. To my knowledge you can't tune this, though other parameters, such as bandwidth allocation, can be controlled. Out of curiosity, why would you like to control replication at this level?

2.   If we lose network connectivity between the prod and DR sites, the SRL fills up, and then the DCM also fills up, what is the effect on the application service group whose data is being replicated? Will writes to the application stop, will the application service group be taken offline, or will application performance be affected while the application keeps running?

>> First, once the SRL fills up, the DCM comes into play. The DCM (data change map) keeps track of the blocks where data has changed. So even when the SRL is full, the DCM records all the changed blocks, and you don't need to worry about the DCM getting full. You only need to make sure that a DCM log is associated with the data volumes and is adequately sized. If something does happen to the DCM, it won't affect your application groups, because writes will keep going to the data volumes; only replication will break. Server performance ideally shouldn't be affected either, although this may depend on what your server runs.
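To illustrate why a change map does not "fill up" the way a log does, here is a minimal Python sketch of a region bitmap. It is a toy model, not VVR's actual on-disk format: the class name, the 64 KB region size, and the 1 MB volume are all assumptions chosen for the example.

```python
class DataChangeMap:
    """Toy model of a DCM: one bit per fixed-size region of a data volume."""

    def __init__(self, volume_bytes, region_bytes):
        # The number of regions is fixed at creation time (ceiling division),
        # so the bitmap never grows, no matter how many writes arrive.
        self.region_bytes = region_bytes
        self.num_regions = -(-volume_bytes // region_bytes)
        self.bits = bytearray(-(-self.num_regions // 8))

    def mark_write(self, offset, length):
        """Flag every region touched by a write of `length` bytes at `offset`."""
        first = offset // self.region_bytes
        last = (offset + length - 1) // self.region_bytes
        for region in range(first, last + 1):
            self.bits[region // 8] |= 1 << (region % 8)

    def dirty_regions(self):
        """Count the regions that would have to be sent during a resync."""
        return sum(bin(byte).count("1") for byte in self.bits)


# Hypothetical sizes: a 1 MB volume tracked in 64 KB regions (16 regions).
dcm = DataChangeMap(volume_bytes=1024 * 1024, region_bytes=64 * 1024)
size_before = len(dcm.bits)

for _ in range(1000):
    dcm.mark_write(offset=70_000, length=4096)  # same region, over and over
dcm.mark_write(offset=0, length=200_000)        # spans regions 0 through 3

print(dcm.dirty_regions(), "dirty regions;", len(dcm.bits), "bytes of bitmap")
```

Rewriting the same region a thousand times sets the same bit each time, so the map stays the same size. The price is that the order of writes is lost, which is why a DCM resync sends whole regions instead of replaying individual writes.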

3.  Once connectivity between the prod and DR sites is restored, would a full resync of the RVG be required to bring the data back in sync?

>>> If the DCM is still active (after SRL overflow), you will need to flush the DCM to the DR site; a full resync is not required. You can do this by running the "vradmin -g <diskgroup> resync <rvg>" command.

If replication breaks in the meantime and the rlink status on the Primary is disturbed, you may need to trigger a full sync to attach the rlink back.
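If you want to script the DCM resync quoted above, a small Python wrapper could build the command line first so it can be reviewed before anything is run. This is only a sketch: the disk group and RVG names ("appdg", "app_rvg") are hypothetical placeholders, and the function just assembles the arguments without calling vradmin.

```python
import shlex


def build_resync_command(diskgroup, rvg):
    """Assemble the argument list for 'vradmin -g <diskgroup> resync <rvg>'."""
    return ["vradmin", "-g", diskgroup, "resync", rvg]


# Hypothetical names, for illustration only.
cmd = build_resync_command("appdg", "app_rvg")
print(shlex.join(cmd))
```

On the Primary, the list could then be passed to subprocess.run(cmd, check=True) once you have confirmed that the DCM is active and the link is back up.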

G


SOLUTION
mikebounds

VVR writes continually, NOT periodically. The only way to write periodically would be to pause and resume replication from a scheduler such as cron, which is not recommended. You can control replication using latency protection, which limits how far VVR can fall behind; see the latencyprot attribute in the VVR admin guide.
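As a rough picture of what bounding "how far VVR can get behind" means, here is a toy Python simulation of a high/low watermark throttle. It assumes that latency protection stalls new writes when the backlog reaches a high mark and resumes them when it drains to a low mark; the function name, the marks, and the tick-based timing are illustrative assumptions, so check the admin guide for the real semantics.

```python
def simulate_latency_protection(arrivals, drained_per_tick, high_mark, low_mark):
    """Return the replication backlog (in writes) after each tick.

    arrivals: writes the application tries to issue in each tick.
    drained_per_tick: writes the link can ship to the Secondary per tick.
    Simplification: writes that are stalled are simply not issued in this
    model; a real application would block and retry them later.
    """
    backlog, stalled, history = 0, False, []
    for wanted in arrivals:
        # Resume accepting writes once the backlog has drained to the low mark.
        if stalled and backlog <= low_mark:
            stalled = False
        if not stalled:
            accepted = min(wanted, high_mark - backlog)
            backlog += accepted
            # Hitting the high mark stalls further writes.
            if backlog >= high_mark:
                stalled = True
        backlog = max(0, backlog - drained_per_tick)
        history.append(backlog)
    return history


# Hypothetical load: 50 writes per tick offered, only 10 per tick replicated.
history = simulate_latency_protection(
    arrivals=[50] * 10, drained_per_tick=10, high_mark=100, low_mark=40
)
print(history)  # → [40, 80, 90, 80, 70, 60, 50, 40, 80, 90]
```

The backlog never exceeds the high mark, but the application pays for that with stalled writes whenever the link cannot keep up.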

The DCM cannot fill up, as it is simply a bitmap representing unsynced regions of the disks. When your SRL fills, performance will actually increase, not decrease. The reason is that once the SRL holds more than vol_rvio_maxpool_sz (default 128MB), which is the memory used for incoming writes, VVR has to drain the SRL by reading the writes back from the SRL volume rather than from memory. Continually reading from the SRL affects write performance to the SRL, which in turn affects your application, because application writes are not acknowledged until they have been written to the SRL. So while the SRL is above vol_rvio_maxpool_sz but not full, writes are impacted; once the SRL fills, replication stops, the SRL is no longer read, and your application writes are no longer impacted.

A full resync is not required when the SRL fills; only the regions marked in the DCM need to be resynchronized.
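The three regimes described above can be summarised in a small Python sketch. It only restates the explanation as a lookup and does not model VVR internals; the function name and the 1 GB SRL size are assumptions, and the 128 MB pool is the default quoted in this post.

```python
MB = 1024 * 1024


def replication_state(srl_backlog, mem_pool, srl_size):
    """Classify where pending updates are drained from, per the text above."""
    if srl_backlog >= srl_size:
        return "overflow: replication stops, DCM tracks changed regions"
    if srl_backlog > mem_pool:
        return "readback: updates re-read from the SRL volume, writes slowed"
    return "memory: updates sent straight from the in-memory pool"


# Hypothetical 1 GB SRL with the 128 MB pool mentioned in the post.
for backlog_mb in (64, 512, 1024):
    state = replication_state(backlog_mb * MB, mem_pool=128 * MB, srl_size=1024 * MB)
    print(f"{backlog_mb:>5} MB backlog -> {state}")
```

The middle regime is the one that hurts the application, which is why a slowly filling SRL can feel worse than one that has already overflowed.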

Mike

UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows
