VVR on Windows Performance/Latency

Created: 26 Jan 2011 • Updated: 31 Jan 2011 | 4 comments
sheldonl's picture
This issue has been solved. See solution.

Hello,

I'm setting up a system that requires RPO-0 disaster recovery between two database clusters. Because the distance between them is over 400km, I'm going to use VVR with a bunker about 30km from the primary database cluster. Communication between the primary DB cluster and the bunker will be synchronous (in sync override mode). Replication from primary to secondary will be async.

Does anyone here have any experience with such a configuration? If so, can you provide any feedback? Especially, how much of a performance hit have you seen on average?

 

Thanks.


Wally_Heim's picture

Hi Sheldonl,

I/O at the primary is going to be highly dependent on the link speed to your bunker site and the latency between the primary and bunker.  Basically, every write on the primary has to be written to the bunker SRL and acknowledged back to the primary before the I/O is completed on the primary.

 

As for failover performance to the DR site, it is dependent on the link speed to the primary or the bunker site.  The higher the link speed to the DR site, the faster failover will be if replication is in SRL mode.

 

I would recommend using VRAdvisor, which comes on the DVD, to capture disk I/O stats on the primary for a couple of weeks.  Then you can use it to run "what if" scenarios and see what increasing or decreasing the link speed will do to your expected failover/resync times.

 

Thanks,

Wally

sheldonl's picture

 

Thanks Wally. I've actually provisioned for a 1Gbps link. Vendor assures me 1ms latency each way, so I'm trying to estimate total latency added per I/O. If I have say 3ms each way once I'm done going through a switch and router then that's 6ms round trip. What I'm missing to add to this is the latency added by the SFW/VVR layer. I'd ultimately like to keep the additional latency under 10ms. 

I will run VRAdvisor, but my environment isn't built yet and my application hasn't been prototyped yet either. We are building a new system and trying to model this so we can determine whether or not we can meet our SLA objectives. If anyone has some general information or rules of thumb to help, it would be greatly appreciated.

Wally_Heim's picture

Hi Sheldonl,

 

If the environment and application are not created yet, then VRAdvisor will not be able to get a sampling of the application I/O to run "what if" analysis against.

 

It sounds like most of your concern is around throughput with VVR in place.  I would recommend setting up a configuration that is the same as, or very similar to, what you will be using, and running some I/O performance utilities against it to simulate expected I/O loads.  Standard I/O utilities on the primary will give you the numbers that you need for the primary and bunker site performance.

 

You can also monitor VVR to see how far behind it is with the DR site. 

 

Running the following command on the primary will give you the percentage of the SRL in use for the secondary, and it should also calculate how far behind (in hours, minutes and seconds) the secondary is based on current replication characteristics.

 

     vxrlink -i 5 status <rlink>

 

If you cannot set up VVR similar to your final configuration, then you can run I/O performance utilities at both the primary and bunker site and calculate a rough I/O latency that your application will see.

 

You will need to add the I/O latency at both the primary and bunker sites to 2 to 4 times the network latency.  This should get you close to the I/O performance that your application will see.
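
As a rough illustration of the estimate above, here is a minimal Python sketch. The function name and the 0.5 ms disk latencies are assumptions for illustration only, not measured values; the 3 ms one-way network figure comes from the earlier post in this thread. It also assumes that "network latency" means the one-way latency, which is an interpretation rather than something VVR documents.

```python
def estimate_write_latency_ms(primary_io_ms, bunker_io_ms,
                              one_way_network_ms, network_factor):
    """Rough per-write latency seen by the application with a synchronous bunker.

    network_factor is the multiplier applied to the one-way network latency
    (2 to 4, following the rule of thumb above).
    """
    if not 2 <= network_factor <= 4:
        raise ValueError("network_factor is expected to be between 2 and 4")
    return primary_io_ms + bunker_io_ms + network_factor * one_way_network_ms


# Hypothetical inputs: 0.5 ms disk latency at each site, 3 ms one-way network.
best = estimate_write_latency_ms(0.5, 0.5, 3.0, 2)
worst = estimate_write_latency_ms(0.5, 0.5, 3.0, 4)
print(f"estimated write latency: {best:.1f} ms to {worst:.1f} ms")
```

With these placeholder numbers, only the low end of the range stays under a 10 ms budget, so the measured disk and network latencies matter a great deal.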

 

Without knowing the I/O load, it is impossible to say whether your link to the secondary site will be enough to keep it up to date, or near up to date, for your needs.

 

Thanks,

Wally

 

 

 

SOLUTION
mikebounds's picture

A few points:

  1. If your RPO is 0 then really replication should be configured as hard synchronous (synchronous=fail), otherwise you cannot guarantee your SLA (the network could go down, meaning VVR switches to async, and then you could lose your site).  But setting sync=fail means your application will be down when the network is down, which probably means you won't meet your RTO.
  2. I don't think VRAdvisor will help you determine the latency, only the bandwidth and SRL size you require.
  3. I don't believe VVR itself will add much latency, so the latency is mainly down to the round-trip time of sending the data packets. The one exception is if the bandwidth to your bunker site is not sufficient, in which case VVR will slow down your app.  Therefore your bandwidth needs to be large enough for your maximum throughput.  Note that to determine your maximum throughput you may need to collect stats at very small intervals. For example, if you collect at 5 minute intervals and 100MB is written in 5 minutes, you don't know whether this was evenly spread or whether it actually wrote 90MB in 30 seconds and 10MB in the remaining 4.5 minutes.
  4. If your bunker is up to date, then the latency and bandwidth to your DR site are irrelevant in terms of meeting your RPO, but the bandwidth will affect your RTO (how long it takes to update your DR site from the bunker).  You can use VRAdvisor to size your bandwidth to DR to cope with somewhere between average and maximum writes, depending on RTO.  Note that while the write throughput remains less than the bandwidth to the DR site, DR should only be a second or two behind.
  5. If you are replicating a database then you should not replicate temporary tablespaces as this potentially can have a huge impact on the bandwidth you require and the RTO.  I have seen databases write 50-75% to temporary tablespaces.
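
To make the sampling-interval point in item 3 concrete, here is a small Python sketch. The per-30-second write volumes are a hypothetical distribution chosen to match the example (100MB in 5 minutes, 90MB of it in the first 30 seconds); they are not real measurements.

```python
# Hypothetical MB written in each 30-second slice of one 5-minute window.
SAMPLE_SECONDS = 30
mb_per_sample = [90, 2, 1, 1, 1, 1, 1, 1, 1, 1]  # sums to 100 MB

total_mb = sum(mb_per_sample)
window_seconds = SAMPLE_SECONDS * len(mb_per_sample)

# What a single 5-minute sample would report.
average_mb_per_s = total_mb / window_seconds

# What the 30-second samples reveal about the busiest slice.
peak_mb_per_s = max(mb_per_sample) / SAMPLE_SECONDS

print(f"5-minute average: {average_mb_per_s:.2f} MB/s")
print(f"30-second peak:   {peak_mb_per_s:.2f} MB/s")
print(f"peak is {peak_mb_per_s / average_mb_per_s:.0f}x the average")
```

Sizing the bunker link from the 5-minute average alone would underestimate the burst in this example by roughly a factor of nine.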

Mike

UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows
