Apparent hang of replicated file systems, stop or pause clears hang

Article:TECH158329  |  Created: 2011-04-19  |  Updated: 2012-01-05  |  Article URL http://www.symantec.com/docs/TECH158329
Article Type
Technical Solution


Problem



Intermittent or persistent hangs in a replicated file system can cause upper-level applications (such as a database) to fail due to a timeout. This may be caused by the latency protection feature if the high_water_mark and low_water_mark settings are too far apart and the network bandwidth is low relative to the write rates generated by the upper-level application.


Environment



VERITAS Volume Replicator (VVR) with latency protection enabled in an Asynchronous Replication configuration.


Solution



If the configuration is not a synchronous replication configuration, the latency protection feature is not advised (contrary to the implied advice in the Admin Guide) and should be left OFF. If it is used as advised in the Admin Guide, the high_water_mark and low_water_mark values (each expressed as a number of updates) should be spaced closely, because a slow link could cause the SRL to drain slowly and fail to reach the low_water_mark in time to prevent an application timeout.
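
The effect of the gap between the two marks can be estimated with a rough calculation. The Python sketch below is illustrative only and is not part of VVR. It assumes that application writes are held from the moment the backlog reaches high_water_mark until it drains to low_water_mark, and that the link drains updates at a constant rate. The function name, the average update size, the link throughput and the application timeout are all hypothetical values chosen for the example.

```python
def estimated_stall_seconds(high_mark, low_mark, avg_update_bytes, link_bytes_per_sec):
    """Rough time for the backlog to drain from high_mark to low_mark updates.

    Assumption: writes are held for the whole drain, so no new updates arrive,
    and the link sustains a constant throughput with no protocol overhead.
    """
    if low_mark > high_mark:
        raise ValueError("low_mark must not exceed high_mark")
    if link_bytes_per_sec <= 0:
        raise ValueError("link_bytes_per_sec must be positive")
    updates_to_drain = high_mark - low_mark
    return updates_to_drain * avg_update_bytes / link_bytes_per_sec


# Hypothetical figures: 8 KB average update, about 1 MB/s of usable
# link throughput, and an application that times out after 30 seconds.
APP_TIMEOUT_SECONDS = 30.0

wide_gap = estimated_stall_seconds(10000, 1000, 8192, 1_000_000)
narrow_gap = estimated_stall_seconds(10000, 9900, 8192, 1_000_000)

for label, stall in (("wide gap", wide_gap), ("narrow gap", narrow_gap)):
    verdict = "exceeds" if stall > APP_TIMEOUT_SECONDS else "is within"
    print(f"{label}: about {stall:.1f} s stall, {verdict} the application timeout")
```

With these assumed figures, the widely spaced marks give an estimated stall well beyond the application timeout, while the closely spaced marks keep it under one second. Real values depend on the actual update sizes and link throughput at the site.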

Stopping or pausing replication clears the apparent hang, but the correct solution is either to leave the latency protection feature disabled or to experiment with high_water_mark and low_water_mark values that are closer to each other.





