Configuration consistency across the nodes
In my experience, most DR/HA solutions fail because the nodes' configurations are inconsistent.
Everybody knows that application and server management before clustering and after it differ a lot.
After clustering, both application and server management have to be coordinated with the cluster.
When you do something on one node, you should also do the same thing on all the other nodes of the cluster. For example, when you add a user to one node, you should add the same user on all other cluster nodes. The same goes for groups, projects, sudo or RBAC rules, and so on.
I think one of the most common misconfigurations is user crontabs. Application admins often use crontab to run really critical tasks. Of course, when they edit cron, they don't do it on all cluster nodes. Then one day, when this node fails and the application starts on another node, "strange" problems appear and last until somebody remembers about cron.
So far I haven't been able to find any solution on the market that helps here, and I'm not sure that such a solution can exist at all.
Maybe Disaster Recovery Advisor (RecoverGuard) should do this, but it doesn't.
Anyway, I think it would be great if VCS could help here.
I have written an agent that checks that any file you specify is consistent across all the nodes. But it can only notify you when the file has been changed on one of the nodes.
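To show the kind of check I mean, here is a minimal sketch in Python. It is not the code of my agent and it does not use any VCS API: the function names are made up for this example, and I assume the file content has already been collected from every node (for example over ssh), so the sketch only compares checksums and reports the nodes that differ from the majority.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def file_digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one node's copy of the file."""
    return hashlib.sha256(content).hexdigest()


def find_inconsistent_nodes(contents_by_node: Dict[str, bytes]) -> List[str]:
    """Return the nodes whose copy differs from the most common copy.

    contents_by_node maps a node name to the raw file content read on
    that node. How the content is collected is left to the caller.
    If two versions are equally common, the one seen first is used as
    the reference.
    """
    digests = {node: file_digest(data) for node, data in contents_by_node.items()}
    if not digests:
        return []
    reference_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(node for node, d in digests.items() if d != reference_digest)


if __name__ == "__main__":
    # Hypothetical crontab contents from a three-node cluster.
    crontabs = {
        "node1": b"0 2 * * * /opt/app/bin/nightly_job\n",
        "node2": b"0 2 * * * /opt/app/bin/nightly_job\n",
        "node3": b"",  # the admin forgot to add the job here
    }
    for node in find_inconsistent_nodes(crontabs):
        print("WARNING: file differs on " + node)
```

Like my agent, this can only tell you that something is wrong. It cannot decide which copy is the correct one, and it does not repair anything.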
Also, I'm not sure that an agent is really the best solution here. Maybe it would be better to "lock" a file controlled by VCS. If somebody wants to edit the file, he unlocks it, and after editing he "tells" VCS that the file must be changed on all VCS nodes, and VCS syncs the file across all nodes.
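The lock, edit and sync idea could look roughly like the following sketch. This is only an in-memory model in Python to make the workflow concrete. VCS has no such feature today as far as I know, and the class and method names (ControlledFile, unlock, edit, commit) are invented for this example.

```python
from typing import Dict, Iterable, Optional


class ControlledFile:
    """In-memory model of one file that must be identical on every node."""

    def __init__(self, nodes: Iterable[str], content: bytes) -> None:
        # Every node starts with the same copy, and the file is locked.
        self.copies: Dict[str, bytes] = {node: content for node in nodes}
        self.unlocked_on: Optional[str] = None

    def unlock(self, node: str) -> None:
        """Allow editing on exactly one node at a time."""
        if node not in self.copies:
            raise KeyError("unknown node: " + node)
        if self.unlocked_on is not None:
            raise RuntimeError("file is already unlocked on " + self.unlocked_on)
        self.unlocked_on = node

    def edit(self, node: str, new_content: bytes) -> None:
        """Change the local copy; refused unless this node holds the unlock."""
        if self.unlocked_on != node:
            raise PermissionError("file is locked on " + node)
        self.copies[node] = new_content

    def commit(self, node: str) -> None:
        """Push the edited copy to all nodes and lock the file again."""
        if self.unlocked_on != node:
            raise PermissionError("file is not unlocked on " + node)
        new_content = self.copies[node]
        for other in self.copies:
            self.copies[other] = new_content
        self.unlocked_on = None


if __name__ == "__main__":
    crontab = ControlledFile(["node1", "node2"], b"# empty\n")
    crontab.unlock("node1")
    crontab.edit("node1", b"0 2 * * * /opt/app/bin/nightly_job\n")
    crontab.commit("node1")
    print(crontab.copies["node2"])
```

The point of this design is that an edit on a locked file is refused, so nobody can change one node and forget the others. A real implementation would also have to handle nodes that are down at commit time, which this sketch ignores.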
Maybe there is some other, much more useful solution.
But I'm pretty sure that if VCS had such a feature, there would be far fewer failures.