Understanding Veritas Cluster Server Limits and Prerequisites
A useful yet often overlooked feature of Veritas Cluster Server (VCS) is Limits and Prerequisites. This feature is often used in conjunction with Service Group Workload Management (SGWM), but it can also be implemented on its own. In this post, I'll describe what this feature does and how you can put it to use.
Limits and Prerequisites are attributes in VCS. Limits are system attributes applied to the cluster member nodes, while Prerequisites are attributes applied to service groups. Both are key/value type attributes defined by the user. To better understand how these two attributes work together, it's best to use a common scenario as an example.
Let's say I have a VCS cluster consisting of four nodes and four Oracle database service groups. Let's assume that (1) each node is provisioned identically with the same processor and memory power and (2) each service group places about the same amount of load on each system. Implementing SGWM will automatically keep the system load on each node in the cluster as evenly distributed as possible, but SGWM places a "soft" load limit on systems. In the event of multiple failures, VCS may exceed the load limit I've defined for each system. This is where Limits and Prerequisites come into play.
In this scenario, I've got enough "horsepower" on each system to comfortably run any two of my Oracle service groups. I might be able to "redline" a system at three service groups if necessary, but the Load and Capacity values I've defined for SGWM allow for up to two. If one system faults, then one of the other systems will pick up its load and end up running two Oracle instances. So far so good. But if I suffer an additional failure before the first system is returned to service, then the two remaining systems will each be hosting two service groups.
In the unlikely event that I suffer yet another failure (maybe it's time for a new hardware vendor?), the last surviving node will be forced to try to host all four Oracle instances. At best, all four of my database instances will be seriously underperforming. Worse still, the load might bring that last system down completely. Limits and Prerequisites were designed to prevent just this type of scenario.
Since I've concluded that three is the maximum number of databases any one of my systems can support, I set the Limits attribute for each of the systems to "Oracle=3". The limit name I assign can be anything, but it helps to choose a name that makes contextual sense. For each of the Oracle service groups, I set the Prerequisites value to "Oracle=1". When one of the Oracle service groups goes online on a node, VCS internally decrements that node's remaining Oracle limit by the service group's Prerequisites value. Once that internal value reaches zero, VCS will prevent any additional service group with an Oracle prerequisite greater than zero from going online on that node.
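The bookkeeping described above can be sketched in a few lines of Python. This is only an illustrative model of the accounting, not VCS code: the function names remaining() and can_online() are made up for this post, and the dictionaries simply mirror the "Oracle=3" and "Oracle=1" values from my example.

```python
# Toy model of Limits/Prerequisites accounting (not VCS code; names are hypothetical).

def remaining(limits, online_prereqs):
    """Subtract the prerequisites of every online group from the node's limits."""
    left = dict(limits)
    for prereq in online_prereqs:
        for name, value in prereq.items():
            left[name] = left.get(name, 0) - value
    return left

def can_online(limits, online_prereqs, candidate):
    """A group may go online only if every prerequisite fits in what is left."""
    left = remaining(limits, online_prereqs)
    return all(left.get(name, 0) >= value for name, value in candidate.items())

node_limits = {"Oracle": 3}   # Limits set on the system
group_prereq = {"Oracle": 1}  # Prerequisites set on each Oracle service group

online = []
for _ in range(4):            # try to bring all four groups online on one node
    if can_online(node_limits, online, group_prereq):
        online.append(group_prereq)

print(len(online))  # 3: the fourth group is refused
```

Three groups fit and the fourth is turned away, which is exactly the "hard" ceiling that SGWM's load values alone don't give me.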
In the scenario I described above with three failed systems, I'll still have three of my Oracle instances running on the one remaining node, but I'll have to get one of those failed systems back online in order to have all four instances back up and running.
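To make the three-failure scenario concrete, here is a rough Python walk-through. It is a simplified sketch under my own assumptions: the node and group names (node1, ora1, and so on) are invented, and the "pick the least-loaded surviving node" rule is just a stand-in for whatever failover policy the cluster actually uses. The only part taken from the configuration above is the limit of 3 and the prerequisite of 1.

```python
# Toy failover walk-through (hypothetical names; placement rule is an assumption).

LIMIT = 3    # Limits: Oracle=3 on every node
PREREQ = 1   # Prerequisites: Oracle=1 on every Oracle service group

def fail_node(placement, failed):
    """Move groups off the failed node; return groups that could not be placed."""
    displaced = placement.pop(failed)
    stranded = []
    for group in displaced:
        # Only nodes that still have room under the limit are candidates.
        candidates = [n for n, groups in placement.items()
                      if (len(groups) + 1) * PREREQ <= LIMIT]
        if not candidates:
            stranded.append(group)
            continue
        # Assumed policy: choose the surviving node hosting the fewest groups.
        target = min(candidates, key=lambda n: len(placement[n]))
        placement[target].append(group)
    return stranded

placement = {"node1": ["ora1"], "node2": ["ora2"],
             "node3": ["ora3"], "node4": ["ora4"]}
offline = []
for failed in ["node1", "node2", "node3"]:
    offline += fail_node(placement, failed)

print(placement)  # {'node4': ['ora4', 'ora1', 'ora3']}
print(offline)    # ['ora2']
```

The last node ends up with three groups and one group stays offline until another system comes back, rather than all four piling onto a single box.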
Limits and Prerequisites...learn 'em and love 'em!