After a reboot, a node in a VERITAS Cluster Server (VCS) environment is in an ADMIN_WAIT state or in a STALE_ADMIN_WAIT state

Article:TECH6072  |  Created: 2009-01-27  |  Updated: 2009-01-27  |  Article URL http://www.symantec.com/docs/TECH6072
Article Type
Technical Solution

Issue



After a reboot, a node in a VERITAS Cluster Server (VCS) environment is in an ADMIN_WAIT state or in a STALE_ADMIN_WAIT state

Solution



The states described below are those that a Cluster Server node can end up in after a reboot. The state of each node is reported by the hastatus command, for example:

# hastatus
attempting to connect....connected

group           resource             system          message
--------------- -------------------- --------------- --------------------
                                    sptsunvcs3      STALE ADMIN WAIT: all system stale
                                    sptsunvcs4      STALE ADMIN WAIT: all system stale


ADMIN_WAIT state:

If VCS is started on a system with a valid configuration file, and if other systems are in the ADMIN_WAIT state, the new system transitions to the ADMIN_WAIT state.
    INITING===>CURRENT_DISCOVER_WAIT===>ADMIN_WAIT

If VCS is started on a system with a stale configuration file, and if other systems are in the ADMIN_WAIT state, the new system transitions to the ADMIN_WAIT state.
    INITING===>STALE_DISCOVER_WAIT===>ADMIN_WAIT

STALE_ADMIN_WAIT state:

If VERITAS Cluster Server is started on a system with a stale configuration file, and if all other systems are in the STALE_ADMIN_WAIT state, the system transitions to the STALE_ADMIN_WAIT state as shown below. A system stays in this state until another system with a valid configuration file is started, or until the hasys -force command is issued.
    INITING===>STALE_DISCOVER_WAIT===>STALE_ADMIN_WAIT
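
The three startup cases described above can be summarized as a small lookup. This is only an illustrative sketch: the function name startup_path and its "valid"/"stale" arguments are hypothetical and not part of VCS, and any combination this article does not describe is deliberately reported as not covered rather than guessed.

```shell
# Hypothetical helper (not a VCS command): models only the three
# startup cases described in this article.
# $1 = local configuration file state: "valid" or "stale"
# $2 = state of the other systems: "ADMIN_WAIT" or "STALE_ADMIN_WAIT"
startup_path() {
    case "$1:$2" in
        valid:ADMIN_WAIT)
            echo "INITING CURRENT_DISCOVER_WAIT ADMIN_WAIT" ;;
        stale:ADMIN_WAIT)
            echo "INITING STALE_DISCOVER_WAIT ADMIN_WAIT" ;;
        stale:STALE_ADMIN_WAIT)
            echo "INITING STALE_DISCOVER_WAIT STALE_ADMIN_WAIT" ;;
        *)
            # Any other combination is outside the scope of this article.
            echo "not covered by this article" >&2
            return 1 ;;
    esac
}

# Example: stale local configuration, all other systems in STALE_ADMIN_WAIT.
startup_path stale STALE_ADMIN_WAIT
```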

Resolution:

If all systems are in the STALE_ADMIN_WAIT or ADMIN_WAIT state, first validate the configuration file (/etc/VRTSvcs/conf/config/main.cf) on all systems in the cluster. Run the 'hacf -verify .' command from the directory containing the main.cf file to check for syntax errors, then review the file contents for proper resource and service group definitions.
Then enter the following command on the system with the correct configuration file to force start VCS:

    # hasys -force system_name


This starts Cluster Server on that node and on all other nodes that are in the ADMIN_WAIT or STALE_ADMIN_WAIT state.
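
The resolution steps above can be laid out as a short checklist before anything is run. The sketch below only prints the commands for review and executes nothing; the function name print_force_start_plan is hypothetical, and the system name in the example is taken from the hastatus output shown earlier. The contents of main.cf still need to be reviewed by hand on every system.

```shell
# Hypothetical helper: prints the commands for review; it does NOT run them.
# $1 = name of the system whose main.cf has been checked and found correct
# $2 = configuration directory (defaults to the path named in this article)
print_force_start_plan() {
    system_name=$1
    conf_dir=${2:-/etc/VRTSvcs/conf/config}
    if [ -z "$system_name" ]; then
        echo "usage: print_force_start_plan system_name [conf_dir]" >&2
        return 1
    fi
    # Step 1: syntax check, run from the directory containing main.cf.
    echo "cd $conf_dir"
    echo "hacf -verify ."
    # Step 2: force start VCS on the system with the correct configuration.
    echo "hasys -force $system_name"
}

print_force_start_plan sptsunvcs3
```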

One of the most common causes of a node being in one of these states is the existence of the file /etc/VRTSvcs/conf/config/.stale. This file is typically left behind if Cluster Server is stopped while the configuration is still open, that is, when changes made to the running configuration have not been saved. The .stale file is deleted automatically when changes are saved correctly, so it will not force the node into the ADMIN_WAIT or STALE_ADMIN_WAIT state the next time Cluster Server is restarted. The file can be safely removed if the main.cf file is known to be correct.
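
A quick way to check for the .stale file is sketched below. The function name has_stale_marker is hypothetical; the default directory is the one named in this article. The sketch only reports whether the file exists and does not remove it, because removal is appropriate only after main.cf has been confirmed to be correct.

```shell
# Hypothetical helper: reports whether the .stale file exists.
# $1 = configuration directory (defaults to the path named in this article)
# Returns 0 if the file is present, 1 if it is not.
has_stale_marker() {
    conf_dir=${1:-/etc/VRTSvcs/conf/config}
    if [ -e "$conf_dir/.stale" ]; then
        echo "stale marker present: $conf_dir/.stale"
        return 0
    fi
    echo "no stale marker in $conf_dir"
    return 1
}

# Check the default location; "|| true" keeps a calling script running
# when the file is absent.
has_stale_marker || true
```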



Legacy ID



199462

