Just how complex has the typical data center become? According to one recent survey, roughly two-thirds of data center managers said their data centers are too complex to manage easily. And if dealing with this complexity weren't challenging enough, more than half of the managers said internal service-level-agreement demands are increasing.
At the same time, budgets continue to be constrained. According to the most recent Symantec State of the Data Center Report, data center budget growth has been "minimal" over the past five years.
Given these pressures, what can proactive IT managers do to cost-effectively ensure the availability and recovery of their mission-critical applications? This article explores some of the options.
Today, so-called High Availability and Disaster Recovery
(HA/DR) solutions need to address an unprecedented number of threats, ranging from component failure (e.g., a disk that runs out of space), to worms and viruses, to more widespread outages (such as a power failure), to catastrophic site outages (as a result, say, of a tornado).
But according to a recent Forrester Research survey, IT departments aren't adequately conveying the urgency of these threats to upper management. Forrester surveyed 250 disaster recovery professionals in October 2007 and concluded that IT managers need to do a much better job of convincing business leaders to invest in disaster recovery systems.
According to Forrester Senior Analyst Stephanie Balaouras, managers must demonstrate that disaster recovery isn't simply an "insurance policy," but can actually boost operational efficiency by protecting systems against potential failures.
Balaouras says companies should consider disaster recovery investment as a "rolling upgrade" that consistently augments existing infrastructure and application investments rather than as a one-time event that can be delayed.
Other observers contend that recent natural disasters like Hurricane Katrina lend extra urgency to the argument for a robust disaster recovery strategy.
"The widespread confusion that followed Hurricane Katrina brought into sharp focus the need for comprehensive business continuity plans that incorporated secondary data center sites located far enough away so as to be untouched by the disaster affecting the primary data site," Dan Lamorena, a Senior Product Marketing Manager at Symantec, has written. "However, many IT organizations believe the costs involved in establishing secondary data centers are out of reach for all but the largest organizations."
Lamorena has proposed five strategies that can help enterprise IT organizations implement robust high availability and disaster recovery strategies and, at the same time, maximize system availability for day-to-day operations.
- Identify problems immediately. Traditionally, one of the challenges in executing timely disaster recovery has been a slow alerting process. Today, thanks to advanced clustering technology, notification and reporting capabilities can pinpoint when an outage occurs and immediately notify administrators of a problem. Clustering technology can then start up applications at a secondary data center and connect users to that data center.
- Reduce downtime with automation. For mission-critical applications that demand minimal downtime, the disaster recovery process must be highly automated and resilient—such applications require an intelligent application recovery infrastructure. An automated approach, such as high availability clustering, eliminates significant downtime compared to a traditional manual recovery process. If a system fails in the primary data center, the software can restart the application automatically on another server, with limited action required by IT personnel.
- Exploit the potential of secondary sites. Most enterprise IT organizations view secondary sites strictly as cost centers, sitting idle much of the time. But secondary sites can be used for test development, quality assurance, or even less critical applications. In addition, advanced clustering software eliminates the costly requirement that applications fail over to hardware identical to what the production applications run on. The most sophisticated clustering software permits fail-overs between different storage and server hardware within a data center or at remote sites.
- Realize value in virtual environments. Server virtualization has become mainstream technology in today's data center. Server virtualization employs virtual machine technology to allow multiple operating systems to be run on a single server. Restarting virtual servers at secondary sites has traditionally been a manual process. But new clustering software allows companies to deploy server virtualization technology and gain the same automated disaster recovery benefits they can expect in their physical server environments. Keep in mind too that using virtual servers as the fail-over targets for mission-critical applications reduces hardware/power/space costs at the recovery site.
- Regularly test the disaster recovery plan. A study conducted in October 2007 by Forrester Research and the Disaster Recovery Journal found that 50% of companies test their disaster recovery plan just once a year, while 14% never test it at all. Companies are sometimes reluctant to conduct DR testing because it means bringing down production systems and because it's labor-intensive. With automated fail-over capabilities, IT organizations can test recovery procedures using a copy of the production data, without interrupting production, corrupting data, or risking problems when restarting a production application. Best practices call for the disaster recovery plan to be tested, revised, and updated regularly; Symantec recommends that companies test their disaster recovery environment on a schedule that matches the needs of the business.
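The first two strategies above, immediate detection and automated fail-over, can be sketched in a few lines of code. The sketch below is purely illustrative and assumes hypothetical names (`AppService`, `FailoverMonitor`, a `notify` callback); it is not any vendor's clustering API, just the control loop such software implements: poll the active instance, alert administrators on failure, and restart the application at the secondary site.

```python
class AppService:
    """A single application instance running at one site (hypothetical)."""
    def __init__(self, site):
        self.site = site
        self.running = False

    def start(self):
        self.running = True

    def healthy(self):
        # Real clustering agents probe disks, processes, and networks;
        # here a simple flag stands in for that health check.
        return self.running


class FailoverMonitor:
    """Polls the active instance; on failure, notifies administrators and
    automatically restarts the application at the secondary site."""
    def __init__(self, primary, secondary, notify):
        self.secondary = secondary
        self.notify = notify          # alerting hook (e.g. email or pager)
        self.active = primary

    def check(self):
        if not self.active.healthy():
            self.notify(f"outage detected at {self.active.site}")
            self.secondary.start()    # automated restart, no manual steps
            self.active = self.secondary
        return self.active.site


alerts = []
primary = AppService("primary-dc")
secondary = AppService("secondary-dc")
primary.start()

monitor = FailoverMonitor(primary, secondary, alerts.append)
print(monitor.check())        # primary healthy: stays at "primary-dc"
primary.running = False       # simulate a component failure
print(monitor.check())        # fails over to "secondary-dc"
```

The point of the sketch is the elimination of the manual recovery path: detection, notification, and restart happen in one automated pass, which is also what makes non-disruptive DR testing possible (the same loop can be exercised against a copy of production).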
IT departments today are feeling increased pressure to improve their business continuity capabilities while containing costs. At the same time, businesses large and small view a greater percentage of their applications as "mission-critical." According to the Forrester Research/Disaster Recovery Journal study, respondents classified 35% of their applications as mission-critical. Failure to recover these essential applications in a timely fashion can have a devastating impact, resulting in lost revenue, compliance breaches, loss of customers, and a tarnished brand.
Today, a new generation of high availability and disaster recovery software is improving IT departments' ability to cost-effectively deliver maximum uptime and improve productivity. Moreover, these solutions provide IT with the tools to non-intrusively test and verify their recovery infrastructure in an automated, cost-effective manner. Given the catastrophic consequences of recent natural disasters, such solutions need to be on the radar screen of all IT departments.