Cluster Network Architectures – Time to Share Nothing?
Having worked with clustering for nearly fifteen years, I believe I still qualify as a total novice. It’s a bit like saying I work with cars: if I drive a Formula One car, I won’t know much about stock car racing. Cluster computing normally refers to a number of computers working together in some coordinated fashion. Clusters typically fall into two types, shared disk and shared nothing.
Shared disk is the most recognized architecture and, as the name suggests, simply means that all storage is available to all nodes in the cluster. Examples include Oracle RAC and Storage Foundation Cluster Filesystem. In both cases a lock manager is required to coordinate access to the data. Shared disk architecture offers the highest levels of availability, but depending on the application it can scale very badly in true parallel shared clusters. I often see this in Oracle RAC environments where, because each node keeps its own cache, the database is constantly chasing memory segments from other cluster nodes, which severely impacts database performance. Shared disk is normally the most costly, and the most widely used, of the architectures.
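To make the lock manager idea a little more concrete, here is a minimal Python sketch. It is a toy model under my own assumptions, not how Oracle RAC or Cluster Filesystem are actually implemented: the names `LockManager`, `SharedDisk` and `run_workload` are hypothetical, and "chasing" is reduced to counting how often ownership of a block has to move from one node to another.

```python
class LockManager:
    """Toy lock manager: remembers which node currently owns each block."""

    def __init__(self):
        self.owner = {}      # block id -> name of the node holding the lock
        self.transfers = 0   # how often ownership had to move between nodes

    def acquire(self, node, block):
        current = self.owner.get(block)
        if current is not None and current != node:
            # Another node holds this block, so ownership must move first.
            # In a real cluster this is interconnect traffic.
            self.transfers += 1
        self.owner[block] = node


class SharedDisk:
    """Storage visible to every node; every write goes through the lock manager."""

    def __init__(self):
        self.blocks = {}
        self.locks = LockManager()

    def write(self, node, block, value):
        self.locks.acquire(node, block)
        self.blocks[block] = value


def run_workload(writes):
    """Apply (node, block, value) writes and return the number of lock transfers."""
    disk = SharedDisk()
    for node, block, value in writes:
        disk.write(node, block, value)
    return disk.locks.transfers


# Each node sticks to its own block: no ownership ever moves.
partitioned = [("node1", 0, "a"), ("node2", 1, "b")] * 3
# Both nodes fight over the same block: ownership moves on almost every write.
contended = [("node1", 0, "a"), ("node2", 0, "b")] * 3
```

With these two workloads, `run_workload(partitioned)` returns 0 while `run_workload(contended)` returns 5. The amount of work is identical, but the contended case spends most of its time handing the block back and forth, which is the scaling problem described above in miniature.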
Shared nothing is the other cluster architecture. It does not involve concurrent access to disks, so no lock manager is required. Some would suggest that VERITAS Cluster Server could be considered a shared nothing technology, since only one node in the cluster can access the data at a time. This is true unless you are using Veritas Cluster Filesystem. Ultimately, though, you are still using shared storage that is presented to all hosts in the cluster. True shared nothing eliminates the need for expensive shared SAN storage and lets you take advantage of non-shared storage. For example, if I bought an x86 server with two internal drives, I could put my operating system on one and an application on the other. On its own, however, this offers no resiliency against a hardware failure of either the server or the storage. Using a shared nothing approach, we could mirror our application data to one or many other servers in the cluster. Those servers do not need to see the application data disk (it is not shared); the data is simply replicated over the cluster interconnects and a synchronous copy appears on the passive servers, removing the need for expensive shared SAN storage. If you then add a solution like VERITAS Cluster Server, you can monitor the application, and if the disk, the server or the application fails, control can be passed to one of the passive servers, which restarts the application with a consistent, up-to-date copy of the data.
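The mirror-then-fail-over flow can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `Node` and `SharedNothingCluster` are hypothetical names, a dictionary stands in for each server's local disk, and a plain method call stands in for the cluster interconnect. It does not reflect the actual VERITAS Cluster Server or Storage Foundation interfaces.

```python
class Node:
    """A server with its own local, non-shared disk."""

    def __init__(self, name):
        self.name = name
        self.disk = {}        # stands in for the local application data disk
        self.healthy = True


class SharedNothingCluster:
    """One active node; writes are mirrored synchronously to the passive nodes."""

    def __init__(self, nodes):
        self.active = nodes[0]
        self.passive = list(nodes[1:])

    def write(self, key, value):
        if not self.active.healthy:
            raise RuntimeError("active node is down; fail over first")
        self.active.disk[key] = value
        # Synchronous mirror: every healthy passive node gets the data
        # before the write is acknowledged to the caller.
        for node in self.passive:
            if node.healthy:
                node.disk[key] = value
        return True

    def failover(self):
        """Promote the first healthy passive node; the failed node is dropped."""
        candidates = [n for n in self.passive if n.healthy]
        if not candidates:
            raise RuntimeError("no healthy passive node to take over")
        new_active = candidates[0]
        self.passive.remove(new_active)
        self.active = new_active
        return new_active


nodes = [Node("node1"), Node("node2"), Node("node3")]
cluster = SharedNothingCluster(nodes)
cluster.write("orders", 42)

nodes[0].healthy = False          # simulate losing the active server
survivor = cluster.failover()     # node2 takes over with its own copy
```

After the failover, `survivor.disk["orders"]` is still 42 even though `node2` never had access to `node1`'s disk. The design choice worth noticing is that the write is only acknowledged after the copies are made, which is what keeps the passive copy consistent and up to date.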
Storage Foundation High Availability 6.1, which is due out in October 2013, will make shared nothing clustering as described here a reality. The building blocks that enable “data shipping” across the cluster interconnects are already in place. For example, if a node loses storage connectivity, it can continue to process requests and write the data over the interconnects to another server. This is the underlying technology that the shared nothing implementation uses. VERITAS Cluster Server today offers a standardized approach to application availability with the traditional active/passive approach. Storage Foundation HA will offer enterprises a simple, very cost-effective design for making their applications highly available without the associated shared storage costs. As with all forms of business continuity, it is important to understand the service levels you need to maintain and map the solution accordingly. I would anticipate that the shared nothing approach may not fit the five-nines (99.999%) applications, but it will certainly be a good fit for less critical applications. The shared nothing implementation is branded Shared Storage Sharing and will be available in October this year.
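For readers who want to picture “data shipping”, here is a rough Python sketch of the idea. It is a simplification under my own assumptions rather than the product's real mechanism: `StorageNode`, `storage_ok` and `peer` are hypothetical names, and a direct method call stands in for traffic over the cluster interconnect.

```python
class StorageNode:
    """A node that writes locally, or ships the write to a peer if it cannot."""

    def __init__(self, name):
        self.name = name
        self.disk = {}           # stands in for storage reachable from this node
        self.storage_ok = True   # False simulates lost storage connectivity
        self.peer = None         # another node reachable over the interconnect

    def write(self, key, value):
        """Store the value and return the name of the node that holds it."""
        if self.storage_ok:
            self.disk[key] = value
            return self.name
        if self.peer is None or not self.peer.storage_ok:
            raise OSError("no storage path available for " + self.name)
        # Storage path lost: keep serving the request by shipping the
        # data to the peer, which writes it on our behalf.
        self.peer.disk[key] = value
        return self.peer.name


a = StorageNode("node1")
b = StorageNode("node2")
a.peer = b

first = a.write("invoice", "v1")    # normal path, lands on node1
a.storage_ok = False                # node1 loses its storage connectivity
second = a.write("invoice", "v2")   # shipped over to node2 instead
```

Here `first` is `"node1"` and `second` is `"node2"`: the application on `node1` keeps running and accepting writes throughout, and only the place where the data lands changes.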