Fixing Hadoop's Single Point of Failure With Veritas Cluster Server

Created: 15 May 2012 | Updated: 01 Jun 2012 | 2 comments
Rags Srinivasan

Hadoop is an open source solution from Apache for managing and analyzing Big Data. Its scale-out architecture makes it possible to analyze large volumes of data to find key business insights. Enterprises are turning to Hadoop for its agility and flexibility in data analysis. Hadoop enables enterprises to make sense of many varieties of data - structured to unstructured - and ask insightful questions they could not ask with traditional tools.

However, Hadoop's Distributed File System (HDFS) has a weakness that makes it unattractive to enterprise datacenters. HDFS's metadata server, NameNode, is a single point of failure. When NameNode fails, applications lose access to data stored in many different DataNodes. In a long-running analytical application, a single failure can prevent enterprises from getting timely business insights.

Symantec's recommendation is to eliminate this flaw completely by making Hadoop highly available, using a solution we are working on for enterprise-ready Hadoop.

We recognize there are many workloads where enterprises prefer the scale-out model using many commodity compute nodes with internal disks. But you do not have to settle for a single point of failure or roll out a solution that is not highly available. With our Veritas Cluster Server (VCS), you can make NameNode highly available and recover from failures in a timely manner, without operator intervention and without losing results.
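To make the failure mode concrete, here is a minimal toy model in Python. It is only an illustration, not Hadoop's API or a VCS configuration: the `ToyCluster` class, the file name, and the block names are all hypothetical. It shows that when the metadata server is down, reads fail even though the blocks are still intact on the DataNodes, and that access returns once a standby takes over.

```python
class NameNodeDown(Exception):
    """Raised when no NameNode is available to answer metadata lookups."""


class ToyCluster:
    """Toy model (hypothetical): one metadata server plus several DataNodes."""

    def __init__(self):
        # Metadata held by the NameNode: file name -> ordered (datanode, block) pairs.
        self.block_map = {"sales.log": [("dn1", "blk_1"), ("dn2", "blk_2")]}
        # Block contents live on the DataNodes, not on the NameNode.
        self.datanodes = {"dn1": {"blk_1": "jan,feb,"}, "dn2": {"blk_2": "mar"}}
        self.namenode_up = True

    def read(self, filename):
        # Every read starts with a metadata lookup on the NameNode.
        if not self.namenode_up:
            raise NameNodeDown("cannot locate blocks for " + filename)
        locations = self.block_map[filename]
        return "".join(self.datanodes[dn][blk] for dn, blk in locations)

    def fail_namenode(self):
        # Simulate the single point of failure.
        self.namenode_up = False

    def fail_over(self):
        # Assumption: a standby with an up-to-date copy of the metadata takes over.
        self.namenode_up = True


cluster = ToyCluster()
print(cluster.read("sales.log"))      # works while the NameNode is up

cluster.fail_namenode()
try:
    cluster.read("sales.log")
except NameNodeDown as err:
    print("read failed:", err)        # access is lost, the blocks are not

cluster.fail_over()
print(cluster.read("sales.log"))      # access returns after failover
```

The model deliberately ignores replication, in-flight writes, and how the standby keeps its metadata current; it only separates "loss of access" from "loss of data".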

Have a Hadoop installation with commodity nodes? Make it enterprise ready with VCS. See the detailed blog post our engineering team wrote on setting up NameNode in an HA configuration using VCS.

Comments

thanchon

Hey Rags,

To save me from doing the background reading, would such an agent save the user from any data loss while the NameNode fails over?

Tim

Tim Hanchon
Distinguished Engineer, Symantec

Rags Srinivasan

Tim

Look at it this way - a NameNode failure results in loss of access to the cluster. It is the DataNodes that hold the data and run the tasks. Until the passive NameNode takes over, applications lose access to the cluster; for example, they cannot complete MapReduce jobs. Otherwise there is no data loss.

Fixing the NameNode SPOF is about increasing cluster availability.

-Rags
