Fixing Hadoop's Single Point of Failure With Veritas Cluster Server
Hadoop is an open source solution from Apache for managing and analyzing Big Data. Its scale-out architecture makes it possible to analyze large volumes of data and find key business insights. Enterprises are turning to Hadoop for its agility and flexibility in data analysis. Hadoop enables enterprises to make sense of a variety of data, from structured to unstructured, and to ask questions they could not answer with traditional tools.
However, the Hadoop Distributed File System (HDFS) has a weakness that makes it unattractive to enterprise datacenters. The HDFS metadata server, NameNode, is a single point of failure. When NameNode fails, applications lose access to the data stored across the many DataNodes. In a long-running analytical application, a single failure can prevent enterprises from getting timely business insights.
Symantec's recommendation is to eliminate this flaw entirely. We are working on a solution that makes Hadoop enterprise ready by making it highly available.
We recognize there are many workloads where enterprises prefer the scale-out model, using many commodity compute nodes with internal disks. But you do not have to settle for a single point of failure or roll out a solution that is not highly available. With our Veritas Cluster Server (VCS) you can make NameNode highly available and recover from failures in a timely manner, without operator intervention and without losing results.
Have a Hadoop installation with commodity nodes? Make it enterprise ready with VCS. See the detailed blog post our engineering team wrote on setting up NameNode in an HA configuration using VCS.
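To make the idea of automatic recovery concrete, here is a toy model of the general active/passive failover pattern. It is not VCS code or configuration, and it does not describe how the VCS agent is implemented. The node names, the `FailoverMonitor` class, and the three-probe threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A hypothetical NameNode host; `healthy` stands in for a real health probe."""
    name: str
    healthy: bool = True


class FailoverMonitor:
    """Toy active/passive monitor: promotes the standby after repeated failed probes."""

    def __init__(self, active: Node, passive: Node, max_failures: int = 3):
        self.active = active
        self.passive = passive
        self.max_failures = max_failures  # assumed threshold, not a VCS default
        self.failures = 0

    def probe(self) -> str:
        """Run one health check and return the name of the node designated active."""
        if self.active.healthy:
            self.failures = 0
            return self.active.name
        self.failures += 1
        if self.failures >= self.max_failures and self.passive.healthy:
            # Swap roles: the standby starts serving metadata requests.
            self.active, self.passive = self.passive, self.active
            self.failures = 0
        return self.active.name


def simulate() -> list:
    """Fail the primary at the third probe and record the active node at each step."""
    primary, standby = Node("nn-primary"), Node("nn-standby")
    monitor = FailoverMonitor(primary, standby, max_failures=3)
    history = []
    for step in range(6):
        if step == 2:
            primary.healthy = False  # simulated NameNode crash
        history.append(monitor.probe())
    return history


if __name__ == "__main__":
    print(simulate())
```

In this run the primary stays designated active for the first four probes, two of them while it is already down, and the standby takes over on the fifth. That gap is the window in which applications lose access to the cluster, and no operator has to step in to end it.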
Comments
Hey Rags,
To save me from doing the background reading, would such an agent save the user from any data loss while the nameserver fails over?
Tim
Tim Hanchon
Distinguished Engineer, Symantec
NameNode Failover with VCS
Tim
Look at it this way - a NameNode failure results in loss of access to the cluster. It is the DataNodes that hold the data and run the tasks. Until the passive node takes over, applications lose access to the cluster; for example, they cannot complete MapReduce jobs. There is no data loss otherwise.
Fixing NameNode SPOF is about increasing cluster availability.
-Rags