Big Data – What’s It All About?
There has been a lot of noise in the IT fraternity recently about Big Data. It has the makings of the latest IT buzzword, much as "cloud" was misused over the past few years.
Big Data refers to a way of storing and analysing huge volumes of data to look for business trends and patterns in order to gain some kind of business advantage. This could be a trading application analysing who is buying and selling what, or something as simple as regional selling patterns across a supermarket chain's stores around the world. What these have in common is that a huge amount of data needs to be analysed to transform it into useful information.
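To make that concrete, here is a minimal sketch of the kind of job such analysis often boils down to: a MapReduce program that totals sales per region from comma-separated records. The input layout (region,store,product,amount), class names, and job wiring are illustrative assumptions, not taken from any real deployment.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RegionalSales {

    // Map: for each "region,store,product,amount" record, emit (region, amount).
    public static class SalesMapper
            extends Mapper<Object, Text, Text, DoubleWritable> {
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length == 4) {  // hypothetical record layout
                context.write(new Text(fields[0]),
                        new DoubleWritable(Double.parseDouble(fields[3])));
            }
        }
    }

    // Reduce: sum all the amounts seen for each region.
    public static class SalesReducer
            extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        @Override
        protected void reduce(Text region, Iterable<DoubleWritable> amounts,
                              Context context)
                throws IOException, InterruptedException {
            double total = 0;
            for (DoubleWritable amount : amounts) {
                total += amount.get();
            }
            context.write(region, new DoubleWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "regional sales");
        job.setJarByClass(RegionalSales.class);
        job.setMapperClass(SalesMapper.class);
        job.setCombinerClass(SalesReducer.class); // summing is associative
        job.setReducerClass(SalesReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Trivial on a laptop, but the same code scales to terabytes of sales records spread across a cluster, which is the whole appeal.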
One route many companies are taking is Hadoop. As it stands today, Hadoop offers a quick fix for companies looking to triage masses of data and get started with their Big Data projects. The shine soon wears off, however, as organisations realise that many of the challenges of the traditional data world exist in the Big Data world too: they still need better storage utilisation, high availability, a single pane of glass to manage from application to spindle, disaster recovery, snapshots, and seamless integration with backup solutions. Many organisations also find that after some initial success they want to see whether Hadoop can be integrated with their traditional OLTP solutions.
However, companies are finding that what worked well as a project is not necessarily suitable for all their enterprise needs, for several reasons.
Firstly, there are single points of failure in today's Hadoop architecture: the NameNode, for example, can be a problem from an availability perspective as well as a performance bottleneck. Additionally, Hadoop's filesystem (HDFS) can be problematic and failure prone. What tends to happen is that Hadoop architects spread the infrastructure across many devices, which becomes very inefficient and ultimately fails to meet the needs of the modern datacentre.
Some organisations are looking at building their own Big Data platform on their existing infrastructure. The current open source options are maturing slowly, so a wise strategist may consider holding back, or building on what they already own. Symantec has a rich history of producing heterogeneous products which help customers get the best out of their hardware vendors by giving them the freedom to move between vendors without interruption to service. Later this year Symantec will enter the Big Data market by releasing an API/connector which will make it possible to run analytics such as MapReduce, which today need Hadoop's HDFS, on Symantec's own Clustered File System. The advantage is that, in addition to removing the flawed HDFS from the equation, enterprises can also use their own SAN storage at the back end of their Big Data platform (a sketch of what such a swap could look like appears below).

Love it or hate it, Big Data is here to stay. Whether you think it is a great way of filtering out junk to gain a competitive advantage in a crowded marketplace, or a big hammer to crack a tiny nut, the coming years will see a sea of players entering a potentially $100bn market. Exciting times we live in...
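As a postscript, here is a hedged sketch of what swapping HDFS for another filesystem involves from a developer's point of view. Hadoop resolves paths through its pluggable FileSystem abstraction, so analytics code does not care which implementation serves the bytes. The "cfs" scheme and the connector class named below are hypothetical placeholders, not Symantec's published API.

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FilesystemSwap {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Point the default filesystem at a non-HDFS implementation.
        // Scheme, authority, and implementation class are assumptions
        // made for illustration only.
        conf.set("fs.defaultFS", "cfs://cluster1/");
        conf.set("fs.cfs.impl", "com.example.cfs.ClusteredFileSystem");

        // From here on the code is identical to the HDFS case.
        FileSystem fs = FileSystem.get(URI.create("cfs://cluster1/"), conf);
        for (FileStatus status : fs.listStatus(new Path("/sales/2012"))) {
            System.out.println(status.getPath());
        }
    }
}
```

The point of the sketch is simply that the analytics code stays the same; only the configuration names a different backend, which is what makes a connector approach plausible.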
~ Brian de la Pascua