As we are all aware, information is expanding at a staggering rate. World data in 2010 was estimated at 1.2 zettabytes, expected to rise to 7.9 ZB in 2015 and to 40 ZB by 2020, with something like 30 billion connected devices coming online in the next few years. Against that backdrop, no one has the time or the bandwidth (or the budget) to shift these huge volumes of data around. Yet, potentially, they hold enormous value for the people who want to access them.
This world of ‘Bigger Data’ – and what is rapidly becoming ‘Even Bigger Data’ – presents a massive challenge for all of us: how do we supply Data as a Service, while still maintaining control?
Because the reality is that, in these data-driven times, everyone is going to have to consider themselves both a consumer and a provider of Data as a Service, and deal with all of the consequences this brings into play.
In the new world of mega-data, information that would once have been considered beyond the reach of many is now almost everyday fare. Practically nothing is off the menu. All of which means that there are some very large datasets out there that people want to exploit: Facebook data, mobile and location data, census data, research data, genome data, medical history… the list is daunting.
Amazon is a great case in point here. It now offers public access to very large datasets: Amazon Web Services provides a centralised repository that can be integrated into AWS cloud-based applications. Like all AWS services, users pay only for the compute and storage they use for their own applications (1). However, consumers don’t want to copy the whole lot. So they are going to have to buy or rent Amazon EC2 capacity to process the data at Amazon’s location and transport only the results.
This highlights what lies at the crux of Data as a Service: simply offering the data is not enough. A level of infrastructure has to be provided, so customers can run apps against the data and then transport the results. This then leads into all the usual questions of how you bill for access to that data: by processing time, by amount of data processed, or by amount of metadata created?
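To make those charging models concrete, here is a minimal sketch of how the three billing dimensions might combine into a single charge. The rates and usage figures are entirely invented for illustration; real Data-as-a-Service pricing would be far more nuanced.

```python
# Hypothetical per-unit rates for a Data-as-a-Service offering.
RATE_PER_CPU_HOUR = 0.10      # billing by processing time
RATE_PER_GB_PROCESSED = 0.02  # billing by amount of data processed
RATE_PER_GB_METADATA = 0.50   # billing by amount of metadata created

def bill(cpu_hours: float, gb_processed: float, gb_metadata: float) -> float:
    """Combine the three charging models into one invoice total."""
    return (cpu_hours * RATE_PER_CPU_HOUR
            + gb_processed * RATE_PER_GB_PROCESSED
            + gb_metadata * RATE_PER_GB_METADATA)

# A job that runs 120 CPU-hours over 500 GB of data, producing 4 GB of metadata:
print(round(bill(cpu_hours=120, gb_processed=500, gb_metadata=4), 2))  # 24.0
```

In practice a provider would pick the mix of dimensions that best tracks its own costs; the point is simply that the infrastructure has to meter all three.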
Then there is the need to ensure that the highest levels of confidentiality are observed around the data. For instance, where medical data is concerned, one objective might be to count the people who have contracted flu in a particular area, without compromising anyone’s privacy in the process. So if healthcare statistics are being offered out ‘as a Service’, you have to ensure the appropriate levels of anonymity are in place, so that, when the data is combined with another source – say, Facebook – individuals’ personal information remains properly protected. And, of course, the smaller the dataset, the harder it is to retain that anonymity.
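One standard way to reason about this risk is k-anonymity: every combination of quasi-identifying attributes (age band, postcode area and so on) should be shared by at least k records, otherwise a lone record can be linked back to a person. The sketch below uses a tiny invented dataset to show why small datasets are the hard case:

```python
from collections import Counter

# Hypothetical, invented health records: (age_band, postcode_prefix, diagnosis).
records = [
    ("30-39", "SW1", "flu"),
    ("30-39", "SW1", "flu"),
    ("30-39", "SW1", "cold"),
    ("40-49", "N1", "flu"),  # a group of one: potentially re-identifiable
]

def k_anonymity(records, quasi_identifiers=(0, 1)):
    """Return the smallest group size over the quasi-identifier columns.

    A result of k means every quasi-identifier combination appears
    at least k times; k == 1 means some record stands alone.
    """
    groups = Counter(tuple(r[i] for i in quasi_identifiers) for r in records)
    return min(groups.values())

print(k_anonymity(records))  # 1 - the lone 40-49/N1 record breaks anonymity
```

Joining such a release with an outside source (a social-network profile listing age and postcode, say) is exactly the linkage attack the k threshold is meant to defeat, and shrinking the dataset shrinks every group towards that dangerous size of one.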
In tackling these challenges, organisations need the ability to effectively control information within the organisation, knowing what their most important information is and where it resides. Equally, Symantec’s goal is to help them understand user behaviour, determine risks and improve productivity. For customers, the result will be a stronger ‘information fabric’ – ie, the layer of metadata that is common across all data types – enabling organisations to get better insight into their data, helping them understand the information they have and its criticality to their business, while eliminating redundancy.
Symantec is also simplifying security by addressing the challenge of managing the many different solutions that companies are now investing in. Its Security as a Service offering, for instance, monitors both Symantec and third-party security products in the environment to deliver the highest levels of protection moving forward. Meanwhile, Data Insight 4.0, the latest version of Symantec’s unstructured data governance solution, provides actionable intelligence into the ownership and usage of unstructured data, such as documents, presentations, spreadsheets and emails. Most importantly, Data Insight 4.0 adds new discovery, analysis and remediation capabilities to help organisations reduce costs and risk, achieve compliance and gain insight into their unstructured data.
All of this serves to reinforce the fact that Bigger Data, in itself, has little value. What gives it its worth is the information analytics applied to it, in a secure environment. Get that formula right and your business will extract maximum payback from its operations. Get it wrong and it’s more likely to be your competitors enjoying those fruits.
Please also check out the latest blog from our CEO Steve Bennett on how we at Symantec are helping people, businesses, and governments protect and manage their information.