Is the era of storage systems (arrays) facing disruption? Do the expensive monolithic chassis sellers need to find new ways to make money? Do the investors betting on newer storage array startups need to cash in now? Although it may feel unlikely in the near term, the perfect storm may not be that far away.
Let us think about how storage arrays came to solve problems for IT. There were two distinct transformations in this industry:
- In the beginning, storage systems originated as a way to offload the computing requirements of RAID (Redundant Array of Independent Disks) from the host. In the days when hard drives had only a few gigabytes of capacity, data centers needed a purpose-built solution with the computing power to implement RAID on spinning disks. It was common to see such storage systems directly connected to servers hosting storage-hungry workloads like database management systems.
- The arrival of Fibre Channel based Storage Area Networks (SANs) reinvigorated the market for storage systems. SANs made it possible for multiple hosts to share the same storage system. This consolidation brought down capital and operating costs while also opening the gates for value-added solutions like clustering across multiple nodes, off-host backups and multi-node file systems.
There are three innovation drivers in the storage system industry. The first is the throughput and capacity of the underlying storage unit, the disk. The second is the interconnect bandwidth between host and storage system. The last is the value-added storage management layer (the storage array controller and its software).
Disks keep getting faster, smaller in form factor and larger in capacity. In the 1990s we saw disks with a few gigabytes of capacity. Now disks typically hold a few terabytes. However, mechanical components have pushed the hard drive to a limit of about 15k RPM, beyond which data integrity becomes questionable. Solid State Disks (SSDs) and flash have raised performance expectations without the limits of mechanical components; however, SSD capacity is still quite limited, so SSDs cannot simply replace hard drives to meet storage needs.
Interconnect bandwidth has outpaced hard disk capacity and throughput. From 40 Mbps for SCSI-1 interconnects, we are now at multiples of 10 Gbps across compute nodes. Thus, we have 250 to 1000x faster interconnects! Arrays today can serve several petabytes over a given interconnect to multiple hosts.
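The speedup range quoted above is simple arithmetic. Here is a minimal sketch, assuming that "multiple 10Gbps" means links between 10 and 40 Gbps. The 40 Gbps upper bound is an assumption chosen because it matches the 1000x figure; the text does not state it.

```python
# Interconnect speedup relative to SCSI-1, all values in megabits per second.
SCSI1_MBPS = 40  # SCSI-1 figure quoted in the text

# Assumed modern link speeds; 40 Gbps is a guess at the upper end of "multiple 10Gbps".
MODERN_LINKS_MBPS = {"10 Gbps": 10_000, "40 Gbps": 40_000}


def speedup(link_mbps, baseline_mbps=SCSI1_MBPS):
    """Return how many times faster a link is than the baseline."""
    return link_mbps / baseline_mbps


speedups = {name: speedup(mbps) for name, mbps in MODERN_LINKS_MBPS.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.0f}x faster than SCSI-1")
```

Under these assumptions a single 10 Gbps link gives the 250x figure and a 40 Gbps link gives the 1000x figure.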
To solve the throughput limitations of spinning disks, several startups appeared with innovative solutions. Generally, they use SSD/flash either in conjunction with hard disks or instead of them. In the first approach, the SSD/flash layer acts as a cache for the host, and software in the array writes to the spinning disks asynchronously. In the second, they replace hard drives with SSDs entirely and address the capacity limitations in software (e.g. with deduplication).
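The caching approach described above is commonly called write-back caching. The following is a toy sketch of the idea, not any vendor's implementation: the class name `WriteBackCache`, its methods, and the use of in-memory dictionaries to stand in for flash and disk are all assumptions made for illustration.

```python
class WriteBackCache:
    """Toy hybrid tier: a fast tier (flash) in front of a slow tier (disk)."""

    def __init__(self):
        self.flash = {}     # fast tier, holds recent writes
        self.disk = {}      # slow tier, the backing store
        self.dirty = set()  # blocks present on flash but not yet on disk

    def write(self, block, data):
        """Acknowledge the write as soon as it lands on flash."""
        self.flash[block] = data
        self.dirty.add(block)

    def read(self, block):
        """Serve from flash when possible, otherwise fall back to disk."""
        if block in self.flash:
            return self.flash[block]
        return self.disk[block]

    def flush(self):
        """Destage dirty blocks to disk; a real array would do this in the background."""
        flushed = len(self.dirty)
        for block in sorted(self.dirty):
            self.disk[block] = self.flash[block]
        self.dirty.clear()
        return flushed


cache = WriteBackCache()
cache.write(7, b"hello")
cached_before_flush = cache.read(7)     # served from the flash tier
on_disk_before_flush = 7 in cache.disk  # the disk has not seen the write yet
flushed = cache.flush()                 # destage to the slow tier
on_disk_after_flush = cache.disk[7]
```

The host sees flash latency on `write`, while the slower disk only has to keep up with `flush`. A real array must also protect dirty blocks against power loss, which this sketch ignores.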
While these are indeed good solutions that may prolong the life of storage systems, there are three other trends working against purpose-built storage systems. A perfect storm from these three waves may devalue the role of storage systems in data centers as organizations demand agility and simplicity for IT workloads:
- Web Scale IT companies are proving that paying a premium for storage hardware is unnecessary. What have we learned from Amazon, Facebook and eBay? These companies have proven that it is cheaper and more operationally efficient to meet storage needs through software on top of commodity hardware.
- The dawn of Software Defined Storage. For organizations that do not have the skills and resources needed to build their own storage management solutions, there are vendors in the market providing software-based solutions to extract the most out of storage in commodity hardware. Two prominent examples are Symantec with its Storage Foundation 6.1, delivering a “shared-nothing” SAN, and VMware with its vSphere 5.5, providing VSAN. These solutions make it possible to simply interconnect multiple x86 systems and provide RAID-capable scale-out storage with a single namespace across all the nodes. Furthermore, enterprise-grade features like SmartIO and Flexible Storage Sharing make it possible to use SSD/flash for its performance while relying on cost-effective hard drives for scaling up capacity.
- Direct buying opportunities from Original Design Manufacturers (ODMs). Why pay a premium for a branded solution from EMC, IBM, HP, NetApp etc. that works only within a box when you can buy general-purpose x86 hardware with enough bays for hard drives and SSDs from an ODM? You can use these to build your own storage solution (the web scale IT way) or turn to storage management vendors like Symantec or VMware for a ready-to-deploy solution.
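To make the "shared-nothing" idea from the Software Defined Storage bullet concrete, here is a toy sketch of a single namespace spread over several commodity nodes, with each object mirrored on two of them. It is not how Storage Foundation or VSAN work internally. The `Node` and `MirroredPool` classes, the hash-based placement and the two-copy policy are all assumptions made for illustration.

```python
import hashlib


class Node:
    """One commodity x86 server with its own local disks (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.online = True


class MirroredPool:
    """Toy single namespace over several nodes, keeping two copies of each object."""

    def __init__(self, nodes):
        if len(nodes) < 2:
            raise ValueError("mirroring needs at least two nodes")
        self.nodes = nodes

    def _replicas(self, key):
        # Hash the key to pick a primary node, then mirror to the next node.
        digest = hashlib.sha256(key.encode()).digest()
        primary = int.from_bytes(digest[:8], "big") % len(self.nodes)
        mirror = (primary + 1) % len(self.nodes)
        return [self.nodes[primary], self.nodes[mirror]]

    def put(self, key, data):
        # Simplification: writes assume both replica nodes are online.
        for node in self._replicas(key):
            node.blocks[key] = data

    def get(self, key):
        # Any surviving replica can serve the read.
        for node in self._replicas(key):
            if node.online and key in node.blocks:
                return node.blocks[key]
        raise IOError(f"no online replica for {key!r}")


nodes = [Node("n1"), Node("n2"), Node("n3")]
pool = MirroredPool(nodes)
pool.put("/db/table1", b"rows")
pool._replicas("/db/table1")[0].online = False  # simulate losing the primary node
survived = pool.get("/db/table1")               # still served by the mirror
```

Callers address data by name only and never see which server holds it. That is the property that lets generic x86 boxes stand in for a dedicated array in this model.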
If the perfect storm becomes reality, modern data centers will have no place for purpose-built storage systems. Today's vendors of expensive monolithic chassis-based storage, and the upstarts too, need to disrupt their own solutions to survive. Vendors with software-defined solutions would dominate the mainstream. The journey to an agile data center would require redefining the way storage is delivered and consumed. It is not sufficient to redefine marketing messages to sell the same wine in a new bottle, something we are starting to see from incumbents.