Average storage utilization paints an incomplete picture
Some of the data I plan to analyze in this blog involve scale/size, including utilization. I've heard industry analysts and experts cite storage utilization rates around 40%; Nicholas Carr recently spoke at Symantec and noted that at roughly 30%, storage was much more utilized than CPU (~12%). Since running a production website, I've gained a visceral awareness of what I previously knew only conceptually. Namely, storage utilization is a tradeoff between two goals: keeping spare capacity in case it is needed, and keeping storage cost down.
I analyzed aggregate data uploaded to SORT by customers to look deeper into utilization. This analysis of file system utilization suggests that measuring only the average is misleading. Looking at the overall distribution gives a richer picture.
As seen in Figure 1 below, storage utilization is bimodal, i.e., it has two distinct peaks, one at 1% and one at 100%. In the context of storage utilization, this has important implications.

Figure 1. Average utilization is 57%, higher than the consensus figures. This may be because VxFS achieves higher utilization than the industry average, because of workload mix, or because of different methodologies.
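To make the point about averages concrete, here is a minimal sketch that bins per-file-system utilization into a text histogram and prints the mean alongside it. The sample values and the `utilization_histogram` helper are hypothetical, made up for illustration; they are not the SORT data behind Figure 1.

```python
from statistics import mean

def utilization_histogram(utilizations, bin_width=10):
    """Count file systems per utilization bin (values are percentages, 0-100).

    Assumes bin_width divides 100 evenly.
    """
    n_bins = 100 // bin_width
    counts = [0] * n_bins
    for u in utilizations:
        # A file system at exactly 100% goes into the last bin.
        idx = min(int(u // bin_width), n_bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical sample, chosen only to show a bimodal shape.
sample = [1, 2, 3, 5, 8, 40, 55, 60, 92, 97, 99, 100]

print(f"mean utilization: {mean(sample):.1f}%")
for i, count in enumerate(utilization_histogram(sample)):
    lo = i * 10
    print(f"{lo:3d}-{lo + 10:3d}%  {'#' * count}")
```

In this made-up sample the mean lands in the middle of the range, where few file systems actually sit; most are clustered near empty or near full.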
In this set of systems, there are many ways to raise utilization from 57% to 60%. Consider three:
1. increase utilization by 5% (relative to its current level) for every file system
2. increase utilization to 100%, but only for file systems that are currently above 75% utilized
3. increase utilization to 25%, but only for file systems that are currently below 25% utilized
The first option requires reclaiming space on every file system; that’s a lot of work. The second option touches fewer systems, but adds risk by removing the free space buffer: if an application goes offline because it runs out of space, most would consider the storage saved a false economy. The third option minimizes the work and adds little risk, since 75% of the space on each affected file system remains free.
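The three options can be compared with a rough sketch like the one below. It assumes equally sized file systems (so the average is a simple mean), reads "by 5%" in option 1 as a relative increase capped at 100%, and uses a hypothetical list of utilizations. The function names are mine, and the numbers will not reproduce the 57% and 60% figures from the real data.

```python
from statistics import mean

def raise_all(utils, factor=1.05):
    """Option 1: relative increase on every file system, capped at 100%."""
    return [min(u * factor, 100.0) for u in utils]

def fill_high(utils, threshold=75.0):
    """Option 2: push file systems above the threshold to 100%."""
    return [100.0 if u > threshold else u for u in utils]

def lift_low(utils, floor=25.0):
    """Option 3: bring file systems below the floor up to the floor."""
    return [floor if u < floor else u for u in utils]

def touched(before, after):
    """Number of file systems whose utilization changed."""
    return sum(1 for b, a in zip(before, after) if a != b)

# Hypothetical utilizations (percent), one entry per file system.
fleet = [1, 2, 5, 10, 50, 60, 80, 90, 95, 100]

for name, option in [("raise all", raise_all),
                     ("fill high", fill_high),
                     ("lift low", lift_low)]:
    result = option(fleet)
    print(f"{name:10s} mean={mean(result):5.1f}%  "
          f"touched={touched(fleet, result)}  "
          f"full={sum(1 for u in result if u >= 100)}")
```

The `touched` count is a stand-in for the amount of work, and the `full` count for the risk of running out of space; both are crude proxies, but they make the tradeoff between the options visible.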
To increase storage utilization, recall the words of bank robber Willie Sutton. When asked why he robbed banks, he said: “Because that’s where the money is.” When aiming to improve storage utilization, start with the file systems with very low utilization.
While a full discussion is too long for this forum, here are a couple of links on reclaiming storage: migrating to thin storage using SmartMove, and shrinking file systems and volumes to free LUNs.