Storage & Clustering Community Blog

Commoditizing High Availability and Storage using Flexible Storage Sharing

Created: 27 May 2014 • Updated: 11 Jun 2014 • 2 comments

It was during this year's VISION conference in Las Vegas that I got a very interesting question from Mike. He was running my Flexible Storage Sharing (FSS) lab and he asked me: “So Carlos, with FSS I can use internal HDDs and provide a highly available service, commoditizing the hardware and avoiding any need for a SAN?” I started talking about the work I have been doing in the lab over the last few months, and that is when I realized I had to start writing and publishing about it, so here we go.

During the FSS deployment we were all very excited about the capability to bring any application closer to the CPU, especially when using internal SSDs. We worked very closely with Intel and we published the white paper Remove the Rust: Unlock DAS and go SAN-Free. That white paper described how I could quadruple the performance of a database. But what happens if I do not want to increase performance, but simply want to commoditize using internal storage and reduce my Total Cost of Ownership?

A few months back we got some new servers in the lab, each with 25 internal HDDs and one flash card inside the server. With FSS I could build an environment providing high availability for any service running on top. My first step was therefore to create a basic configuration, similar to what we did in the white paper mentioned above.

[Figure: FSS_Mirror_2.png]

The difference here is that I am going to use internal HDDs for both data and redo logs. To accelerate performance, an internal flash card will be used with SmartIO. I want to be very clear here: the next comparison may not be entirely fair, because in the SAN environment I had different servers, almost six times more SGA, but also a bigger database (1.4TB versus 700GB).
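For anyone who wants to build something similar, the basic building blocks are just a handful of commands. The following is only a rough sketch with made-up disk, disk group and mount point names (and the sfcache syntax in particular can differ between Storage Foundation releases), not the exact configuration I used:

# On each node: export the internal (DAS) disks so that the other
# cluster nodes can see them over the cluster interconnect
vxdisk export intdisk01
vxdisk export intdisk02

# On one node: create a shared disk group with FSS enabled,
# using exported disks from both servers
vxdg -s -o fss=on init fssdg node1_intdisk01 node2_intdisk01

# Volumes whose mirrors live on different hosts, so data and
# redo logs survive the loss of an entire server
vxassist -g fssdg make datavol 500g nmirror=2 mirror=host
vxassist -g fssdg make redovol 50g nmirror=2 mirror=host

# Turn the internal flash card into a SmartIO cache area and
# enable caching for the file system holding the data files
sfcache create -t VxFS ssd0_0
sfcache enable /data1

Once the volumes exist, the cluster file system and the database simply see standard VxVM volumes; the fact that the plexes live on internal disks of different servers is transparent to the layers above.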

But my goal was to see what performance I could get from this new environment and compare it with something very similar we had done previously. The results are very interesting. In the SAN environment I was able to get around 81K transactions per minute. And, to be clear again, that was the best performance I got over many runs, where it was difficult to get the same performance twice. Our IT guys told me many times: “Carlos, you are not the only one using the SAN!” With my new environment, on the other hand, I have been able to get 77K transactions per minute run after run.

[Figure: SAN vs HDD.png — transactions per minute, SAN environment versus the new internal HDD environment]

The interesting thing is that this new architecture is 60% cheaper (including hosts, storage, interconnects and software licenses) than the SAN one. My next step was to grow it to a three-node cluster, where I can use my spare disks to run one database instance in each of the servers. I will describe that in the next article.

Carlos Carrero.-

 

Comments (2)

mikebounds:

Hi Carlos, do you have any information on the hardware used? Were you using high-end internal disks and 10Gb Ethernet with RDMA? When I posed the question, I was thinking more of the customers I meet who massively over-spec their hardware, for whom FSS could be a more appropriate solution.

I worked somewhere last year that had 100 databases. A few of them, fewer than 10, took advantage of being on fast disks provided by the SAN, but the other 90 had modest requirements, both in terms of CPU and I/O. Yet, as I have seen many times before, the customer put everything on a high-end SAN and high-end servers. The reason tends to be that the databases are very important and need HA, and if you need HA then traditionally you need your data on SAN or NAS. Some customers I meet don't implement different tiers of shared storage, so they put all applications that need HA on high-end SAN disks.

You can buy blades that have 6 to 16 internal bays and come with built-in RAID6 and RAID10 controllers, and with 1TB disks you can have more than 10TB of usable space (after using disks for parity and hot spares).

So you could have a blade with cheap internal SATA disks configured with RAID6 or RAID10 and then mirror this using FSS with another blade. Some of these blades even come with 10Gb Ethernet, but even with 1Gb Ethernet the I/O is not going to be slow; in particular, for reads you can read the local plex, which should out-perform NAS.

So FSS with lower-end disks could provide a cheaper solution for HA environments where the write performance sits somewhere between NAS and SAN and I suspect the read performance could actually meet or exceed that of the SAN, especially if you use an internal SSD with SmartIO.

Also note that if you use RAID on the internal disks and then mirror using FSS where the two nodes are physically separate (anything from opposite sides of the same data centre to a few km away), this provides a level of redundancy that exceeds that of a single disk array, because even though a disk array has redundant hardware inside, it is all physically together and so at risk of local issues like fire and flood (or a leaky roof, which I have seen).

Mike

UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-UX, Linux & Windows


ccarrero:

Hey Mike! I really appreciate your comments; you make excellent points. For this specific test I used x86 servers with 25 x 300GB 15K RPM disks. As the diagram shows, I only used 6 disks for data and one for redo logs (in each server). In this particular scenario I used Volume Manager to create an internal stripe within the server for the data disks; as you pointed out, you may also want to use a controller for that. To accelerate reads I had one FusionIO card per server. The interconnect used is InfiniBand with RDMA. I am going to use the rest of the disks to hold other databases, and I will publish some results shortly.
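Just to illustrate the layout (the names and sizes below are examples only, not the exact lab configuration), the internal stripe and the mirror across nodes can be created with a single vxassist call:

# Stripe the six local data disks within each server, and mirror
# that stripe against the matching set on the other node
vxassist -g fssdg make datavol 600g \
    layout=stripe ncol=6 nmirror=2 mirror=host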

We had some challenges and discussions when naming this new capability internally. Some people called it “shared nothing” for a while, but that was not a true name, as we still have the capability to share data and combine it with traditional SAN. So finally we arrived at the concept of “Flexible”, and that is what this new capability is about. You defined it quite well: I may have some data on the SAN, but I also want to take advantage of the new internal storage capabilities, and you can do this in either direction. As you noticed in the lab and in your further testing, nothing really changes with the traditional Cluster Volume Manager. We keep the same storage virtualization layer, but now you can combine it with local storage and the SmartIO capability to accelerate performance.
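And just as a hypothetical illustration of that flexibility (all disk and volume names below are invented), a SAN LUN and an exported internal disk can live in the same FSS disk group and mirror each other:

# The SAN LUN is already visible to every node; the internal disk
# was exported from its owning node with "vxdisk export"
vxdg -g fssdg adddisk san_lun01 node2_intdisk01

# A two-way mirror restricted to those two disks, so one copy
# stays on the SAN and the other on the internal storage
vxassist -g fssdg make mixvol 200g nmirror=2 san_lun01 node2_intdisk01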

While you have the use case of running some databases on commodity HDDs, we have a customer with the exact scenario you mentioned. Only a few databases needed extreme performance, while the others had normal performance requirements. So why buy a very expensive system that only a few databases will benefit from? Instead, this customer used internal SSDs protected by FSS to have those databases working really close to the CPU, leaving the other databases on their normal SAN storage.

Thank you again.

Carlos.-
