I could point readers at the myriad articles on a certain lack of certification for x86 hypervisors, or open the proverbial can of worms that is licensing and support policy... but I'll resist that temptation for now, because there are some genuinely technical concerns that would stop me from running production databases on an x86 hypervisor platform.
Thinking about it logically, databases have long been designed to manage their own memory allocations: features such as AMM (Automatic Memory Management) were built around the principle of dedicated RAM managed by the database. Hypervisors, meanwhile, have their own memory management tools, largely designed around the principle that the sum total of virtual machine memory will exceed the physical memory in the server. Features such as memory reservations, shares and "ballooning" exist to enable over-allocation of physical memory to guest virtual machines and to let the hypervisor worry about managing memory allocations.
This leads to my first concern: where should I enable the memory management features? The database's memory management is unaware of the hypervisor's, and vice versa, so there is no realistic way to co-ordinate the two technologies for optimal dynamic memory allocation across database instances. If I allow the database layer to manage memory, I need to offset that decision by setting static memory reservations from the physical server to the virtual one, which hurts consolidation ratios and agility while adding the operational complexity of manually maintaining hypervisor-to-guest memory allocations, complicated virtual machine placement policies and so on.
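To make the consolidation trade-off concrete, here is a minimal sketch (Python, with purely hypothetical sizing figures) of how fully reserving each database VM's memory caps the number of guests a host can hold — once every byte is reserved, over-allocation is off the table:

```python
# Hypothetical sizing figures for illustration only.
HOST_RAM_GB = 256  # physical RAM in the host

def vm_reservation_gb(sga_gb, pga_gb, os_overhead_gb=4):
    """Full static reservation: database memory plus guest OS overhead."""
    return sga_gb + pga_gb + os_overhead_gb

def max_vms_with_full_reservation(host_ram_gb, reservation_gb):
    """With every VM fully reserved, the hypervisor cannot over-allocate,
    so the consolidation ratio is a simple integer division."""
    return host_ram_gb // reservation_gb

res = vm_reservation_gb(sga_gb=48, pga_gb=12)            # 64 GB per database VM
print(max_vms_with_full_reservation(HOST_RAM_GB, res))   # 4 VMs, fixed
```

The arithmetic is trivial, which is rather the point: full reservations turn the hypervisor's flexible memory economy back into static partitioning.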
If I instead enable memory management at the hypervisor layer, what performance impact can I expect at the database layer? How do I prevent, for instance, my SGA from ending up on physical disk and destroying the performance of my production databases?
In parallel to the memory allocation challenges, I face CPU design and allocation challenges. Mapping virtual CPUs to physical CPUs involves considerations such as hyper-threading and minimising "ready time" on the physical CPU, and the best advice I've been able to find is "start with one virtual CPU and see how things go". That doesn't exactly give me confidence that I can achieve the performance I want when I need it!
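Ready time can at least be measured. A rough sketch in Python, assuming a vCenter-style counter that reports the milliseconds a vCPU spent runnable but unscheduled, summed over a 20-second real-time sample window (the counter shape and the often-quoted warning thresholds are assumptions on my part, not vendor gospel):

```python
def ready_pct(ready_ms, interval_ms=20_000):
    """Percentage of the sample interval a vCPU spent waiting for a
    physical CPU. interval_ms=20_000 assumes a 20-second real-time
    sample window, as exposed by some hypervisor management tools."""
    return 100.0 * ready_ms / interval_ms

# A vCPU that accumulated 1,600 ms of ready time in a 20 s window
# was starved of a physical CPU for 8% of the interval - a figure
# commonly treated as cause for investigation.
print(ready_pct(1_600))  # 8.0
```

The formula is simple; the hard part the article alludes to is that the acceptable value depends on workload, co-scheduling and hyper-threading, which is why the advice degenerates to "start small and watch".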
But even assuming I can resolve my memory and CPU allocation challenges, I'm left with the small issue of storage I/O. Most x86 hypervisors now automatically track-align their volumes, which eliminates one previously notorious headache, but storage I/O optimisation remains far from a solved problem. I recently attended a technical session from a major x86 hypervisor vendor where the presenter announced quite happily, "Storage vendors hate us because of randomised I/O".
An interesting point, I thought! If I run more than one virtual machine on a physical server, the I/O pattern reaching the underlying storage subsystem must be highly randomised, which hurts write performance and defeats the array's prefetch cache mechanisms, thereby slowing reads as well. I also started wondering how I should stripe my LUNs to speed up the I/O coming down from a virtual machine through the hypervisor, and which multi-pathing policies would help me out here. Throw something like Oracle's ASM into the mix (which itself randomises I/O to the hypervisor filesystem before the hypervisor randomises the I/O to the array) and I get pretty confused. I'm struggling to find a way to understand how all this fits together, or to carry out any accurate capacity planning or performance tuning beyond a time-consuming and expensive trial-and-error approach... which could quickly become obsolete if any of my requirements change.
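The randomisation the presenter joked about is easy to demonstrate. A tiny sketch (Python, hypothetical workload) interleaves the perfectly sequential block streams of several VMs into the single mixed queue the array actually sees, and measures how much sequentiality survives:

```python
from itertools import zip_longest

def vm_stream(start_lba, length):
    """One VM's perfectly sequential run of logical block addresses."""
    return list(range(start_lba, start_lba + length))

def hypervisor_merge(*streams):
    """Round-robin interleave - a crude stand-in for how concurrent
    guest I/O arrives at the array as one combined request stream."""
    return [b for tup in zip_longest(*streams) for b in tup if b is not None]

def sequential_fraction(blocks):
    """Share of requests whose LBA immediately follows the previous one -
    the pattern an array's prefetch cache is hoping to see."""
    hits = sum(1 for a, b in zip(blocks, blocks[1:]) if b == a + 1)
    return hits / (len(blocks) - 1)

one_vm = vm_stream(0, 100)
three_vms = hypervisor_merge(vm_stream(0, 100),
                             vm_stream(10_000, 100),
                             vm_stream(20_000, 100))

print(sequential_fraction(one_vm))     # 1.0 - fully sequential
print(sequential_fraction(three_vms))  # 0.0 - looks random to the array
```

Real schedulers batch requests rather than interleaving one block at a time, so this overstates the effect, but the direction is right: three well-behaved sequential workloads become one stream with no exploitable locality.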
And if I ever did figure out how to tune for performance, I would then need to think about high availability and disaster recovery, and the additional complexities these introduce.
All of this thinking started to hurt, so I arrived at the solution of allocating a single virtual machine per physical server, disabling as many dynamic resource allocation features as I could find at the hypervisor layer and essentially running my virtual machine with the full physical hardware resources, exactly as I would with no hypervisor in place... which got me thinking: should I really virtualise production databases?
The alternative is the tried and tested deployment architecture of running database instances on physical servers, leveraging technologies such as Storage Foundation HA from Symantec to provide high-performance, risk-averse compute environments for my production databases, and using the database instance, rather than an entire OS instance, as the unit of granularity for movement across physical servers.
Don't get me wrong: I'm a big advocate of x86 virtualisation and the advantages it brings, but for now I'll be keeping my production databases where I feel they belong rather than virtualising for virtualisation's sake.