SQL & Exchange performance in a Virtual Machine

The below is something I see far too often: a SQL or Exchange virtual machine using a single LSI Logic SAS virtual SCSI controller.

[Image: VM configured with a single LSI Logic SAS controller and a single virtual disk]

What is even worse is a virtual machine using a single LSI controller and a single virtual disk for one or more databases and logs (as shown above).

Why is this so common?

Probably because the LSI Logic SAS controller is the default for Windows 2008/2012 virtual machines and additional SCSI controllers are not automatically added until you have more than 16 virtual disks for a single VM.

Why is this a problem?

The LSI controller has a queue depth limit of 128, compared to the default limit of 256 for PVSCSI, which can also be tuned up to 1,024 for higher performance requirements.

As a result, a configuration with a single LSI controller and/or a limited number of virtual disks can artificially constrain the underlying storage from delivering the performance it is capable of.
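To put that in perspective, here is a minimal Python sketch using only the adapter-level queue depths quoted above (the four-controller layout is the one recommended later in this post) to show how much aggregate queue depth the guest gains from switching controllers:

```python
# A minimal sketch comparing aggregate adapter queue depth for the controller
# configurations discussed in this post. Numbers are the adapter-level limits
# quoted above; per-device queue depths are not considered here.

LSI_ADAPTER_QUEUE_DEPTH = 128      # LSI Logic SAS limit
PVSCSI_DEFAULT_QUEUE_DEPTH = 256   # PVSCSI default
PVSCSI_TUNED_QUEUE_DEPTH = 1024    # PVSCSI tuned for high performance

configs = {
    "1 x LSI Logic SAS": 1 * LSI_ADAPTER_QUEUE_DEPTH,
    "4 x PVSCSI (default)": 4 * PVSCSI_DEFAULT_QUEUE_DEPTH,
    "4 x PVSCSI (tuned)": 4 * PVSCSI_TUNED_QUEUE_DEPTH,
}

for name, depth in configs.items():
    print(f"{name:<22} aggregate adapter queue depth = {depth}")
```

A single LSI controller gives the guest 128 outstanding I/Os at the adapter level, while four default PVSCSI controllers give 1,024 and four tuned controllers give 4,096.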

Another problem with the LSI controller is that it consumes more CPU than the PVSCSI controller for the same I/O levels. This means you're wasting virtual machine (and underlying host) CPU resources unnecessarily.

Using more CPU can also lead to other problems such as CPU Ready, which in turn reduces performance.

A colleague and friend of mine, Michael Webster wrote a great post titled: Performance Issues Due To Virtual SCSI Device Queue Depths where he shows the performance difference between SATA, LSI and PVSCSI controllers. I highly recommend having a read of this post.

What is the solution?

Using multiple Paravirtual (PVSCSI) adapters with virtual disks evenly spread over the four controllers for Windows virtual machines is a no-brainer!

This results in:

  1. Higher default queue depth
  2. Lower CPU overheads
  3. Higher potential performance

How do I configure this?

It's fairly straightforward, but don't just change the LSI controller to PVSCSI, as the Guest OS may not have the driver installed, which will result in the VM failing to boot.

To avoid this, simply edit the virtual machine and add a new virtual disk of any size, select SCSI (1:0) for the virtual device node, and follow the prompts.

[Image: Adding a new virtual disk on virtual device node SCSI (1:0)]

Once the new virtual disk is added, you should see that a new LSI Logic SAS SCSI controller has been added, as shown below.

[Image: New LSI Logic SAS SCSI controller added to the VM]

Next, highlight the adapter, select "Change Type" in the top right-hand corner of the window and select Paravirtual. Once this is complete you should see something similar to the below:

[Image: New SCSI controller changed to VMware Paravirtual]

Next hit “Ok” and the new Controller and virtual disk will be added to the VM.
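If you prefer scripting this step rather than clicking through the GUI, the following is a rough pyVmomi sketch of adding a Paravirtual SCSI controller on bus 1 directly. It is only an illustrative alternative to the GUI workflow above, not part of the original process; the vCenter address, credentials and VM name are placeholders you would replace with your own.

```python
# Illustrative pyVmomi sketch: add a Paravirtual SCSI controller on SCSI bus 1.
# The vCenter host, credentials and VM name are placeholders for this example.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="VMware1!",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

# Find the VM by name (hypothetical VM name).
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "SQL-VM-01")

# Build the new PVSCSI controller on SCSI bus 1.
controller = vim.vm.device.ParaVirtualSCSIController()
controller.busNumber = 1
controller.sharedBus = vim.vm.device.VirtualSCSIController.Sharing.noSharing
controller.key = -101  # temporary negative key for a device being added

ctrl_spec = vim.vm.device.VirtualDeviceSpec()
ctrl_spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.add
ctrl_spec.device = controller

# Reconfigure the VM (waiting on the returned task is omitted for brevity).
vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[ctrl_spec]))
Disconnect(si)
```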

Now we open the console of the VM, open Computer Management and go to Device Manager. Under Storage controllers you should now see VMware PVSCSI Controller, as shown below.

[Image: Device Manager showing the VMware PVSCSI Controller under Storage controllers]
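As a quick alternative to clicking through Device Manager, a small in-guest check like the sketch below can confirm the PVSCSI controller is present. This is my own addition: it simply queries the Win32_SCSIController WMI class from inside the Windows guest rather than following the Device Manager steps above.

```python
# Run inside the Windows guest: list SCSI/storage controllers via WMI and check
# that the VMware PVSCSI controller is present before converting SCSI controller 0.
import subprocess

output = subprocess.run(
    ["wmic", "path", "Win32_SCSIController", "get", "Name"],
    capture_output=True, text=True, check=True,
).stdout

print(output)
if "PVSCSI" in output:
    print("VMware PVSCSI controller detected - safe to change the boot controller.")
else:
    print("PVSCSI controller not found - do NOT convert SCSI controller 0 yet.")
```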

Now it is safe to shut down the VM.

Once the VM is shut down, edit the VM settings, highlight SCSI Controller 0, select Change Type as we did earlier, and choose Paravirtual. Once this is done you will see the original controller replaced with a new controller.

[Image: SCSI Controller 0 changed from LSI Logic SAS to Paravirtual]

Now that we have the boot drive changed to PVSCSI, we can balance the data drives across up to four PVSCSI controllers for maximum performance.

To do this, simply highlight a Virtual Disk and drop down the Virtual Device Node and select SCSI (1:0) or any other available slot on the SCSI (1:x) controller.

[Image: Changing a virtual disk's Virtual Device Node to SCSI (1:0)]

After doing this you will see new SCSI controllers appear, and you will need to change these to Paravirtual as we did for the first controller.

[Image: Multiple virtual disks spread across the additional SCSI controllers]

For each of the virtual disks, ensure they are placed evenly across the PVSCSI controllers. For example, if you have a VM with eight virtual disks plus the OS disk, it should look like this:

Virtual Disk 1 (OS) : SCSI (0:0)
Virtual Disk 2 : SCSI (0:1)
Virtual Disk 3 : SCSI (1:0)
Virtual Disk 4 : SCSI (1:1)
Virtual Disk 5 : SCSI (2:0)
Virtual Disk 6 : SCSI (2:1)
Virtual Disk 7 : SCSI (3:0)
Virtual Disk 8 : SCSI (3:1)
Virtual Disk 9 : SCSI (0:2)

This results in two data virtual disks per PVSCSI controller, which evenly distributes I/O across all controllers, with the exception of the first controller (SCSI 0), which also hosts the OS drive.
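If you need to plan this layout for a different number of disks, the following minimal Python sketch (my own illustration, not from the original post) places the OS disk on SCSI (0:0) and spreads the data disks evenly across four PVSCSI controllers. It produces the same balance as the table above (two data disks per controller), although the exact slot ordering may differ.

```python
# Sketch: place the OS disk on SCSI (0:0) and round-robin data disks across
# up to four PVSCSI controllers, starting on SCSI 1 so controller 0 fills last.

def plan_layout(num_data_disks, num_controllers=4):
    """Return (disk label, device node) pairs with data disks spread evenly."""
    layout = [("Virtual Disk 1 (OS)", "SCSI (0:0)")]
    next_unit = [1] + [0] * (num_controllers - 1)  # unit 0:0 is taken by the OS disk
    for i in range(num_data_disks):
        bus = (i + 1) % num_controllers            # start data disks on SCSI 1
        unit = next_unit[bus]
        if unit == 7:                              # unit 7 is reserved for the controller
            unit = 8
        next_unit[bus] = unit + 1
        layout.append((f"Virtual Disk {i + 2}", f"SCSI ({bus}:{unit})"))
    return layout

for disk, node in plan_layout(8):
    print(f"{disk:<22}: {node}")
```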

What if I have problems?

On occasion I have seen problems with this process that resulted in VMs not booting; however, these issues are easy to fix.

If your VM fails to boot with a message like "Operating System not found", I suggest you panic! Just kidding, this typically just means the boot order of the virtual machine has been screwed up. Go into the BIOS and check that the boot order shows the PVSCSI controller and that the correct virtual disk is set as the first priority.

If the VM boots and BSODs, or crashes and goes into a continuous reboot loop, power off the VM and set the first SCSI controller (where the boot disk is attached) back to LSI. Then boot the VM and make sure the PVSCSI driver is showing up (if it's not, you didn't follow the above instructions, so go back and follow them until the PVSCSI driver is loaded and working), then shut down and change the SCSI controller back to PVSCSI and you should be fine.

If the VM boots and one or more drives do not show up in My Computer, go into Disk Management and you may see the drives are marked as offline. Simply right-click each drive, mark it as online, reboot, and you're good to go.

Summary:

If you have made the intelligent move to virtualize your business-critical applications, firstly congratulations! However, as with physical hardware, virtual machines also have optimal configurations, so make sure you use PVSCSI controllers with multiple virtual disks and have your DBA span the database across multiple virtual disks for maximum performance.

The following post shows how to do this in detail:

Splitting SQL datafiles across multiple VMDKs for optimal VM performance

If the DBA is not confident doing this, you can also just add multiple virtual disks (connected via multiple PVSCSI controllers) and create a stripe in-guest (via Disk Management), which will also give you the benefit of multiple vdisks.

Related Articles:

1. Peak Performance vs Real World Performance

2. Enterprise Architecture & Avoiding tunnel vision

3. Microsoft Exchange 2013/2016 Jetstress Performance Testing on Nutanix Acropolis Hypervisor (AHV)

Peak performance vs Real World – Exchange on Nutanix Acropolis Hypervisor (AHV)

I wrote a post in April 2015 titled “Peak Performance vs Real World Performance” which discusses how benchmarks are not realistic and the performance shown in benchmarks can rarely be reproduced with real workloads. It has been one of my most popular posts, and I have had overwhelmingly positive feedback, with only a select few still pushing unrealistic peak performance benchmarks as being of value to customers.

I thought I would whip up a post showing an example of benchmarks vs real world performance requirements using MS Exchange Jetstress on Nutanix.

The below is a screen shot from the Nutanix PRISM HTML-based GUI showing a virtual machine's Read/Write IOPS, bandwidth and latency during an MS Exchange Jetstress benchmark.

[Image: Nutanix PRISM showing VM Read/Write IOPS, bandwidth and latency during the Jetstress run]

The screen shot shows ~4000 Read IOPS and ~4000 Write IOPS at a latency of 1.59ms.

But what does the above really tell us and what does it mean to a customer?

I’ve been quoted as saying “Benchmarks are of little value without context specific to customer requirements!” and I stand by this statement.

Let's now look at an example of a real customer's requirements:

The below is from the Exchange Server Role Requirements Calculator. It is a screen shot from the Role Requirements tab showing an estimate of the IOPS required for the databases and logs of a single Exchange instance.

[Image: Exchange Server Role Requirements Calculator showing database and log IOPS]

It shows the required IOPS being 536 for the databases and 115 for the logs.

Note: The sizing calculator was for an environment supporting 20000 mailboxes across 3 mailbox servers. As such, the above IO requirements are for ~6666 users.

So now that we have done the MS Exchange solution sizing (shown above are just the storage performance requirements), we understand the requirement to be around 651 mixed Read/Write IOPS per mailbox VM. We can then take a benchmark such as Jetstress and validate that the solution has sufficient storage performance.
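To make the gap concrete, here is a quick Python sketch of the arithmetic using only the figures quoted in this post (the per-user rate and scale factor are simply derived from them):

```python
# Quick arithmetic using the figures quoted above: the required IOPS per mailbox VM,
# the implied per-user rate, and how far each instance would need to scale up to
# actually need the ~8000 IOPS the Jetstress run delivered.

db_iops, log_iops = 536, 115          # from the Exchange role requirements calculator
users_per_server = 20000 / 3          # 20,000 mailboxes across 3 mailbox servers

required_iops = db_iops + log_iops    # ~651 mixed read/write IOPS per mailbox VM
iops_per_user = required_iops / users_per_server

jetstress_iops = 8000                 # ~4000 read + ~4000 write shown in PRISM
scale_factor = jetstress_iops / required_iops

print(f"Required IOPS per mailbox VM : {required_iops}")
print(f"Implied IOPS per user        : {iops_per_user:.3f}")
print(f"Scale-up needed to require the Jetstress result: ~{scale_factor:.1f}x")
```

That works out to roughly 0.1 IOPS per user and a scale-up of more than 12x before the benchmark's peak IOPS would actually be required.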

To require the ~8000 IOPS the Jetstress test showed, we would need to scale up each Exchange instance to support a much larger number of users, with each user sending/receiving 500 emails per day, to reach this requirement.

[Image: Jetstress result showing ~8000 IOPS]

But in scaling up each Exchange instance to reach the peak IOPS that even this 3-year-old generation Nutanix node can deliver, we would vastly exceed the compute sizing recommendations for Exchange 2013 (24 vCPUs and 96GB RAM), as shown by the calculator below.

[Image: Exchange calculator showing the scaled-up mailbox server exceeding the recommended compute maximums]

As we can see, for an Exchange instance to require those peak IOPS, we would have to size the Mailbox server VMs with more than 10x the recommended vCPUs (24) and 15x the RAM (96GB). This shows that peak IOPS which can be achieved are not relevant in the real world.

In fact, Exchange generally does not require more than 1000 IOPS. Typically it requires much less, as my earlier example shows. So peak performance numbers are of little to no value, as they can't (and, more importantly, don't need to be) reproduced in the real world.

With a tool like Jetstress we can configure a precise mailbox profile and test only what you require. If the solution can produce more IOPS than you need (such as in this example), that's fine for headroom, but in this day and age where Nutanix allows you to quickly and easily scale (compute/storage performance & capacity), I recommend designing for what you need in the foreseeable future (by this I mean 6-12 months) and scaling if/when required.

What a benchmark does help you understand is how much headroom a solution has over and above your requirements, which can help you choose a solution to support mixed workloads. BUT the benchmark would need to be re-run concurrently with suitable benchmarks for all the other applications you intend to mix, to see how the solution behaves with mixed workloads.

As such, single-application peak performance benchmarks are almost never valuable (to customers) unless you're planning to run application-specific silos. I strongly recommend anyone considering implementing an application-specific silo read the following article: Enterprise Architecture & Avoiding tunnel vision.

And… if you're planning to run application-specific silos and/or scaling up workloads to the point where they need crazy IOPS, then you're increasing the size of your failure domains, CAPEX and OPEX, which is only doing yourself (or your customer) a disservice. But that's a topic for another day.

I hope this example shows how real world requirements and performance are vastly different from what a benchmark shows, and why peak performance benchmarks should be taken with a grain of salt.

I’ve always said the focus should be on gathering requirements and delivering on business outcomes, not focusing on performance which is typically only a very small part of a solution that delivers a successful business outcome.

Summary:

When sizing an MS Exchange solution on Nutanix, IOPS is not a constraining factor, even for large-scale deployments. The most common constraining factor is the Microsoft recommended compute maximum of 24 vCPUs and 96GB RAM, which is the same constraint regardless of whether you run on Nutanix or any other virtual/physical platform.

Related Articles: