SQL & Exchange performance in a Virtual Machine

The below is something I see far too often: a SQL or Exchange virtual machine using a single LSI Logic SAS virtual SCSI controller.

LSIlogic

What is even worse is a virtual machine using a single LSI controller and a single virtual disk for one or more databases and logs (as shown above).

Why is this so common?

Probably because the LSI Logic SAS controller is the default for Windows 2008/2012 virtual machines and additional SCSI controllers are not automatically added until you have more than 16 virtual disks for a single VM.

Why is this a problem?

The LSI controller has a queue depth limit of 128, compared to a default limit of 256 for PVSCSI, which can also be tuned up to 1024 for higher performance requirements.

As a result, a configuration with a single LSI controller and/or a limited number of virtual disks can artificially, and significantly, prevent the underlying storage from delivering the performance it is capable of.
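To put rough numbers on this, Little's Law gives a ceiling: sustained IOPS cannot exceed the number of outstanding I/Os divided by the average latency. The sketch below is illustrative only. The queue depths are the ones quoted above, while the 1 ms latency and the function name `max_iops` are assumptions made for the example, not measurements.

```python
def max_iops(queue_depth: int, latency_ms: float) -> float:
    """Upper bound on IOPS from Little's Law: outstanding I/Os / latency."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    # Multiply before dividing so whole-number inputs stay exact.
    return queue_depth * 1000.0 / latency_ms


if __name__ == "__main__":
    assumed_latency_ms = 1.0  # assumption for illustration only
    for name, depth in [("LSI Logic SAS", 128),
                        ("PVSCSI (default)", 256),
                        ("PVSCSI (tuned)", 1024)]:
        ceiling = max_iops(depth, assumed_latency_ms)
        print(f"{name}: at most {ceiling:,.0f} IOPS per controller")
```

Whatever the real latency of your storage is, the ratio between the controllers stays the same: a deeper queue raises the ceiling proportionally.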

Another problem with the LSI controller is that it uses more CPU than the PVSCSI controller for the same IO levels. This means you’re wasting the virtual machine’s (and the underlying host’s) CPU resources unnecessarily.

Using more CPU could lead to other problems such as CPU Ready, which can also lead to reduced performance.

A colleague and friend of mine, Michael Webster, wrote a great post titled “Performance Issues Due To Virtual SCSI Device Queue Depths” where he shows the performance difference between SATA, LSI and PVSCSI controllers. I highly recommend having a read of this post.

What is the solution?

Using multiple Paravirtual (PVSCSI) adapters, with virtual disks evenly spread over the four controllers, is a no-brainer for Windows virtual machines!

This results in:

  1. Higher default queue depth
  2. Lower CPU overheads
  3. Higher potential performance

How do I configure this?

It’s fairly straightforward, but don’t just change the LSI controller to PVSCSI, as the guest OS may not have the driver installed, which will result in the VM failing to boot.

To avoid this, simply edit the virtual machine and add a new virtual disk of any size. For the virtual device node, select SCSI (1:0) and follow the prompts.

VirtualDiskSCSI10

Once the new virtual disk is added, you should see that a new LSI Logic SAS SCSI controller has been added, as shown below.

NewLSIController

Next, highlight the adapter, select “Change Type” in the top right hand corner of the window and select Paravirtual. Once this is complete you should see something similar to the below:

AddPVSCSIController

Next hit “Ok” and the new Controller and virtual disk will be added to the VM.

Now we open the console of the VM, open Computer Management and go to Device Manager. Under Storage Controllers you should now see VMware PVSCSI Controller as shown below.

DeviceManagerPVSCSI

Now we are safe to shut down the VM.

Once the VM is shut down, edit the VM settings, highlight SCSI Controller 0, select Change Type as we did earlier and select Paravirtual. Once this is done you will see the original controller is replaced with a new controller.

ChangeLSItoPVSCSI

Now that we have the boot drive changed to PVSCSI, we can balance the data drives across up to four PVSCSI controllers for maximum performance.

To do this, simply highlight a Virtual Disk and drop down the Virtual Device Node and select SCSI (1:0) or any other available slot on the SCSI (1:x) controller.

ChangeControllerID

After doing this you will see new SCSI controllers appear, and you need to change these to Paravirtual as we did for the first controller.

ChangeControllerIDMultipleVdisks
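If you would rather double check the result than trust the dialog, the controller type is also recorded in the VM's .vmx file as `scsiN.virtualDev` entries (for example `pvscsi` or `lsisas1068`). The following is a minimal sketch that scans .vmx text for controllers that are not yet Paravirtual. The sample content is made up for illustration and the function names are my own, not part of any VMware tooling.

```python
import re

# Matches lines such as: scsi0.virtualDev = "pvscsi"
_VIRTUAL_DEV = re.compile(
    r'^\s*scsi(\d+)\.virtualDev\s*=\s*"([^"]*)"\s*$', re.IGNORECASE
)


def controller_types(vmx_text: str) -> dict[int, str]:
    """Return {controller number: virtualDev value} found in .vmx text."""
    found = {}
    for line in vmx_text.splitlines():
        match = _VIRTUAL_DEV.match(line)
        if match:
            found[int(match.group(1))] = match.group(2).lower()
    return found


def non_pvscsi(vmx_text: str) -> list[int]:
    """Controller numbers whose type is anything other than pvscsi."""
    types = controller_types(vmx_text)
    return sorted(n for n, kind in types.items() if kind != "pvscsi")


# Hypothetical sample, not taken from a real VM.
SAMPLE = '''
scsi0.virtualDev = "pvscsi"
scsi1.virtualDev = "lsisas1068"
scsi2.virtualDev = "pvscsi"
'''

print("Controllers still to change:", non_pvscsi(SAMPLE))
```

An empty list means every SCSI controller in the file is already Paravirtual.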

For each of the virtual disks, ensure they are placed evenly across the PVSCSI controllers. For example, if you have a VM with eight virtual disks plus the OS disk, it should look like this:

Virtual Disk 1 (OS) : SCSI (0:0)
Virtual Disk 2 (Data) : SCSI (0:1)
Virtual Disk 3 (Data) : SCSI (1:0)
Virtual Disk 4 (Data) : SCSI (1:1)
Virtual Disk 5 (Data) : SCSI (2:0)
Virtual Disk 6 (Data) : SCSI (2:1)
Virtual Disk 7 (Data) : SCSI (3:0)
Virtual Disk 8 (Data) : SCSI (3:1)
Virtual Disk 9 (Data) : SCSI (0:2)

This results in two data virtual disks per PVSCSI controller, which evenly distributes IO across all controllers, the one exception being that the first controller (SCSI 0) also hosts the OS drive.
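A quick way to sanity check a layout like this is to count the disks on each controller. The sketch below parses the "SCSI (controller:unit)" labels from the example above; the helper name `disks_per_controller` is hypothetical.

```python
import re
from collections import Counter

_NODE = re.compile(r"SCSI \((\d+):(\d+)\)")


def disks_per_controller(nodes: list[str]) -> Counter:
    """Count virtual disks on each SCSI controller from 'SCSI (c:u)' labels."""
    counts = Counter()
    for node in nodes:
        match = _NODE.fullmatch(node.strip())
        if match is None:
            raise ValueError(f"not a virtual device node: {node!r}")
        counts[int(match.group(1))] += 1
    return counts


# The nine-disk example: the OS disk first, then eight data disks.
layout = ["SCSI (0:0)", "SCSI (0:1)", "SCSI (1:0)", "SCSI (1:1)", "SCSI (2:0)",
          "SCSI (2:1)", "SCSI (3:0)", "SCSI (3:1)", "SCSI (0:2)"]

counts = disks_per_controller(layout)
for controller, total in sorted(counts.items()):
    print(f"SCSI controller {controller}: {total} disk(s)")
```

If the largest and smallest counts differ by more than one, the disks are not spread as evenly as they could be.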

What if I have problems?

On occasion I have seen problems with this process which have resulted in VMs not booting; however, these issues are easy to fix.

If your VM fails to boot with a message like “Operating System not found”, I suggest you panic! Just kidding. This typically just means the boot order of the virtual machine has been screwed up. Go into the BIOS and check that the boot order shows the PVSCSI controller, with the correct virtual disk as first priority.

If the VM boots and then BSODs or crashes and goes into a continuous reboot loop, power off the VM and set the first SCSI controller (where the boot disk resides) back to LSI. Then boot the VM and make sure the PVSCSI driver is showing up. If it’s not, you didn’t follow the above instructions, so go back and follow them so the PVSCSI driver is loaded and working. Then shut down, change the SCSI controller back to PVSCSI and you should be fine.

If the VM boots and one or more drives do not show up in My Computer, go into Disk Management and you may see the drives are marked as offline. Simply right click the drive, mark it as online and reboot, and you’re good to go.

Summary:

If you have made the intelligent move to virtualize your business critical applications, firstly congratulations! However, as with physical hardware, virtual machines also have optimal configurations, so make sure you use PVSCSI controllers with multiple virtual disks and have your DBA span the database across multiple virtual disks for maximum performance.

The following post shows how to do this in detail:

Splitting SQL datafiles across multiple VMDKs for optimal VM performance

If the DBA is not confident doing this, you can also just add multiple virtual disks (connected via multiple PVSCSI controllers) and create an in-guest stripe (via Disk Management), which will also give you the benefit of multiple vdisks.

Related Articles:

1. Peak Performance vs Real World Performance

2. Enterprise Architecture & Avoiding tunnel vision

3. Microsoft Exchange 2013/2016 Jetstress Performance Testing on Nutanix Acropolis Hypervisor (AHV)

How to successfully Virtualize MS Exchange – Part 17 – Virtual Machine Storage Configuration

In addition to Part 16, where we discussed Virtual Disk Provisioning options and recommendations, in this part we will cover how to optimally configure a Virtual Machine for an Exchange MBX/MSR workload from a virtual storage controller perspective.

Once you have made the decision on storage platform, and assuming you have chosen to use VMFS or NFS datastores (and not iSCSI in-Guest or RDMs), then this article is for you.

Virtual Machines, just like physical servers, have SCSI controllers (albeit virtual), and ESXi has a number of options to choose from, which include:

1. BusLogic Parallel
2. LSI Logic Parallel
3. LSI Logic SAS
4. Paravirtual SCSI (PVSCSI)
5. AHCI SATA Controller

When creating a new virtual machine, the default adapter for Windows 2008 and 2012 is “LSI Logic SAS” because Windows does not have the PVSCSI driver by default.

BusLogic Parallel and LSI Logic Parallel adapters are not recommended for Windows 2008/2012 as they are legacy controllers with lower performance. As such, I will not cover these in any more detail as they are irrelevant to Exchange deployments.

Instead I will cover the LSI Logic SAS, AHCI SATA Controller and Paravirtual SCSI (PVSCSI) adapters.

Starting with LSI Logic SAS:

This is the default controller for Windows 2008/2012 VMs and, as a result, it is very common to see Exchange deployments using this controller. It has good performance and works out of the box with a Windows install without requiring drivers.

Advantages:

1. The default Controller for Windows 2008/2012
2. No need for manually inserting drivers to install Windows
3. Higher performance than AHCI SATA controller

Disadvantages:

1. Lower performance than PVSCSI
2. Higher CPU overheads in Guest compared to PVSCSI
3. Higher latency than PVSCSI
4. Lower maximum number of VMDKs supported per controller (15) compared to AHCI SATA (30)

Next let’s discuss the AHCI SATA Controller.

The AHCI SATA controller is new in vSphere 5.5 and is only supported in Virtual Machines with Hardware version 10. The SATA controller can be used on its own or in addition to LSI or PVSCSI controllers to provide additional VMDKs / Capacity, which increases a single VM’s maximum capacity from ~3.7PB to over 11PB.
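These capacity figures follow from simple multiplication. The sketch below assumes the 62TB maximum VMDK size of vSphere 5.5, four SCSI controllers with 15 disks each and four SATA controllers with 30 disks each, and treats 1PB as 1000TB. The constant and function names are mine.

```python
MAX_VMDK_TB = 62        # largest single VMDK from vSphere 5.5
SCSI_CONTROLLERS = 4
DISKS_PER_SCSI = 15
SATA_CONTROLLERS = 4
DISKS_PER_SATA = 30


def max_capacity_tb(controllers: int, disks_per_controller: int,
                    vmdk_tb: int = MAX_VMDK_TB) -> int:
    """Total capacity if every slot holds a maximum-size VMDK."""
    return controllers * disks_per_controller * vmdk_tb


scsi_tb = max_capacity_tb(SCSI_CONTROLLERS, DISKS_PER_SCSI)
sata_tb = max_capacity_tb(SATA_CONTROLLERS, DISKS_PER_SATA)

print(f"SCSI only:   {scsi_tb / 1000:.2f} PB")
print(f"SCSI + SATA: {(scsi_tb + sata_tb) / 1000:.2f} PB")
```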

Advantages:

1. Can support 30 VMDKs per Controller (120 total) compared to 15 for LSI / PVSCSI
2. Can be used in addition to PVSCSI controllers to provide more storage performance and capacity per Exchange VM (if required)
3. Higher capacity supported per controller than LSI Logic / PVSCSI

Disadvantages:

1. Higher CPU utilization per IO compared to LSI / PVSCSI options
2. Lower overall performance compared to LSI and PVSCSI
3. Higher latency compared to LSI and PVSCSI

And Finally the Paravirtual SCSI Controller.

The PVSCSI controller is the highest performing controller. It has been supported since ESXi 4.0, is designed for high performance storage environments and is available for virtual machines running hardware version 7 and later.

Advantages:

1. Performance, Performance, Performance. Oh yeah, and did I mention performance?
2. Lower Latency and Higher IOPS compared to other controllers
3. Lower CPU overhead on the Guest OS (and therefore ESXi)
4. More CPU is available for Exchange due to lower CPU overheads

Disadvantages:

1. Windows Failover Clustering is not supported, but this has no impact on MS Exchange including DAG deployments.
2. PVSCSI is not the default and requires inserting drivers into the Windows installation OR the VM to be built on LSI Logic SAS and once VMware Tools is installed, swapping to PVSCSI.
3. Lower maximum VMDKs supported per controller (15) compared to AHCI SATA (30)

Performance Comparison

From a performance perspective, Michael Webster (VCDX#66) wrote this great post “VMware vSphere 5.5 Virtual Storage Adapter Performance” and produced the following graph showing a comparison between SATA, LSI Logic SAS and PVSCSI controllers from an IOPS and latency perspective.

VMware-vSphere-5.5-Virtual-Storage-Adapter-Performance

As we can see, the PVSCSI adapter has significantly lower latency and higher IOPS than the SATA and LSILogic SAS controllers even when running on the same underlying storage.

While the Microsoft Exchange team has successfully reduced I/O across the versions (2007-2013), the performance advantages of PVSCSI also have a positive effect on vCPU utilization.

Michael’s post states:

It (PVSCSI Controller) also had the lowest CPU usage. During the 32 OIO test SATA showed 52% CPU utilization vs 45% for LSI Logic SAS and 33% for PVSCSI.

What this means is that less CPU is used for I/O, and lower average latency means more CPU is available for MS Exchange along with less CPU WAIT time (where the CPU is waiting for IO to complete before continuing). This means you’re onto a winner, especially considering Exchange 2013 is very CPU intensive.
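To put the quoted figures in perspective, the relative CPU saving can be worked out directly from the three percentages in Michael's 32 OIO test. This is just arithmetic on the numbers quoted above; the helper name is my own.

```python
# CPU utilization (percent) during the 32 OIO test, as quoted above.
cpu_at_32_oio = {"SATA": 52, "LSI Logic SAS": 45, "PVSCSI": 33}


def relative_saving(baseline_pct: float, improved_pct: float) -> float:
    """Fraction of the baseline CPU usage that is saved."""
    return (baseline_pct - improved_pct) / baseline_pct


for name in ("SATA", "LSI Logic SAS"):
    saving = relative_saving(cpu_at_32_oio[name], cpu_at_32_oio["PVSCSI"])
    print(f"PVSCSI uses {saving:.0%} less CPU than {name}")
```

In other words, in that test PVSCSI needed roughly a quarter less CPU than LSI Logic SAS and over a third less than SATA for the same workload.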

Which Controller should be used for Exchange VMs?

VMware have published the KB article “Do I choose the PVSCSI or LSI Logic virtual adapter on ESX\ESXi 4.0 for non-IO intensive workloads? (1017652)” which in summary explains:

The test results show that PVSCSI is better than LSI Logic, except under one condition–the virtual machine is performing less than 2,000 IOPS and issuing greater than 4 outstanding I/Os. This issue is fixed in vSphere 4.1 and later version, so that the PVSCSI virtual adapter can be used with good performance, even under this condition.

 

Aside from the one caveat prior to vSphere 4.1, where LSI Logic can outperform PVSCSI, there are no significant downsides to using PVSCSI compared to LSI. As such, I recommend always using (multiple) PVSCSI adapters.

Now that we have decided on the PVSCSI adapter, what’s next?

As with physical servers, Virtual SCSI controllers including PVSCSI have their limits in terms of performance and scalability. To ensure maximum scalability, performance and low latency, multiple PVSCSI adapters should be used with all VMDKs evenly spread over the PVSCSI adapters as recommended in Part 11.

To do this, when adding a VMDK to the Exchange VM, ensure you select a different SCSI controller (which are created automatically on demand) by using the drop down box “Virtual Device Node” and selecting for example SCSI (1:0) as shown below.

MSRVMPVSCSI10

For the next VMDK, select SCSI (2:0) as shown below.

MSRVMPVSCSI20

And then SCSI (3:0)

MSRVMPVSCSI30

For the fourth VMDK, select SCSI (0:1), because SCSI (0:0) is taken by the VMDK used for the guest OS.

MSRVMPVSCSI01

Repeat the above process until you have sufficient VMDKs for your Exchange server VM.
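The selection order above is a simple round-robin across the four controllers, and can be sketched in a few lines. This is an illustration rather than a tool: the function name is hypothetical, and it assumes four controllers with 15 usable slots each, skipping unit 7, which is reserved for the virtual SCSI controller itself.

```python
RESERVED_UNIT = 7          # SCSI (x:7) belongs to the controller itself
SLOTS_PER_CONTROLLER = 15


def data_disk_nodes(n_disks: int, controllers: int = 4) -> list[str]:
    """Virtual device nodes for data VMDKs, assigned round-robin.

    SCSI (0:0) is left for the guest OS disk, so the first data disk
    lands on SCSI (1:0), the second on SCSI (2:0), and so on.
    """
    capacity = controllers * SLOTS_PER_CONTROLLER - 1  # minus the OS disk
    if not 0 <= n_disks <= capacity:
        raise ValueError(f"n_disks must be between 0 and {capacity}")
    nodes = []
    for slot in range(1, n_disks + 1):      # slot 0 is the OS disk
        controller = slot % controllers
        row = slot // controllers
        unit = row if row < RESERVED_UNIT else row + 1
        nodes.append(f"SCSI ({controller}:{unit})")
    return nodes


for number, node in enumerate(data_disk_nodes(16), start=1):
    print(f"Data VMDK {number:2d}: {node}")
```

With 16 data VMDKs every controller ends up with four data disks, and controller 0 additionally carries the OS disk.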

The following illustrates my recommended configuration showing how to configure a VM supporting 8 database drives and 8 log drives.

PVSCSIVMDKs

The above configuration will ensure maximum storage performance and can be expanded in the same configuration to support more than 3 times the number of databases + logs shown above and as such it is suitable for even very large (scale-up) Exchange MBX/MSR VMs.

For example, if each VMDK in the above configuration was just 4TB in size it would give you 64TB usable capacity, and the VM could still be scaled to more than 3x the number of VMDKs.

Note: VMDKs can scale to 62TB (from vSphere 5.5) each although this may result in reduced performance.
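The numbers in this example are easy to verify. The sketch below assumes the configuration described above (16 data VMDKs of 4TB each plus one OS disk) and four PVSCSI controllers with 15 disks each; the variable names are mine.

```python
data_vmdks = 16          # 8 database + 8 log drives
vmdk_size_tb = 4
total_slots = 4 * 15     # four PVSCSI controllers, 15 disks each
os_disks = 1

usable_tb = data_vmdks * vmdk_size_tb
headroom = (total_slots - os_disks) / data_vmdks

print(f"{usable_tb} TB across {data_vmdks} data VMDKs")
print(f"Room for {headroom:.2f}x as many data VMDKs")
```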

TIP: Don’t forget to spread VMDKs evenly across datastores as per the recommendation in Part 11.

Recommendations for Exchange VM Storage Configuration:

1. Use multiple Paravirtual SCSI (PVSCSI) Adapters.
2. Use one VMDK per Database or Logs
3. Spread VMDKs evenly across multiple PVSCSI adapters
4. Spread VMDKs evenly across multiple datastores when using VMFS datastores
5. Spread VMDKs evenly across multiple datastores when using NFS datastores ensuring NFS datastores are served via multiple NAS controllers
6. Use more VMDKs as opposed to fewer larger VMDKs
7. Format NTFS volumes with an Allocation Unit Size of 64k
8. Keep it simple, do not mix virtual SCSI controller types.

Back to the Index of How to successfully Virtualize MS Exchange.

Integrity of I/O for VMs on NFS Datastores – Part 1 – Emulation of the SCSI Protocol

This is the first of a series of posts covering how the Integrity of I/O is ensured for Virtual Machines when writing to VMDK/s (Virtual SCSI Hard Drives) running on NFS datastores presented via VMware’s ESXi hypervisor as a “Datastore”.

Note: To be crystal clear, this post is not talking about presenting NFS direct to Windows or any other guest operating system.

This process is patented (US7865663) by VMware and its inventors and on the patent the process is called “SCSI Protocol Emulation”.

This series will first cover the topics in a vendor agnostic manner, meaning I am talking in general about VMware + any NFS storage on the VMware HCL with NFS support.

Following the vendor agnostic posts, I will follow with a series of posts focusing specifically on Nutanix, as the motivation for the series was to cover off this topic for existing or potential Nutanix customers, some of whom are less familiar with NFS and have asked for clarification, especially around virtualizing Business Critical Applications (vBCA) such as Microsoft SQL and Exchange.

The diagram below shows how storage can be presented to an ESXi host and what this series will focus on.

A VM accesses its .vmx and .vmdk file/s via a datastore the same way, regardless of the underlying storage protocol (DAS SCSI, iSCSI, NFS, FCP).

GUID-AD71704F-67E4-4AC2-9C22-10B531755566-high

In the case of NFS datastores, SCSI protocol emulation is used to allow the Guest Operating System (OS) and application/s to read and write via SCSI even when the underlying storage (which is abstracted by the hypervisor) is served via NFS which does not natively support the same commands.

Image Source: https://pubs.vmware.com/vsphere-50/index.jsp?topic=%2Fcom.vmware.vsphere.introduction.doc_50%2FGUID-2E7DB290-2A07-4F54-9199-B68FCB210BBA.html

In the following section, and throughout this series, many images shown are from the patent (US7865663) and are the property of the patent owners, not the author of this article.

The areas which I will be focusing on are the ones where there has been the most concern in the industry, especially for business critical applications such as Microsoft SQL and Microsoft Exchange: namely, how the VM operating system and application/s (and their data integrity) are impacted when issuing commands while the storage is abstracted by the hypervisor and served via NFS, which does not have I/O commands equivalent to those of SCSI.

Some example areas of concern around the industry for VMs running on datastores backed by NFS are:

1. SCSI Aborts / Resets
2. Forced Unit Access (FUA) & Write Through
3. Write Ordering
4. Torn I/O (Writes + Reads)

In this first part, we will look at the SCSI Protocol Emulation process and discuss SCSI Aborts and Resets and how the SCSI protocol emulation process deals with these.

Below is a diagram showing the flow of an I/O request for a VM writing SCSI commands to a VMDK (formatted as NTFS) through the SCSI emulation process and through to the NFS storage.

US07865663-20110104-D00005

The first few steps are, in my opinion, fairly self-explanatory. Where it gets interesting for me, and one of the points of contention among I.T. professionals (being SCSI aborts), is described in the box labelled “550”.

If the SCSI command is an abort (which has no equivalent in the NFS protocol), the SCSI emulation process removes the corresponding request from the virtual SCSI request list created in the previous step (box labelled “540”).

The same is true if the SCSI command is a reset (which also has no equivalent in the NFS protocol): the SCSI emulation process removes the corresponding request from the virtual SCSI request list. This process is shown below in the box labelled “560”.

US07865663-20110104-D00006

Next, let’s look at what happens if the SCSI “abort” or “reset” command is issued after the SCSI emulation process has passed the command on to the storage, and the emulator is now receiving a reply to a command which the Guest OS / Application has aborted.

It’s quite simple: the SCSI emulation process receives a reply from the NFS server, looks up the corresponding tag in the Virtual SCSI request list, and because this corresponding tag does not exist, the emulator drops the reply, thereby emulating a SCSI abort command.

The process is shown below from the box labelled “710” to “720”, finishing at “730”.

US07865663-20110104-D00007

In the patent, the above process is summed up nicely in the following paragraph.

Accordingly, a faithful emulation of SCSI aborts and resets, where the guest OS has total control over which commands are aborted and retried can be achieved by keeping a virtual SCSI request list of outstanding requests that have been sent to the NFS server. When the response to a request comes back, an attempt is made to find a matching request in the virtual SCSI request list. If successful, the matching request is removed from the list and the result of the response is returned to the virtual machine. If a matching request is not found in the virtual SCSI request list, the results are thrown away, dropped, ignored or the like.
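The request-list bookkeeping described in that paragraph can be modelled in a few lines. This is a toy sketch of the idea only, not VMware's implementation: the class and method names are invented for illustration, requests are plain strings, and I have assumed that a reset discards every outstanding request.

```python
class ScsiEmulator:
    """Toy model of a virtual SCSI request list (names are illustrative)."""

    def __init__(self):
        self._pending = {}              # tag -> request description

    def send(self, tag: int, request: str) -> None:
        """Record a request that has been forwarded to the NFS server."""
        self._pending[tag] = request

    def abort(self, tag: int) -> None:
        """Guest aborted a command: forget it, since NFS has no abort."""
        self._pending.pop(tag, None)

    def reset(self) -> None:
        """Guest reset the device: forget all outstanding commands
        (an assumption made for this sketch)."""
        self._pending.clear()

    def on_reply(self, tag: int, result: str):
        """Pass a reply to the guest, or drop it if no request matches."""
        if tag not in self._pending:
            return None                 # aborted or reset: reply is ignored
        del self._pending[tag]
        return result


emu = ScsiEmulator()
emu.send(1, "WRITE lba=100")
emu.send(2, "READ lba=200")
emu.abort(1)                            # guest gives up on request 1

print(emu.on_reply(1, "ok"))            # late reply for the aborted request
print(emu.on_reply(2, "data"))          # normal reply, returned to the guest
```

The key point is that nothing is ever sent to the NFS server to cancel the I/O. The emulator simply stops caring about the answer, which from the guest's point of view is indistinguishable from a successful abort.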

So there we have it, that is how VMware’s patented SCSI Protocol emulation allows SCSI commands not supported natively by NFS to be honoured, therefore allowing applications dependent on block based storage to be run successfully within a VM whose VMDK is backed by NFS storage.

Let’s recap what we have learned so far.

1. The SCSI Commands, abort & reset have no equivalent in the NFS protocol.
2. The VMware SCSI Emulation process handles SCSI commands not supported natively by NFS thanks to the Virtual SCSI Request List.
3. Guest Operating Systems and Applications running in Virtual Machines on ESXi issue native SCSI commands to the NTFS volume, which is presented to the VM via a VMDK and housed on an NFS datastore.
4. The underlying NFS protocol is not exposed to the Guest OS, Application/s or Virtual Machine.
5. The SCSI Commands, abort & reset are emulated by the hypervisor through removing these requests from the Virtual SCSI request list.

In part two, I will discuss Forced Unit Access (FUA) & Write Through.

Integrity of Write I/O for VMs on NFS Datastores Series

Part 1 – Emulation of the SCSI Protocol
Part 2 – Forced Unit Access (FUA) & Write Through
Part 3 – Write Ordering
Part 4 – Torn Writes
Part 5 – Data Corruption

Nutanix Specific Articles

Part 6 – Emulation of the SCSI Protocol (Coming soon)
Part 7 – Forced Unit Access (FUA) & Write Through (Coming soon)
Part 8 – Write Ordering (Coming soon)
Part 9 – Torn I/O Protection (Coming soon)
Part 10 – Data Corruption (Coming soon)

Related Articles

1. What does Exchange running in a VMDK on NFS datastore look like to the Guest OS?
2. Support for Exchange Databases running within VMDKs on NFS datastores (TechNet)
3. Microsoft Exchange Improvements Suggestions Forum – Exchange on NFS/SMB
4. Virtualizing Exchange on vSphere with NFS backed storage?