SQL & Exchange performance in a Virtual Machine

The below is something I see far too often: a SQL or Exchange virtual machine using a single LSI Logic SAS virtual SCSI controller.

[Image: A VM configured with a single LSI Logic SAS controller and a single virtual disk]

What is even worse is a virtual machine using a single LSI controller and a single virtual disk for one or more databases and logs (as shown above).

Why is this so common?

Probably because the LSI Logic SAS controller is the default for Windows 2008/2012 virtual machines and additional SCSI controllers are not automatically added until you have more than 16 virtual disks for a single VM.

Why is this a problem?

The LSI controller has a queue depth limit of 128, compared to the default limit of 256 for PVSCSI, which can also be tuned up to 1024 for higher performance requirements.

As a result, a configuration with a single LSI controller and/or a limited number of virtual disks can artificially constrain the underlying storage, preventing it from delivering the performance it is capable of.

Another problem with the LSI controller is that it uses more CPU than the PVSCSI controller for the same IO levels. This means you’re unnecessarily wasting virtual machine (and underlying host) CPU resources.

Using more CPU can also lead to other problems, such as CPU Ready, which in turn reduces performance.
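On the PVSCSI tuning mentioned above: if you do need to push the adapter queue depth towards 1024, VMware documents in-guest registry parameters for the pvscsi driver. The following is a minimal Python sketch (run inside the Windows guest as Administrator and reboot afterwards); the key path and values reflect the commonly documented settings, so treat it as illustrative and verify against VMware's current guidance for your driver version.

    # Sketch only: raise PVSCSI queue depths inside a Windows guest by setting
    # the pvscsi driver parameters in the registry (reboot required).
    # RequestRingPages=32 corresponds to the larger adapter queue depth and
    # MaxQueueDepth=254 is the per-device maximum per VMware's documentation.
    import winreg

    KEY_PATH = r"SYSTEM\CurrentControlSet\services\pvscsi\Parameters\Device"

    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        winreg.SetValueEx(key, "DriverParameter", 0, winreg.REG_SZ,
                          "RequestRingPages=32,MaxQueueDepth=254")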

A colleague and friend of mine, Michael Webster, wrote a great post titled Performance Issues Due To Virtual SCSI Device Queue Depths, where he shows the performance difference between SATA, LSI and PVSCSI controllers. I highly recommend having a read of it.

What is the solution?

Using multiple Paravirtual (PVSCSI) adapters, with virtual disks evenly spread over the four controllers, is a no-brainer for Windows virtual machines!

This results in:

  1. Higher default queue depth
  2. Lower CPU overheads
  3. Higher potential performance

How do I configure this?

It’s fairly straightforward, but don’t just change the LSI controller to PVSCSI, as the guest OS may not have the driver installed, which will result in the VM failing to boot.

To avoid this, simply edit the virtual machine, add a new virtual disk of any size, select SCSI (1:0) as the virtual device node, and follow the prompts.

[Image: Adding a new virtual disk on virtual device node SCSI (1:0)]

Once the new virtual disk is added, you should see that a new LSI Logic SAS SCSI controller has been added, as shown below.

[Image: The new LSI Logic SAS SCSI controller added along with the new virtual disk]

Next, highlight the adapter, select “Change Type” in the top right hand corner of the window and choose Paravirtual. Once this is complete you should see something similar to the below:

[Image: The new controller after changing its type to VMware Paravirtual]

Next, hit “OK” and the new controller and virtual disk will be added to the VM.
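As an aside, if you prefer scripting to clicking through the GUI, the same outcome (a new PVSCSI controller on bus 1 with a virtual disk at SCSI (1:0)) can be achieved via the vSphere API, for example with pyVmomi. The sketch below is illustrative only: it assumes you already have a connected ServiceInstance and a vm managed object, and the function name, temporary device key and disk size are arbitrary. Note that via the API the Paravirtual controller can be added directly, so there is no need to add an LSI controller first and change its type.

    # Rough pyVmomi sketch: add a PVSCSI controller on SCSI bus 1 plus a small
    # thin-provisioned disk at SCSI (1:0) in a single reconfigure task.
    from pyVmomi import vim

    def add_pvscsi_controller_and_disk(vm, size_gb=1):
        spec = vim.vm.ConfigSpec()

        # New PVSCSI controller on bus 1 (negative key = "new device in this spec")
        ctrl_spec = vim.vm.device.VirtualDeviceSpec()
        ctrl_spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.add
        ctrl = vim.vm.device.ParaVirtualSCSIController()
        ctrl.key = -101
        ctrl.busNumber = 1
        ctrl.sharedBus = vim.vm.device.VirtualSCSIController.Sharing.noSharing
        ctrl_spec.device = ctrl

        # New disk attached to that controller at unit 0, i.e. SCSI (1:0)
        disk_spec = vim.vm.device.VirtualDeviceSpec()
        disk_spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.add
        disk_spec.fileOperation = vim.vm.device.VirtualDeviceSpec.FileOperation.create
        disk = vim.vm.device.VirtualDisk()
        disk.backing = vim.vm.device.VirtualDisk.FlatVer2BackingInfo()
        disk.backing.diskMode = "persistent"
        disk.backing.thinProvisioned = True
        disk.capacityInKB = size_gb * 1024 * 1024
        disk.controllerKey = ctrl.key      # reference the new controller above
        disk.unitNumber = 0
        disk_spec.device = disk

        spec.deviceChange = [ctrl_spec, disk_spec]
        return vm.ReconfigVM_Task(spec=spec)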

Now open the console of the VM, open Computer Management and go to Device Manager. Under Storage Controllers you should now see the VMware PVSCSI Controller, as shown below.

[Image: VMware PVSCSI Controller listed under Storage Controllers in Device Manager]

Now it is safe to shut down the VM.

Once the VM is shut down, edit the VM settings, highlight SCSI Controller 0, select Change Type as we did earlier and choose Paravirtual. Once this is done you will see the original controller has been replaced with a new controller.

[Image: SCSI Controller 0 changed from LSI Logic SAS to VMware Paravirtual]

Now that the boot drive has been changed to PVSCSI, we can balance the data drives across up to four PVSCSI controllers for maximum performance.

To do this, simply highlight a virtual disk, drop down the Virtual Device Node menu and select SCSI (1:0) or any other available slot on the SCSI (1:x) controller.

[Image: Changing a virtual disk’s Virtual Device Node to SCSI (1:0)]

After doing this you will see new SCSI controllers appear, and you need to change these to Paravirtual as we did with the first controller.

[Image: Multiple virtual disks spread across the additional SCSI controllers]
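This re-parenting of existing disks can also be scripted. The hypothetical pyVmomi helper below (same assumptions as the earlier sketch, plus a powered-off VM and PVSCSI controllers that already exist) moves a disk, identified by its label such as “Hard disk 3”, to a chosen bus and unit number.

    # Rough pyVmomi sketch: move an existing virtual disk to a given bus/unit
    # on an existing PVSCSI controller using an "edit" device spec.
    from pyVmomi import vim

    def move_disk(vm, disk_label, bus, unit):
        devices = vm.config.hardware.device

        # Find the target PVSCSI controller by its bus number
        ctrl = next(d for d in devices
                    if isinstance(d, vim.vm.device.ParaVirtualSCSIController)
                    and d.busNumber == bus)

        # Find the disk by its label, e.g. "Hard disk 3"
        disk = next(d for d in devices
                    if isinstance(d, vim.vm.device.VirtualDisk)
                    and d.deviceInfo.label == disk_label)

        disk.controllerKey = ctrl.key
        disk.unitNumber = unit

        change = vim.vm.device.VirtualDeviceSpec()
        change.operation = vim.vm.device.VirtualDeviceSpec.Operation.edit
        change.device = disk

        return vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change]))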

For each of the virtual disks, ensure they are placed evenly across the PVSCSI controllers. For example, if you have a VM with eight virtual disks plus the OS disk, it should look like this:

Virtual Disk 1 (OS) : SCSI (0:0)
Virtual Disk 2 (Data) : SCSI (0:1)
Virtual Disk 3 (Data) : SCSI (1:0)
Virtual Disk 4 (Data) : SCSI (1:1)
Virtual Disk 5 (Data) : SCSI (2:0)
Virtual Disk 6 (Data) : SCSI (2:1)
Virtual Disk 7 (Data) : SCSI (3:0)
Virtual Disk 8 (Data) : SCSI (3:1)
Virtual Disk 9 (Data) : SCSI (0:2)

This results in two data virtual disks per PVSCSI controller, which evenly distributes IO across all controllers, the only exception being that the first controller (SCSI 0) also hosts the OS drive.
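If you are scripting the layout, the spread above is easy to compute. The small sketch below simply round-robins the data disks across four controllers, reserving SCSI (0:0) for the OS disk and skipping unit 7 (which the controller itself occupies); the ordering differs slightly from the table but the two-data-disks-per-controller balance is identical.

    # Compute an even spread of data disks across four PVSCSI controllers,
    # with the OS disk on SCSI (0:0) and unit 7 skipped on every bus.
    def layout(num_data_disks, controllers=4):
        placements = [("OS disk", "SCSI (0:0)")]
        next_unit = [1, 0, 0, 0]          # unit 0 on bus 0 is taken by the OS disk
        for i in range(num_data_disks):
            bus = (i + 1) % controllers   # start the data disks on SCSI 1
            unit = next_unit[bus]
            if unit == 7:                 # unit 7 is reserved for the controller
                unit += 1
            next_unit[bus] = unit + 1
            placements.append((f"Data disk {i + 1}", f"SCSI ({bus}:{unit})"))
        return placements

    for name, node in layout(8):
        print(f"{name}: {node}")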

What if I have problems?

On occasion I have seen problems with this process which have resulted in VMs not booting; however, these issues are easy to fix.

If your VM fails to boot with a message like “Operating System not found”, I suggest you panic! Just kidding; this typically just means the boot order of the virtual machine has been scrambled. Go into the BIOS and check that the boot order shows the PVSCSI controller and has the correct virtual disk as first priority.

If the VM boots but blue screens (BSOD), or crashes and goes into a continuous reboot loop, power off the VM and set the first SCSI controller (where the boot disk is attached) back to LSI. Then boot the VM and make sure the PVSCSI driver is showing up (if it’s not, you didn’t follow the above instructions, so go back and follow them until the PVSCSI driver is loaded and working). Then shut down and change the SCSI controller back to PVSCSI and you should be fine.

If the VM boots and one or more drives do not show up in My Computer, go into Disk Management and you may see the drives are marked as offline. Simply right click each drive, mark it as online, reboot, and you’re good to go.

Summary:

If you have made the intelligent move to virtualize your business critical applications, firstly, congratulations! However, as with physical hardware, virtual machines also have optimal configurations, so make sure you use PVSCSI controllers with multiple virtual disks and have your DBA span the database across multiple virtual disks for maximum performance.

The following post shows how to do this in detail:

Splitting SQL datafiles across multiple VMDKs for optimal VM performance

If the DBA is not confident doing this, you can also just add multiple virtual disks (connected via multiple PVSCSI controllers) and create a stripe in-guest (via Disk Management), which will also give you the benefit of multiple vdisks.
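If you go down the in-guest stripe path, that too can be scripted. The rough sketch below feeds diskpart a script which converts the example data disks to dynamic and creates a single striped, formatted volume across them. The disk numbers, drive letter, label and 64K allocation unit size are examples only (check them in Disk Management first), the disks are assumed to be already initialised, and the script must be run as Administrator.

    # Rough sketch: build a diskpart script that stripes the data disks into one
    # volume, then run it. Adjust DATA_DISKS, the letter and the label to suit.
    import subprocess
    import tempfile

    DATA_DISKS = [1, 2, 3, 4]   # example disk numbers as shown in Disk Management

    lines = []
    for disk in DATA_DISKS:
        lines += [f"select disk {disk}",
                  "online disk noerr",
                  "attributes disk clear readonly noerr",
                  "convert dynamic"]
    lines += ["create volume stripe disk=" + ",".join(str(d) for d in DATA_DISKS),
              "format fs=ntfs unit=64K quick label=SQLData",
              "assign letter=E"]

    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("\n".join(lines))
        script = f.name

    subprocess.run(["diskpart", "/s", script], check=True)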

Related Articles:

1. Peak Performance vs Real World Performance

2. Enterprise Architecture & Avoiding tunnel vision

3. Microsoft Exchange 2013/2016 Jetstress Performance Testing on Nutanix Acropolis Hypervisor (AHV)

What’s .NEXT 2016 – ESXi Management from PRISM

Nutanix has always been designed to be hypervisor agnostic, and as the first and only HCI platform on the market to support VMware ESXi, Microsoft Hyper-V and KVM/AHV, we do not want to tie the management interface to any one platform.

In my opinion this is a huge advantage for many reasons including:

  1. Being able to provide a consistent management interface across hypervisors
  2. Not being dependent on 3rd party management components/interfaces
  3. Building our platform’s management layer (PRISM) in a fully distributed and highly available manner.

However, while the Nutanix AHV solution is entirely managed by PRISM, customers who run ESXi still experience pain from having to use the vSphere Web Client for some tasks.

This is why it was important for Nutanix to provide the ability to perform ESXi VM operations from PRISM (as shown below), while also providing the ability to centrally manage all Nutanix clusters running ESXi or AHV, to minimise the dependency on the vSphere Web Client.

[Image: ESXi VM operations being performed from PRISM]

At this stage the new ESXi management capabilities of PRISM do not remove the requirement for vCenter, so this is still a single point of failure for vSphere customers.

Enhanced functionality to further reduce the dependency on the vSphere Web Client is also expected in a future release, but for now, having many of the day-to-day VM operations available in PRISM will make life easier for vSphere customers.

Related .NEXT 2016 Posts

What’s .NEXT 2016 – Acropolis File Services (AFS)

At .NEXT 2015 Nutanix announced the Scale-out File Server Tech Preview, which was supported for AHV environments only. With the imminent release of AOS 4.7, the Scale-out File Server has been renamed Acropolis File Services (AFS) and will now be GA for both AHV and ESXi.

AFS provides what I personally refer to as an “invisible” file server experience because it can be set up with just a few clicks in PRISM, without the need to deploy operating systems.

AFS provides a highly available and distributed single namespace across 3 or more front-end VMs, which are automatically deployed and maintained by ADSF. The below shows a mixed cluster of 10 nodes, made up of 8 x NX3060 and 2 x NX6035C nodes, with the AFS UVMs spread across the cluster.

[Image: AFS overview: AFS UVMs spread across a mixed 10 node cluster of 8 x NX3060 and 2 x NX6035C nodes]

Data is then stored on the underlying Acropolis Distributed Storage Fabric (ADSF) in a container which can be configured with your desired level of resiliency (e.g. RF2 or RF3) as well as data reduction features such as Compression, Deduplication and Erasure Coding.

AFS inherits all of the resiliency that ADSF natively provides and supports operational tasks such as one-click rolling upgrades of AOS and hypervisor without impacting the availability of the file services.

Functionality

Backups

Nutanix will provide AFS with native support for local recovery points on the primary storage (cluster) and will support both Async-DR (60 min RPO) and Sync-DR (0 RPO) to allow data to be backed up to a remote cluster.

For customers who employ 3rd party backup tools, AFS can also simply be backed up as an SMB share, which is a common capability amongst backup products such as Commvault and NetBackup.

The below shows, at a high level, what a 3rd party backup solution looks like with AFS.

[Image: High level view of a 3rd party backup solution with AFS]

Quotas

AFS also allows administrators to set quotas to help with capacity management, especially in multi-tenant or departmental deployments, to avoid users monopolising capacity in the environment.

Patching/Upgrades

Acropolis File Services can be upgraded and patched separately from AOS and the underlying hypervisor. This ensures that the version of AFS is not dependent on the AOS or hypervisor versions, which also makes QA easier and minimizes the chance of bugs, since the AFS layer is abstracted from the AOS and hypervisor.

This is similar to how the AOS version is not dependent on a hypervisor version, ensuring maximum flexibility and stability for customers. This means that as new features/improvements are added, AFS can be upgraded via PRISM without worrying about interoperability and dependencies.

Patches and upgrades are one-click, rolling and non-disruptive, the same as for AOS.

Scaling

As the file serving workload increases, Acropolis File Services can be scaled out by simply adding instances across which the workload is balanced. If the Nutanix cluster has more nodes than AFS instances, this can be done quickly and easily through PRISM.

If, for example, the cluster has 4 nodes and 4 AFS instances are already deployed, then to scale the performance of the AFS environment the UVMs’ vCPU/vRAM can be scaled up, OR additional nodes can be added to the cluster and the AFS instances scaled out.

When one or more additional AFS instances (UVMs) are added, the workload is automatically balanced across all UVMs in the environment. ADSF will also automatically balance the new and existing file server data across the ADSF cluster to ensure even capacity utilization across nodes as well as consistent performance and linear scaling.

So in short, AFS provides both scale up and scale out options.

Interoperability with Storage Only nodes

Acropolis File Services is fully supported in environments using storage-only nodes. As the storage-only nodes provide a Nutanix CVM and underlying storage to ADSF, their capacity and performance are made available to AFS just as they are to any other VM. The only requirement is 3 or more compute+storage nodes in the cluster to support the minimum of 3 AFS UVMs.

AFS deployment examples

Acropolis File Services can be deployed on existing Nutanix clusters, which allows file data to be co-located in the same storage pool as existing data from virtual machines, as well as from physical or virtual servers utilising Acropolis Block Services (ABS).

[Image: AFS deployed on an existing Nutanix cluster alongside VMs and ABS]

Acropolis File Services can also be deployed on dedicated clusters, such as storage-heavy and storage-only nodes, for environments which do not have virtual machines or for very large environments, while being centrally managed along with other Nutanix clusters via PRISM Central.

[Image: AFS deployed on a dedicated Nutanix cluster, centrally managed via PRISM Central]

Multi-tenancy

AFS also allows multiple separate instances to be deployed in the same Nutanix cluster to service different security zones, tenants or use cases. The following shows an example of a 4 node Nutanix cluster with two AFS instances (file servers): the first has 4 UVMs and the second has just 3 UVMs. Each instance can have different data reduction (Compression, Dedupe, EC-X) settings and can be scaled independently.
[Image: A 4 node Nutanix cluster running two AFS instances]

Summary:

  • AFS supports multiple hypervisors and is deployed in minutes from PRISM
  • Can be scaled both up and out to support more users, capacity and/or performance
  • Interoperable with all OEMs and node types including storage only
  • Supports non-disruptive one-click rolling upgrades
  • Supports multiple AFS instances on the one cluster for multi-tenancy and security zone support
  • Has native local recovery point support as well as remote backup (Sync and Async) support
  • All data is protected by the underlying ADSF
  • Supports all ADSF data reduction technologies including Compression, Dedupe and Erasure Coding.
  • Eliminates the requirement for a dedicated file-sharing silo
  • Capacity available to AFS is automatically expanded as nodes are added to the cluster.

Related .NEXT 2016 Posts