vMotion issues when using NFS storage with vSphere 5.5 Update 2

When performing a vMotion of a VM residing on an NFS datastore (specifically, when migrating its .vswp file), you may see the following error.

vMotion fails with the error: remote host IP_Address failed with status Busy

This issue originally occurred in vSphere 4.1 but appears to have reappeared in vSphere 5.5 Update 2.

Luckily there is a workaround for now, until VMware can investigate and resolve the problem.

The workaround is to modify the advanced setting “Migrate.VMotionResolveSwapType” from the default of 1 to 0 on both the source and destination hosts. If you want to solve this for your entire cluster, every host needs to be modified.

To modify the setting via the vSphere Client (a scripted alternative is sketched after these steps):
  1. Launch the vSphere Client and log in to your vCenter Server.
  2. Select the source ESX host and then click the Configuration tab.
  3. Click Software > Advanced Settings > Migrate.
  4. Under the Migrate options, locate the line containing Migrate.VMotionResolveSwapType. By default, it is set to 1.
  5. Change the value to 0.
  6. Click OK.
  7. Repeat Steps 2 to 6 for all hosts in the cluster.
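For larger clusters, the same change can be scripted rather than made host by host. Below is a minimal sketch using pyvmomi (the vSphere Python SDK); the vCenter address, credentials and certificate handling are placeholders, and the script simply applies the advanced option to every host it finds, so adjust the scope to suit your environment.

```python
# Minimal sketch: set Migrate.VMotionResolveSwapType = 0 on every host
# visible to vCenter. Assumes pyvmomi; the vCenter address and credentials
# below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only; validate certificates in production
si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local',
                  pwd='VMware1!',
                  sslContext=ctx)
try:
    content = si.RetrieveContent()
    host_view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in host_view.view:
        opt_mgr = host.configManager.advancedOption
        current = opt_mgr.QueryOptions('Migrate.VMotionResolveSwapType')[0]
        print('%s: Migrate.VMotionResolveSwapType = %s' % (host.name, current.value))
        # The option is an integer; 0 applies the workaround, 1 is the default.
        # Depending on the option's declared type you may need to cast the value.
        opt_mgr.UpdateOptions(changedValue=[
            vim.option.OptionValue(key='Migrate.VMotionResolveSwapType', value=0)])
finally:
    Disconnect(si)
```

The change can be reverted by setting the value back to 1 once VMware resolves the issue.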

The official VMware KB is below.

vMotion fails with the error: remote host IP_Address failed with status Busy (1031636)

Storage I/O Control (SIOC) configuration for Nutanix

Storage I/O Control (SIOC) is a feature introduced in VMware vSphere 4.1 which was designed to allow prioritization of storage resources during periods of contention across a vSphere cluster. This situation has often been described as the “Noisy Neighbor” issue where one or more VMs can have a negative impact on other VMs sharing the same underlying infrastructure.

For traditional centralized shared storage, enabling SIOC is a no-brainer, as even the default settings will ensure more consistent performance during periods of storage contention, with all but no downsides. SIOC does this by managing, and potentially throttling, the device queue depth based on the “Shares” assigned to each virtual machine, to ensure consistent performance across ESXi hosts.

The diagrams below show the impact on three (3) identical VMs with the same disk “Shares” values, with and without SIOC, in a traditional centralized storage environment (i.e. SAN/NAS).

Without Storage I/O Control

[Diagram]

With Storage I/O Control

[Diagram]

As shown above, without SIOC, VMs with equal share values that reside on different ESXi hosts can end up with an undesired result, with one VM having double the available storage queue compared to the VMs residing on the other host. In comparison, SIOC ensures that VMs with the same share value get equal access to the underlying storage queue.

While SIOC is an excellent feature, it was designed to address a problem which is no longer a significant factor with the Nutanix scale-out, shared-nothing architecture.

The issue of “noisy neighbour” or storage contention on the Storage Area Network (SAN) is all but eliminated, as all datastores (or “Containers” in Nutanix speak) are serviced by every Nutanix Controller VM in the cluster, and under normal circumstances upwards of 95% of read I/O is serviced by the local Controller VM. Nutanix refers to this feature as “Data Locality”.

Data Locality ensures data being written and read by a VM remains on the Nutanix node where the VM is running, reducing the latency of accessing data across a Storage Area Network and ensuring that a VM reading data on one node has minimal or no impact on VMs running on other nodes in the cluster.

Write I/O is also distributed throughout the Nutanix cluster, which means no single node is monopolized by (write) replication traffic.

Storage I/O Control was designed around the concept of a LUN or NFS mount (from vSphere 5.0 onwards) being served by a central storage controller, which has historically been the most typical deployment for VMware vSphere environments.

As such, by throttling the LUN queue depth, SIOC allows all VMs on the LUN either to receive an equal share of the available queue or, by specifying “Share” values on a per-VM basis, to be prioritized based on importance.

By default, all virtual hard disks have a share value of “Normal” (1000 shares). Therefore, if an individual VM needs to be given higher storage priority, its share value can be increased.

Note: In general, modifying VM Virtual disk share values should not be required.
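Should an individual VM genuinely need higher priority, the per-disk share value can also be changed programmatically rather than through the client. The following is a minimal sketch using pyvmomi; the vCenter details and the VM name “ImportantVM” are placeholders, and the “high” (2000 shares) level is used purely for illustration.

```python
# Minimal sketch: raise the storage I/O shares on every virtual disk of one VM.
# Assumes pyvmomi; vCenter details and the VM name 'ImportantVM' are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only
si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local',
                  pwd='VMware1!', sslContext=ctx)
try:
    content = si.RetrieveContent()
    vm_view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in vm_view.view if v.name == 'ImportantVM')  # hypothetical VM name
    changes = []
    for dev in vm.config.hardware.device:
        if isinstance(dev, vim.vm.device.VirtualDisk):
            # Default is 'normal' (1000 shares); 'high' corresponds to 2000 shares.
            dev.storageIOAllocation = vim.StorageResourceManager.IOAllocationInfo(
                shares=vim.SharesInfo(level=vim.SharesInfo.Level.high, shares=2000))
            changes.append(vim.vm.device.VirtualDeviceSpec(
                operation=vim.vm.device.VirtualDeviceSpec.Operation.edit, device=dev))
    WaitForTask(vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=changes)))
finally:
    Disconnect(si)
```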

As Nutanix has one storage controller VM (CVM) per node, all of which actively service I/O to the datastore, SIOC is not required and provides no benefit.

For more information about SIOC in traditional NAS environments, see: “Performance Implications of Storage I/O Control-Enabled NFS Datastores in VMware vSphere 5.0”.

As such, for Nutanix environments it is recommended that SIOC be disabled and that DRS “anti-affinity” or “VM to host” rules be used to separate high-I/O VMs.
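For clusters with many containers presented as datastores, the current SIOC state can be audited and disabled in one pass. The sketch below assumes pyvmomi and uses the vSphere API’s StorageResourceManager.ConfigureDatastoreIORM_Task; the vCenter details are placeholders, and it should be validated in a test environment before being run against production.

```python
# Minimal sketch: report the SIOC state of each datastore and disable it where enabled.
# Assumes pyvmomi; vCenter address/credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only
si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local',
                  pwd='VMware1!', sslContext=ctx)
try:
    content = si.RetrieveContent()
    ds_view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Datastore], True)
    for ds in ds_view.view:
        iorm = ds.iormConfiguration  # StorageIORMConfigInfo; may be None on some datastore types
        enabled = bool(iorm and iorm.enabled)
        print('%s: SIOC enabled = %s' % (ds.name, enabled))
        if enabled:
            # StorageIORMConfigSpec with enabled=False turns SIOC off for this datastore.
            spec = vim.StorageResourceManager.IORMConfigSpec(enabled=False)
            task = content.storageResourceManager.ConfigureDatastoreIORM_Task(
                datastore=ds, spec=spec)
            WaitForTask(task)
finally:
    Disconnect(si)
```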

Virtual Design Master 2015 Grand Prize!

It’s Virtual Design Master (vDM) competition time again, and this year will be bigger and better than ever, with Angelo Luciani, Eric Wright and Melissa Palmer leading the charge.

Last season I participated as a judge and found the experience very rewarding, so I decided that in 2015 I would sponsor the Grand Prize.

In considering what would make a good prize, I thought about home lab equipment such as a Synology NAS, but then settled on a much more valuable prize for a Virtual Design Master… VCDX mentoring, as well as the VCDX application and exam fees paid for!

So then the Virtual Design Master can truly become an industry recognized expert.

Let’s break it down:

The vDM will first need to complete the prerequisite certifications, currently known as VMware Certified Advanced Professional (VCAP) and soon to be renamed VMware Certified Implementation Expert (VCIX), in one of the certification streams, and then it will be time for the mentoring to begin.

This will include (but not necessarily be limited to):

1. Initial design review and feedback session (~2hrs)
2. Follow up design review and feedback session (~2hrs)
3. Initial VCDX defence PowerPoint preparation session (~2hrs)
4. Follow up VCDX defence PowerPoint preparation session (~2hrs)
5. Initial VCDX Mock panel including Defence, Design and Troubleshooting Scenarios.  (~4hrs)
6. Follow up VCDX Mock panel including Defence, Design and Troubleshooting Scenarios.  (~4hrs)

I would like to be clear: the mentoring is not aimed at teaching the vDM how to pass the VCDX, but at guiding the vDM to fully understand their architecture and design choices, along with the alternatives and implications. With these skills the vDM will have a much better chance of passing but, more importantly, will be a better architect/consultant as a result.

As a bonus prize, the vDM will also receive a copy of my upcoming book, which is yet to be announced.

I look forward to participating as a Judge again this season (time permitting) and can’t wait to work with the vDM to achieve his/her VCDX!

So what are you waiting for? Register here: http://www.virtualdesignmaster.com/register/

Best of luck to all the vDM contestants for 2015 from me and the rest of the vDM team!
