Acropolis: VM High Availability (HA)

This past week at Nutanix .NEXT, Acropolis was officially announced, although it has actually been available and running in many customer environments (1,200+ nodes globally) for a long time.

One of the new features is VM High Availability.

As with everything Nutanix, VM HA is a very simple yet effective feature. Let’s go through how to configure HA via the Acropolis/PRISM HTML 5 interface.

As shown below, the “Options” menu (represented by the cog) contains an option called “Manage VM High Availability”.

[Image: HAMenu]

The Manage VM High Availability dialog has two simple options, shown below:

1. Enable VM High Availability (On/Off)
2. Best Effort / Reserve Space

Best Effort works as you might expect: in the event of a node failure, VMs are powered on throughout the cluster if resources are available.

If resources (e.g. memory) are not available, then some or all VMs may not be powered on.

[Image: HAonBestEffort]

Reserve Space also works as you might expect, reserving enough compute capacity within the cluster to tolerate either one or two node failures: if RF2 is configured, one node is reserved; if RF3 is in use, two nodes are reserved.

Pretty simple, right?

[Image: HAonReserveSpace]
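As a minimal sketch of the Reserve Space logic just described, assuming the reserved capacity simply follows the cluster redundancy factor and a homogeneous cluster (the 256GB per-host figure is only an example):

```python
# Minimal sketch: Reserve Space sets aside enough capacity for RF - 1 host
# failures (RF2 -> one node's worth of capacity, RF3 -> two nodes' worth).
def hosts_reserved(redundancy_factor: int) -> int:
    return redundancy_factor - 1

def reserved_memory_gb(redundancy_factor: int, memory_per_host_gb: int) -> int:
    # Assumes a homogeneous cluster for simplicity.
    return hosts_reserved(redundancy_factor) * memory_per_host_gb

print(reserved_memory_gb(2, 256))  # RF2: 256 GB reserved (one node)
print(reserved_memory_gb(3, 256))  # RF3: 512 GB reserved (two nodes)
```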

The best part about Reserve Space is that it’s like “Host failures cluster tolerates” in vSphere, but without the potentially inefficient slot size algorithm.

Once HA is enabled, it appears on the Home screen of PRISM and gives a summary of the VMs which are On, Off and Suspended, as shown below.

[Image: HAHomeScreen]

HA can also be enabled or disabled on a per-VM basis via the VMs tab. Simply highlight the VM and click “Update”, as shown below.

[Image: VMHAupdate]

The “Update VM” popup will then appear; simply enable HA.

[Image: VMHA]

In the above screenshot, you can see that the popup also warns you if HA is disabled at the cluster level and lets you jump straight to the Manage VM High Availability configuration menu.

So there you have it, Acropolis VM High Availability, simple as that.

Related Articles:

1. Acropolis: Scalability
2. What’s .NEXT? – Acropolis!
3. What’s .NEXT? – Erasure Coding!


How to successfully Virtualize MS Exchange – Part 5 – High Availability (HA)

HA has two main configuration options which can significantly impact the availability and consolidation of any vSphere environment, and can have an even greater impact on Business Critical Applications such as MS Exchange.

Considering that MS Exchange MBX or MSR VMs can be very large in terms of vCPU and vRAM, understanding and choosing appropriate settings is critical to the success not only of the MS Exchange deployment but of any other VMs sharing the same HA cluster.

Let’s start with the “Admission Control Setting”.

Admission Control can be configured in either “Enabled” or “Disabled” mode. “Enabled” means that if powering on one or more VMs would breach the Admission Control Policy (discussed later in this post), those VMs will not be permitted to power on, which guarantees a minimum level of performance for the running VMs.

If the setting is “Disabled”, VMs will be powered on no matter what. This opens the possibility of significant contention for compute resources, which for MS Exchange MBX or MSR VMs would not be ideal.

As a result, it is my strong recommendation that the “Admission Control Setting” be set to “Enabled”.
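To make this concrete, below is a minimal pyVmomi sketch of enabling Admission Control programmatically; the vCenter address, credentials and the cluster name “Exchange-Cluster” are placeholders, and the same result can of course be achieved in the vSphere Client.

```python
# Minimal sketch: enable HA Admission Control on a cluster via pyVmomi.
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; validate certificates in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="VMware1!",
                  sslContext=ctx)
content = si.RetrieveContent()

# Locate the cluster by name using a container view.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Exchange-Cluster")

# Turn the "Admission Control Setting" on so VMs cannot be powered on past
# the reserved failover capacity.
spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(admissionControlEnabled=True))
cluster.ReconfigureComputeResource_Task(spec, modify=True)
```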

Next, let’s discuss the “Admission Control Policy”.

There are three policies to choose from (shown below), each with its pros and cons.

[Image: AdmissionControlPolicies]

1. Host failures the cluster tolerates

This option is the default and the most conservative. However, it calculates cluster capacity using what many describe as a very inefficient algorithm based on “slot sizes”.

A slot size is calculated by taking the largest VM from a vCPU perspective AND the largest VM from a vRAM perspective and combining the two. The cluster then calculates how many of these “slots” it can support.

The issue with this is that in environments with a range of VM sizes, a small VM of 1 vCPU and 1GB RAM uses one slot, as would a VM with 8 vCPU & 64GB RAM. This leads to a very low consolidation ratio, unnecessarily high numbers of ESXi hosts and underutilization.
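To see why, here is a toy example in Python (the VM sizes and the 256GB host are made up, and the real algorithm also considers CPU capacity and reservations): the slot is sized by the largest vCPU and vRAM figures in the cluster, so even tiny VMs consume an Exchange-sized slot.

```python
# Toy illustration of slot sizing in a mixed cluster (numbers are illustrative only).
vms = [
    {"name": "dc01",       "vcpu": 1, "ram_gb": 1},
    {"name": "dc02",       "vcpu": 1, "ram_gb": 1},
    {"name": "exch-mbx01", "vcpu": 8, "ram_gb": 64},
]

# The slot is sized by the largest vCPU count AND the largest vRAM figure.
slot_vcpu = max(vm["vcpu"] for vm in vms)    # 8 vCPU
slot_ram = max(vm["ram_gb"] for vm in vms)   # 64 GB

host_ram_gb = 256
slots_per_host = host_ram_gb // slot_ram     # only 4 slots per 256 GB host

print(f"Slot size: {slot_vcpu} vCPU / {slot_ram} GB -> {slots_per_host} slots per host")
# Even a 1 vCPU / 1 GB Domain Controller consumes a full 8 vCPU / 64 GB slot.
```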

As such, this is not recommended for environments with mixed VM sizes, such as MS Exchange MBX or MSR combined with VMs such as Domain Controllers.

2. Specify failover hosts

“Specify failover hosts” is a very easy setting to understand: you specify a failover host, and it does exactly that, acting as a failover host so that if one host fails, the VMs fail over onto the failover host.

Great, but that ESXi host then remains powered on doing nothing until a failure occurs, so the failover hardware provides no value during normal operations.

As such, this setting is not recommended.

3. Percentage of cluster resources reserved as failover spare capacity

This setting is also fairly easy to understand at a high level, although under the covers it is more complicated and does not work the way many people believe it does.

With that being said, it is a very efficient policy for environments with large VMs like Exchange MBX or MSR.

It avoids the inefficient “slot size” calculation and instead uses virtual machine reservations to calculate cluster capacity.

For VMs with no reservation, 32MHz and 0MB RAM (plus memory overhead) are used from vSphere 5.0 onwards. However, for Exchange MBX/MSR VMs, which as discussed in Part 3 should have memory reservations, HA will use the full reserved memory to ensure there is sufficient cluster capacity for the Exchange VM to fail over without impacting memory performance. This is great news, as we don’t want to overcommit memory for Exchange even in a failure scenario.

From a CPU perspective, 32MHz is the default reserved for any Exchange MBX or MSR VM which does not have a CPU reservation, so it makes sense from an HA perspective to use CPU reservations for Exchange VMs to ensure sufficient capacity exists within the cluster to tolerate an ESXi host failure.
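Continuing the pyVmomi sketch from earlier, reservations could be applied roughly as follows; the VM name “exchange-mbx01” and the 24,000MHz CPU figure are placeholders only, not recommendations.

```python
# Sketch: apply a full memory reservation and a CPU reservation to an
# Exchange MBX/MSR VM ("exchange-mbx01" and the MHz value are placeholders).
vm_view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in vm_view.view if v.name == "exchange-mbx01")

vm_spec = vim.vm.ConfigSpec(
    # Reserve all configured memory (MB) so HA accounts for it in full.
    memoryAllocation=vim.ResourceAllocationInfo(
        reservation=vm.config.hardware.memoryMB),
    # Reserve CPU in MHz so HA reserves more than the 32MHz default.
    cpuAllocation=vim.ResourceAllocationInfo(reservation=24000))
vm.ReconfigVM_Task(vm_spec)
```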

CPU reservations will be discussed in more detail in a future post in this series.

As a result, I recommend using “Percentage of cluster resources reserved as failover spare capacity” for the admission control policy for Exchange environments.
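Continuing the sketch, the policy itself can be set like this; the 25% figures are examples only, and the appropriate values come from the tables below.

```python
# Sketch: set the Admission Control Policy to "Percentage of cluster
# resources reserved as failover spare capacity" (25% is an example value).
policy = vim.cluster.FailoverResourcesAdmissionControlPolicy(
    cpuFailoverResourcesPercent=25,
    memoryFailoverResourcesPercent=25)

spec = vim.cluster.ConfigSpecEx(
    dasConfig=vim.cluster.DasConfigInfo(admissionControlPolicy=policy))
cluster.ReconfigureComputeResource_Task(spec, modify=True)
```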

Next, we need to discuss the most suitable percentage to set for CPU and RAM.

The below table shows the required percentage for N+1 (Green) and N+2 (Blue) deployments based on the number of nodes in a vSphere HA cluster.

Table 1:

[Image: AdmissionControlPercetage]

The above is generally what I recommend, as N+2 provides excellent availability, including the ability to tolerate a failure during maintenance or multiple concurrent host failures, with little or no impact to performance after the VMs restart.

So for clusters of fewer than 16 ESXi hosts, N+1 can be considered, but I recommend N+2 for clusters larger than 16 ESXi hosts.

The next table shows the required percentage for a cluster scaling from N+1 availability for up to 8 hosts, N+2 for up to 16 hosts, N+3 for up to 24 hosts and N+4 for the current maximum vSphere cluster size of 32 hosts.

Table 2:

[Image: originalhapercetages]

It’s safe to say the above table is quite a conservative option (going up to N+4); however, depending on business requirements, these HA reservation values may be perfectly suited and are worth considering.
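The rule of thumb behind tables like these is simple: reserve the share of the cluster represented by the number of host failures you want to tolerate, rounded up (assuming homogeneous hosts). A quick sketch:

```python
import math

def failover_percentage(hosts: int, failures_to_tolerate: int) -> int:
    """Percentage of cluster resources to reserve for HA (homogeneous hosts assumed)."""
    return math.ceil(failures_to_tolerate / hosts * 100)

# N+1 on 8 hosts -> 13%, N+2 on 8 hosts -> 25%,
# N+2 on 16 hosts -> 13%, N+4 on 32 hosts -> 13%.
for hosts, n in [(8, 1), (8, 2), (16, 2), (32, 4)]:
    print(f"{hosts} hosts, N+{n}: {failover_percentage(hosts, n)}%")
```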

For more information see:
1. Example Architectural Decision – Admission Control Setting and Policy
2. Example Architectural Decision – VMware HA – Percentage of Cluster Resources Reserved for HA

Next, let’s discuss the “HA Virtual Machine Options”.

The screenshot below shows the “Cluster default settings” along with the “Virtual Machine settings”, which allow you to override the cluster settings per VM.

[Image: HAVMoptions]

For the “VM restart priority”, I recommend leaving the “Cluster default setting” as “Medium” (Default).

For “Host Isolation Response” this heavily depends on your underlying storage and availability requirements, as such, I will address this setting in detail later in this series.

For the “VM Restart Priority” under “Virtual Machine Settings”, we have a number of options. If a DAG is being used, one option would be to disable VM restart and depend solely on the DAG for availability.

This has the advantage of reducing the compute requirements for the cluster to satisfy HA while giving the same level of availability as the DAG, which in many cases will meet the customer’s requirements.

Alternatively, the Exchange MBX or MSR VMs could be set to “High” to ensure they are restarted as soon as possible following a failure, ahead of less critical VMs such as Testing/Development.
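Continuing the earlier pyVmomi sketch (reusing the cluster and VM objects from above), a per-VM restart priority override for a non-DAG Exchange VM might look roughly like this:

```python
# Sketch: override the HA restart priority for a single Exchange VM.
# Valid priorities include "disabled", "low", "medium" and "high".
vm_override = vim.cluster.DasVmConfigSpec(
    operation="add",  # use "edit" if an override already exists for this VM
    info=vim.cluster.DasVmConfigInfo(
        key=vm,
        dasSettings=vim.cluster.DasVmSettings(restartPriority="high")))

spec = vim.cluster.ConfigSpecEx(dasVmConfigSpec=[vm_override])
cluster.ReconfigureComputeResource_Task(spec, modify=True)
```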

Regarding “Datastore Heartbeating” and “VM Monitoring“, these will be discussed in future posts.

Recommendations for HA:

1. Set the “Admission Control Setting” to “Enabled”.
2. Set the “Admission Control Policy” to “Percentage of cluster resources reserved as failover spare capacity”.
3. Configure the “Percentage of cluster resources reserved as failover spare capacity” as per Table 1 (at a minimum).

Recommendations for HA Virtual Machine Options:

1. Do not disable HA restart for Exchange MBX or MSR VMs.
2. Leave the “HA restart priority” for Exchange MBX or MSR VMs at the cluster default of “Medium” for DAG deployments.
3. Set the “HA restart priority” for Exchange MBX or MSR VMs to “High” for non-DAG deployments.

Back to the Index of How to successfully Virtualize MS Exchange.

Example Architectural Decision – Host Isolation Response for a Nutanix Environment

Problem Statement

What are the most suitable HA / host isolation response settings when using Nutanix?

Assumptions

1. vSphere 5.0 or greater
2. Two x 10Gb network interfaces are shared for Nutanix storage traffic and virtual machine traffic

Motivation

1. Minimize the chance of a false positive isolation response
2. Ensure that, in the event storage is unavailable, virtual machines are promptly shut down to enable HA to recover the VMs in a timely manner (where other hosts are unaffected by isolation) and to prevent a “split brain” scenario
3. Ensure maximum availability

Architectural Decision

Turn off the default isolation address and configure the isolation address specified below, which checks connectivity to the Nutanix (NDFS) cluster IP on the IP storage VLAN.

Configure the following isolation address:

das.isolationaddress1 : NDFS Cluster IP Address

Configure Host Isolation Response to: Power Off

For Nutanix Controller VMs, override the cluster setting and configure the Host Isolation Response to “Leave Powered On”.
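For completeness, the decision could be applied with a pyVmomi sketch along the following lines, assuming an established connection and cluster object as in the earlier posts; the NDFS cluster IP (192.168.1.50) and the “NTNX-” CVM naming convention are assumptions to adapt to your environment.

```python
# Sketch: disable the default isolation address, add the NDFS cluster IP as
# das.isolationaddress1, set the cluster isolation response to "Power Off",
# and override the Nutanix CVMs to "Leave Powered On".
das_config = vim.cluster.DasConfigInfo(
    option=[
        vim.option.OptionValue(key="das.usedefaultisolationaddress", value="false"),
        vim.option.OptionValue(key="das.isolationaddress1", value="192.168.1.50"),  # NDFS cluster IP (placeholder)
    ],
    defaultVmSettings=vim.cluster.DasVmSettings(isolationResponse="powerOff"))

cvm_overrides = [
    vim.cluster.DasVmConfigSpec(
        operation="add",
        info=vim.cluster.DasVmConfigInfo(
            key=cvm,
            dasSettings=vim.cluster.DasVmSettings(isolationResponse="none")))  # leave powered on
    for cvm in cluster.resourcePool.vm
    if cvm.name.startswith("NTNX-")  # assumed CVM naming convention
]

spec = vim.cluster.ConfigSpecEx(dasConfig=das_config, dasVmConfigSpec=cvm_overrides)
cluster.ReconfigureComputeResource_Task(spec, modify=True)
```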

Justification

1. ESXi management traffic, along with virtual machine traffic and inter-node Nutanix storage traffic, runs over 2 x 10Gb connections. Using the ESXi management gateway (the default isolation address) to check for isolation is not suitable, as the management network can be offline without impacting the IP storage or data networks. This situation could lead to false positive isolation responses.
2. The isolation address chosen tests IP storage connectivity over the converged 10Gb network; in the event this is unavailable, there is no point testing further connectivity, as virtual machines cannot function without their storage.
3. In the event the Nutanix cluster IP address cannot be reached via ICMP, the node will not be able to function properly. As such, triggering the isolation response and powering off the VMs based on this criterion is logical, as the VMs will not be able to function under these conditions.
4. In the event the NDFS Cluster IP address does not respond to ICMP on the management interfaces, it is likely there has been an isolation event OR a catastrophic failure in the environment, either to the network or the storage controllers themselves, in which case the safest option is to power off the VMs.
5. In the event the isolation response is triggered and the isolation does not impact all hosts within the cluster, the VMs can be restarted by HA onto a surviving host and resume functioning.
6. Using the Nutanix Controller VM (CVM) IP address (192.168.5.2) for the isolation address is not suitable, as this address exists on each ESXi host; because the CVM is local to the host, the host will always be able to reach this address even when the network is offline, so it is not a good candidate for isolation detection.
7. The Nutanix Controller VM accesses local storage and can continue to run locally even during an isolation event. When the isolation event is over, the CVM will regain connectivity to the other CVMs in the Nutanix cluster.
8. Shutting down the CVM would only increase the recovery time once the isolation event is over and has no added benefit.

Implications

1. In the event a host cannot reach the isolation address, its virtual machines will be powered off.
2. Initial cluster setup requires the vSphere administrator to override the cluster setting for each Controller VM. Note: this is a one-time task (set & forget).

Alternatives

1. Set Host isolation response to “Leave Powered On”
2. Do not use Datastore heartbeating
3. Use the default isolation address
4. Leave the CVM on the default cluster setting and “Shutdown” on isolation

Related Articles

1. VMware Host Isolation Response in a Nutanix Environment #NoSAN

2. Storage DRS and Nutanix – To use, or not to use, that is the question?

3. VMware HA and IP Storage