Example Architectural Decision – Enhanced vMotion Compatibility

Problem Statement

The virtual infrastructure is required to scale over time as demand for compute and/or availability increases.
When purchasing additional ESXi hosts over an expected ESXi host hardware life of >=3 years, it is unlikely that the exact make/model of server or CPU type will still be available. The solution needs to ensure full functionality (specifically vMotion) across ESXi hosts which may not have exactly the same hardware, although all processors will always be from the same vendor.

How can the vSphere cluster/s be configured for maximum flexibility without significant impact to virtual machine performance?

Assumptions

1. All CPU types will be Intel or AMD but not a mix of the two
2. All CPUs will have a supported EVC mode

Motivation

1. Ensure full functionality between ESXi hosts whose Intel CPUs may not match exactly
2. Prevent having to purchase large volumes of identical hardware at one time
3. Allow vSphere clusters to be expanded over time using similar, but not identical, hardware while maintaining the same CPU make.

Architectural Decision

Enable EVC and maintain it at the highest EVC level supported by all ESXi hosts in each vSphere cluster.
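
The decision above amounts to picking the newest EVC baseline that every host in the cluster can support. The following is a minimal sketch of that selection only, assuming a partial, hand-written list of Intel baselines in generation order; the host names and the per-host values are hypothetical and would in practice come from the VMware compatibility information for each CPU.

```python
# Intel EVC baselines, oldest to newest (partial list, for illustration only).
EVC_ORDER = ["merom", "penryn", "nehalem", "westmere",
             "sandybridge", "ivybridge", "haswell", "broadwell"]


def highest_common_evc(host_max_modes):
    """Return the newest EVC mode that every host in the cluster supports."""
    if not host_max_modes:
        raise ValueError("cluster has no hosts")
    # The cluster is limited by the host with the oldest supported baseline.
    lowest_index = min(EVC_ORDER.index(mode) for mode in host_max_modes.values())
    return EVC_ORDER[lowest_index]


# Hypothetical cluster: host names and their highest supported modes are made up.
cluster = {"esxi01": "westmere", "esxi02": "sandybridge", "esxi03": "haswell"}
print(highest_common_evc(cluster))  # westmere
```

Once the oldest host (esxi01 in this made-up example) is decommissioned, the same function returns a higher mode, which is the point at which the cluster EVC level can be raised.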

Justification

1. vMotion is a requirement for the cluster/s to ensure maximum flexibility
2. It is essential to avoid downtime where possible. EVC ensures VMs can be vMotion’d to newer hosts for the purpose of expanding a cluster, OR alternatively, to newer hardware so older hardware can be decommissioned without impact to the VM.
3. The EVC level for the cluster can be increased without downtime
4. With EVC disabled, virtual machines being migrated to new hardware would require downtime where the CPU types are not compatible
5. If EVC were not enabled, newer hardware might have to be placed into new (smaller) cluster/s, which would add unnecessary HA overhead and reduce the efficiency of DRS

Implications

1. Where the EVC level for a cluster is increased, virtual machines will not leverage new CPU features unmasked by EVC until they have been through a full power cycle (power off, then power on); a guest OS reboot alone is not sufficient
2. Where new hardware added to a cluster supports a higher EVC mode, a virtual machine whose workload can benefit from CPU features masked by the existing EVC mode may not perform optimally until the older hardware is removed from the cluster and the EVC mode is increased.

Alternatives

1. Leave EVC disabled and, where CPU types are not compatible for vMotion, shut down the guest OS for migrations.

Common Mistake: Inefficient cluster sizes

In my day job, I regularly come across environments which are running poorly and have inefficient designs.

One of the most common issues I see is VMware environments which cannot power on VMs due to being out of compute resources, but not for the reasons you may expect.

While the environments may have less than optimal HA settings / policies, the most common issue I see is customers (for whatever reason) having multiple clusters with only a few nodes each (i.e. 2, 3 or 4).

Some of the time, there are corporate policies which may require this type of setup, but a lot of the time, you can comply with these policies while still optimizing the environment.

It seems that even with virtualisation having been commonplace for many years, the basics are still misunderstood by a significant percentage of industry professionals. I have heard comments, even recently, saying you need 2 node clusters for maximum HA efficiency. They couldn’t be more wrong!

So, why are small clusters a potential problem?

Depending on which HA setting you choose (Host failures cluster tolerates, Percentage of cluster resources reserved for HA, or Failover Host/s), small clusters have a large amount of “waste”.

What is “Waste”?

“Waste” is the amount of compute power within the cluster that cannot be used by running VMs, because it is held in reserve to ensure that, in a HA event, VMs can be restarted on the remaining hosts.

Now at this stage, let me point out that some “Waste” is a good thing. We need to have some spare capacity for HA events, but the challenge is to minimize the waste without compromising HA.

So, in a recent environment I reviewed, there were 4 clusters using similar IBM x3850 servers.

Cluster 1: 2 Nodes

Cluster 2: 2 Nodes

Cluster 3: 3 Nodes

Cluster 4: 2 Nodes

In all clusters, HA was enabled (as it should be) and the HA admission control setting was “Percentage of Cluster resources reserved for HA” (which I prefer).

The HA reservation percentage was set to 50% for the 2 node clusters and 33% for the 3 node cluster, which are the settings I would choose if I had to stick with the 4 cluster design.

Because the environment (in its current state) was unable to host any more VMs, the customer wanted to purchase 2 new hosts and form a new cluster.

At this stage we have the equivalent of 4 hosts of “waste” within the environment, and with a new cluster we would have 5 hosts “wasted”.
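
To show where the 4 and 5 host figures come from, here is a quick sketch of the arithmetic. It assumes all hosts are the same size (they were similar servers, so this is close enough for illustration); the function and variable names are my own.

```python
def reserved_hosts(nodes, reserved_percent):
    """Host-equivalents of capacity held back by HA admission control."""
    return nodes * reserved_percent / 100


# (nodes, reserved %) per cluster, taken from the environment described above.
clusters = [(2, 50), (2, 50), (3, 33), (2, 50)]

current = sum(reserved_hosts(n, p) for n, p in clusters)
# The proposed fifth cluster: 2 new hosts, again reserved at 50%.
with_new_cluster = current + reserved_hosts(2, 50)

print(round(current, 2))           # 3.99, roughly 4 hosts
print(round(with_new_cluster, 2))  # 4.99, roughly 5 hosts
```

In other words, buying 2 hosts for a new 2 node cluster would only add about 1 host of usable capacity.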

Now, after a quick check of the VMware EVC KB (1003212), all CPUs are compatible with EVC and support the EVC mode “Intel® “Merom” Generation”.

So, we can form a single new cluster using the existing 9 hosts and maintain full cluster functionality by enabling EVC.

Let’s assume the hosts are all in a single cluster and we’re configuring HA. How do we ensure we have more available compute for the new virtual machines?

Simple: we enable HA (as you always should), enable admission control, and set the HA policy to “Percentage of Cluster resources reserved for HA”. But what percentage should we choose?

Well, it depends on what level of redundancy you require.

Generally, I recommend:

<8 hosts = N+1 – Note: If you require N+1 during maintenance you need N+2

>8 hosts < 16 hosts = N+2

>16 hosts <24 hosts = N+3

>24 hosts = N+4

The reason for the above is that as you add more hosts, the chance of a host failure, and of a subsequent host failure, increases. Therefore the more hosts you have, the more redundancy you need; it is a similar concept to RAID.
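
The tiers above can be sketched as a small lookup. Note that the list does not say what to do at exactly 8, 16 or 24 hosts, so this sketch assumes those boundary sizes fall into the higher (more conservative) tier; that choice, and the function name, are my assumptions for illustration.

```python
def recommended_spare_hosts(host_count):
    """Map cluster size to the number of spare hosts (the k in N+k)."""
    if host_count < 1:
        raise ValueError("host_count must be at least 1")
    # Assumption: exact boundary sizes (8, 16, 24) fall into the higher tier.
    if host_count < 8:
        return 1
    if host_count < 16:
        return 2
    if host_count < 24:
        return 3
    return 4


for size in (4, 9, 20, 32):
    print(size, "hosts -> N+%d" % recommended_spare_hosts(size))
```

If you need to keep the same redundancy while a host is in maintenance, add one to the result, as per the note on the first tier.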

So in this example, we’re right on the line in terms of N+1 or N+2.

Let’s be conservative and choose N+2, therefore setting “Percentage of Cluster resources reserved for HA” to 22% (N+2 on 9 hosts is actually 2/9, or about 22.2%, but we use round numbers).

So what have we achieved?

The previous setup had only N+1 and an average HA overhead of 45.75% (50%+50%+50%+33%, divided by 4).

The new 9 node cluster now has N+2 redundancy and an overhead of only 22%: a net gain of 23.75 percentage points of available compute resources without purchasing new hardware.
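
For anyone who wants to check the numbers, the comparison can be reproduced in a few lines. This is just a sketch of the arithmetic in this post, using a simple (unweighted) average of the per-cluster percentages; the variable names are my own.

```python
import math

old_reservations = [50, 50, 50, 33]  # % reserved in each of the four clusters
old_average = sum(old_reservations) / len(old_reservations)

hosts, spare = 9, 2                      # consolidated cluster at N+2
exact_percent = spare / hosts * 100      # 22.22...
new_percent = math.floor(exact_percent)  # rounded down to a whole number

print(old_average)                # 45.75
print(round(exact_percent, 2))    # 22.22
print(old_average - new_percent)  # 23.75
```

Weighting the old figure by host count instead (3.99 of 9 hosts reserved) gives roughly 44.3% rather than 45.75%, so the gain is of the same order either way.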

What else do we gain by having a single larger cluster:

1. Increased DRS flexibility

2. Increased redundancy (previously N+1, now N+2)

3. Less chance of contention

4. No need to purchase new hardware!!

The above is a simple example of how to increase efficiency within a VMware environment without purchasing new hardware.

Now, for those of you wanting to know more about HA/DRS, this has been covered in great detail in other blogs. I would recommend you first read the following blog post and get a copy of the “vSphere 5.0 Clustering technical deep dive” book.

Yellow Bricks (Duncan Epping) – HA Admission control Pros and Cons