High CPU Ready with Low CPU Utilization?

I have noticed an increasing number of search engine terms bringing people to my blog, similar to the following:

* High CPU Ready Low CPU usage
* CPU ready and Low utilization
* CPU ready relationship to utilization

So I wanted to try and clear this issue up.

First, let's define CPU Ready and CPU Utilization.

CPU ready (percentage) is the percentage of time a virtual machine is ready to run but is waiting to be scheduled onto a physical (or HT) core by the CPU scheduler.

CPU utilization measures the amount of MHz or GHz that is being used.
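vCenter performance charts report CPU ready as a summation in milliseconds rather than as a percentage, so a conversion is needed before comparing against the thresholds discussed in this post. Below is a minimal sketch of that conversion, assuming the 20 second sample interval used by the real-time charts. The function name and the sample value are illustrative only, and other chart intervals need a different `interval_s`.

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0) -> float:
    """Convert a CPU ready summation (ms) into a percentage of the sample interval.

    interval_s defaults to 20 seconds, which is an assumption that matches
    the real-time performance charts; pass the interval of the chart you use.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Share of the sample interval spent waiting to be scheduled.
    return ready_ms * 100.0 / (interval_s * 1000.0)


# Hypothetical sample: 2000 ms of ready time in one 20 second sample.
print(cpu_ready_percent(2000))
```

A value of 2000 ms in a 20 second sample means the VM spent a tenth of that sample waiting for a core.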

Next, to find out how much CPU ready is OK, check out my post How Much CPU ready is OK?

CPU Ready and CPU utilization have very little to do with each other. High CPU utilization does not mean you will have high CPU ready, and vice versa.

So it is entirely possible to have either of the scenarios below.

Scenario 1: An ESXi host has 20% CPU utilization and its VMs suffer high CPU ready (>10%).
Scenario 2: An ESXi host has 95% CPU utilization and its VMs have little or no CPU ready (<2.5%).

How are the above two scenarios possible?

Scenario 1 may occur when

* One or more VMs are oversized (i.e. not utilizing the resources they are assigned)
* The host (or cluster) is highly overcommitted (either with or without right sized VMs)
* Power management settings are set to Balanced, Low Power or Custom

Scenario 2 may occur when

* VMs are correctly sized
* The ESXi hosts are well sized for the virtual machine workloads
* The VM to host ratio has been well architected

So the question on everyone's lips: how can high CPU ready with low CPU utilization be addressed or avoided?

If you are experiencing high CPU ready and low ESXi host utilization, the following steps should be taken:

* Right size your VMs

This is by far the most important thing to do. I recommend using a tool such as vCenter Operations to help determine the correct size for VMs.

* Ensure your hosts/clusters are not excessively overcommitted

I generally find 4:1 vCPU overcommitment is achievable with right sized VMs where the average VM size is <4 vCPUs. The higher the average vCPU count per VM, the lower the CPU overcommitment you will achieve.
If you have an average VM size of 8 vCPUs then you may only see <1.5:1 overcommitment before suffering contention (CPU ready).

* Use DRS affinity rules to keep complementary workloads together
VMs with high CPU utilization and VMs with very low CPU utilization can work well together. You may also have an environment where some servers are busy overnight and others are only busy during business hours. These are examples of workloads to keep together.

* Use DRS anti-affinity rules to keep non-complementary workloads apart

VMs with very high CPU utilization (assuming the high utilization is at the same time) can be spread over a number of hosts to avoid stress on the CPU scheduler.

* Ensure your ESXi hosts are chosen with the virtual machine workloads in mind
If your VMs have >=8 vCPUs, choose a CPU with >=8 cores per socket and more sockets per host, for example 4 socket hosts as opposed to 2 socket hosts. If the bulk of your VMs have 1 or 2 vCPUs, then even older 2 socket 4 core processors should generally work well.

* Use Hyperthreading
Assuming you have a mix of workloads and not all VMs require large amounts of cores and GHz, using Hyperthreading increases the efficiency of the CPU scheduler by effectively doubling the scheduling opportunities. Note: A HT core will generally give much less than half the performance of a pCore.

* Use “High Performance” for your Power Management Policy

The above seven (7) steps should resolve the vast majority of issues with CPU ready.
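To check where a host sits against the overcommitment figures mentioned above, the vCPU to physical core ratio can be worked out from a simple inventory. The following is a rough sketch only: the function names and the example host (2 sockets x 8 cores, 31 VMs) are hypothetical, and it counts physical cores rather than HT logical processors.

```python
def vcpu_overcommit_ratio(vcpus_per_vm, physical_cores):
    """Return total assigned vCPUs divided by physical cores on the host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


def average_vm_size(vcpus_per_vm):
    """Return the average number of vCPUs per VM."""
    if not vcpus_per_vm:
        raise ValueError("vcpus_per_vm must not be empty")
    return sum(vcpus_per_vm) / len(vcpus_per_vm)


# Hypothetical host: 2 sockets x 8 cores = 16 physical cores.
# Hypothetical inventory: thirty 2 vCPU VMs and one 4 vCPU VM.
vms = [2] * 30 + [4]
ratio = vcpu_overcommit_ratio(vms, physical_cores=16)
print(f"{ratio:.1f}:1 overcommitment, average VM size {average_vm_size(vms):.2f} vCPUs")
```

The ratio on its own does not prove contention. It is only a prompt to look at CPU ready for the VMs on that host.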

For an example of the benefits of right sizing your VMs, check out my earlier post – VM Right Sizing , An example of the benefits.

Also please note that using CPU reservations does not solve CPU ready. I have also written an article on this topic: Common Mistake – Using CPU reservations to solve CPU ready

I hope this helps clear up this issue.

VM Right Sizing – An example of the benefits

I thought this example may be useful to show the benefits of right sizing a virtual machine.

The VM is an SQL Database server with 4 vCPUs on a cluster which is highly overcommitted with lots of oversized VMs.

As we can see in the graph below, CPU ready was averaging around 10%, and on the 24th of July most vCPUs spiked to greater than 30% CPU ready each, i.e. 30% of the time the server was waiting to be scheduled onto the pCPU cores.

The performance of applications using databases hosted on the server was suffering serious issues during this time.

On the 24th the VM was dropped from 4 vCPUs down to 2 vCPUs, and the results are obvious.

CPU ready dropped immediately (even in a heavily over-committed environment) to around 1% and CPU utilization remained at around the same levels. Performance also improved for applications (for example vCenter) using the database server.

TIP: Right Sizing not only helps the VM you right size, but it helps relieve the contention on the ESXi host (and cluster), which will improve performance for all VMs.

It is also important to point out that this is the first VM in the cluster to be right sized, so as more VMs are right sized, ready time will drop further and performance will continue to improve.

This also results in opportunities for greater consolidation within the environment without compromising performance or redundancy.

I would like to point out that I believe this server may benefit from 4 vCPUs, but definitely not in this highly CPU contended environment.

As more virtual machines are right sized, this environment would likely have the opportunity to consider increasing vCPUs in suitable VMs after monitoring performance for a suitable period of time. Products like VMware vCenter Operations are excellent for reporting on oversized and undersized VMs.

Do you believe in right sizing now?