Example Architectural Decision – Transparent Page Sharing (TPS) Configuration for Virtualized Business Critical Applications (vBCA)

Problem Statement

In a VMware vSphere environment, with future releases of ESXi disabling Transparent Page Sharing by default, what is the most suitable TPS configuration for an environment running Virtualized Business Critical Applications?

Assumptions

1. TPS is disabled by default.
2. Storage is expensive.
3. HA Admission Control policy used is “Percentage of Cluster Resources reserved for HA”
4. vSphere 5.5 or earlier

Requirements

1. Applications must meet strict Service Level Agreements (SLAs)
2. The environment must deliver high consistent performance
3. Minimize the cost of shared storage

Motivation

1. Reduce complexity where possible.
2. Maximize the efficiency of the infrastructure
3. Meet/Exceed SLAs

Architectural Decision

Leave TPS disabled (default) and leave Large Memory pages enabled (default).

Justification

1. Setting 100% memory reservations ensures consistent performance by eliminating the possibility of swapping.
2. The 100% memory reservation also eliminates the capacity usage by the vswap file which saves space on the shared storage as well as reducing the impact on the storage in the event of swapping.
3. RAM is cheaper than Tier 1 storage (which is recommended for vSwap storage to ensure minimal performance impact during swapping) so the increased cost of memory in the hosts is easily offset by the saving in Tier 1 shared storage.
4. Simplicity. Leaving default settings is advantageous from both an architectural and operational perspective.  Example: ESXi Patching can cause settings to revert to default which could negate TPS savings and put a sudden high demand on storage where TPS savings may be expected.
5. TPS savings for vBCA workloads is typically much less than with mixed server or desktop workloads due to the memory requirements for vBCAs being typically much higher.
6. HA admission control will calculate fail-over requirements (when using Percentage of cluster resources reserved for HA) so that performance will be approximately the same in the event of a fail-over due to reserving the full RAM reserved for every VM leading to more consistent performance under a wider range of circumstances.
7. Remove the real or perceived security risk of sensitive information being gathered from other VMs using TPS as described in VMware KB 2080735
8. Many business critical applications such as SAP and MS SQL use in Memory caching, overcommitment of memory could lead to serious degradation in performance.
9. Many business critical applications such as MS SQL claim by default all available RAM in the Virtual Machine, as such, TPS savings would be minimal to none for SQL servers.

Implications

1. Using 100% memory reservations requires ESXi hosts and the cluster be sized at a 1:1 ratio of vRAM to pRAM (Physical RAM) and should include N+1 so a host failure can be tolerated.
2. Potential Increased RAM costs
3. No memory overcommitment can be achieved
4. Potential for lower CPU utilization / overcommitment as RAM may become the first constraint.

Alternatives

1. Use 50% reservation and enable TPS
2. Use no reservation, Enable TPS and disable large pages

Summary

I highly recommend sizing vBCA’s with no memory overcommitment in mind and setting 100% memory reservations for these VMs. By definition, these are “Business Critical” and the requirement of consistent high performance far outweigh potential memory savings!

In the event vBCAs are running in a cluster with no vBCAs such as Server or even Test/Dev where TPS may be enabled or desired, TPS can be enabled along with disabling large pages but vBCAs should always have 100% memory reservation.

Related Articles:

1. Cloud XC Transparent Page Sharing Example Architectural Decisions

2. The Impact of Transparent Page Sharing (TPS) being disabled by default @josh_odgers (VCDX#90)

3. Future direction of disabling TPS by default and its impact on capacity planning –@FrankDenneman (VCDX #29)

4. Transparent Page Sharing Vulnerable, Yet Largely Irrelevant – @ChrisWahl (VCDX#104)

Cost vs Reward for the Nutanix Controller VM (CVM)

I hear a lot of FUD (Fear Uncertainty and Doubt) getting thrown around about the Nutanix Controller VM (CVM) being a resource (vCPU/vRAM) hog.

So I thought I would address this perceived issue.

For those of you who are car people, you will understand the benefits of a Supercharger increasing performance of an engine.

The supercharger does this by attaching a belt to a pulley connected to the motor which spins the supercharger to force more air into the combustion chambers. This allows more fuel to be added to the mix to produce higher horsepower from the same engine displacement (engine capacity, ie: 2.0 Litres)

What is downside of a Supercharger?

The supercharger belt connected to the pulley can require even hundreds of horsepower to simply drive the supercharger. As such, a 300HP engine may have to use half of its power to just drive the supercharger.

So for example, a 300HP engine less 60HP (25%) to drive the supercharger equates to only 240HP remaining. But, as a result of the supercharger forcing more air into the engine, the engine now produces an additional 200HP.

So the “cost” of running the supercharger is 60HP, but the overall benefit is 200HP, resulting in the engine now producing 440HP.

Let’s now relate this back to the Nutanix Controller VM (CVM).

The CVM provides the storage features,functionality,excellent scalability and performance for the Virtual Machines. For example, reducing the latency thanks to Data Locality keeping data local to the compute node running the VM for faster reads and writes.

The faster the reads and writes, the less time VMs spend in a “CPU wait” state waiting for I/Os to be acknowledged by the storage which means the CPUs are being more efficiently utilized. This is a small part of the value the Nutanix CVM provides.

In Summary, the CVM does use some compute resources from the host (which depend on the node type and performance required) but like a Supercharger to an engine, the Nutanix CVM delivers significantly higher value to the VMs than the resources it uses.

Related Articles:

1. Rule of Thumb: Sizing for Storage Performance in the new world.

2. Is VAAI beneficial with Virtual Storage Appliance (VSA) based solutions ?

3. PART 1 – Problems with RAID and Object Based Storage for data protection

Example Architectural Decision – ESXi Host Hardware Sizing (Example 1)

Problem Statement

What is the most suitable hardware specifications for this environments ESXi hosts?

Requirements

1. Support Virtual Machines of up to 16 vCPUs and 256GB RAM
2. Achieve up to 400% CPU overcommitment
3. Achieve up to 150% RAM overcommitment
4. Ensure cluster performance is both consistent & maximized
5. Support IP based storage (NFS & iSCSI)
6. The average VM size is 1vCPU / 4GB RAM
7. Cluster must support approx 1000 average size Virtual machines day 1
8. The solution should be scalable beyond 1000 VMs (Future-Proofing)
9. N+2 redundancy

Assumptions

1. vSphere 5.0 or later
2. vSphere Enterprise Plus licensing (to support Network I/O Control)
3. VMs range from Business Critical Application (BCAs) to non critical servers
4. Software licensing for applications being hosted in the environment are based on per vCPU OR per host where DRS “Must” rules can be used to isolate VMs to licensed ESXi hosts

Constraints

1. None

Motivation

1. Create a Scalable solution
2. Ensure high performance
3. Minimize HA overhead
4. Maximize flexibility

Architectural Decision

Use Two Socket Servers w/ >= 8 cores per socket with HT support (16 physical cores / 32 logical cores) , 256GB Ram , 2 x 10GB NICs

Justification

1. Two socket 8 core (or greater) CPUs with Hyper threading will provide flexibility for CPU scheduling of large numbers of diverse (vCPU sized) VMs to minimize CPU Ready (contention)

2. Using Two Socket servers of the proposed specification will support the required 1000 average sized VMs with 18 hosts with 11% reserved for HA to meet the required N+2 redundancy.

3. A cluster size of 18 hosts will deliver excellent cluster (DRS) efficiency / flexibility with minimal overhead for HA (Only 11%) thus ensuring cluster performance is both consistent & maximized.

4. The cluster can be expanded with up to 14 more hosts (to the 32 host cluster limit) in the event the average VM size is greater than anticipated or the customer experiences growth

5. Having 2 x 10GB connections should comfortably support the IP Storage / vMotion / FT and network data with minimal possibility of contention. In the event of contention Network I/O Control will be configured to minimize any impact (see Example VMware vNetworking Design w/ 2 x 10GB NICs)

6. RAM is one of the most common bottlenecks in a virtual environment, with 16 physical cores and 256GB RAM this equates to 16GB of RAM per physical core. For the average sized VM (1vCPU / 4GB RAM) this meets the CPU overcommitment target (up to 400%) with no RAM overcommitment to minimize the chance of RAM becoming the bottleneck

7. In the event of a host failure, the number of Virtual machines impacted will be up to 64 (based on the assumed average size VM) which is minimal when compared to a Four Socket ESXi host which would see 128 VMs impacted by a single host outage

8. If using Four socket ESXi hosts the cluster size would be approx 10 hosts and would require 20% of cluster resources would have to be reserved for HA to meet the N+2 redundancy requirement. This cluster size is less efficient from a DRS perspective and the HA overhead would equate to higher CapEx and as a result lower the ROI

9. The solution supports Virtual machines of up to 16 vCPUs and 256GB RAM although this size VM would be discouraged in favour of a scale out approach (where possible)

10. The cluster aligns with a virtualization friendly “Scale out” methodology

11. Using smaller hosts (either single socket, or less cores per socket) would not meet the requirement to support supports Virtual machines of up to 16 vCPUs and 256GB RAM , would likely require multiple clusters and require additional 10GB and 1GB cabling as compared to the Two Socket configuration

12. The two socket configuration allows the cluster to be scaled (expanded) at a very granular level (if required) to reduce CapEx expenditure and minimize waste/unused cluster capacity by adding larger hosts

13. Enabling features such as Distributed Power Management (DPM) are more attractive and lower risk for larger clusters and may result in lower environmental costs (ie: Power / Cooling)

Alternatives

1.  Use Four Socket Servers w/ >= 8 cores per socket , 512GB Ram , 4 x 10GB NICs
2.  Use Single Socket Servers w/ >= 8 cores , 128GB Ram , 2 x 10GB NICs
3. Use Two Socket Servers w/ >= 8 cores , 512GB Ram , 2 x 10GB NICs
4. Use Two Socket Servers w/ >= 8 cores , 384GB Ram , 2 x 10GB NICs
5. Have two clusters of 9 hosts with the recommended hardware specifications

Implications

1. Additional IP addresses for ESXi Management, vMotion, FT & Out of band management will be required as compared to a solution using larger hosts

2. Additional out of band management cabling will be required as compared to a solution using larger hosts

Related Articles

1. Example Architectural Decision – Network I/O Control for ESXi Host using IP Storage (4 x 10 GB NICs)

2. Example VMware vNetworking Design w/ 2 x 10GB NICs

3. Network I/O Control Shares/Limits for ESXi Host using IP Storage

4. VMware Clusters – Scale up for Scale out?

5. Jumbo Frames for IP Storage (Do not use Jumbo Frames)

6. Jumbo Frames for IP Storage (Use Jumbo Frames)

CloudXClogo