How to successfully Virtualize MS Exchange – Part 7 – Storage Options

When virtualizing Exchange, we have to consider not only Compute (CPU/RAM) and Network, but also the storage that provides the required capacity and IOPS.

However, before considering IOPS and capacity, we need to decide how we will provide storage for Exchange, as storage can be presented to a Virtual Machine in many ways.

This post will cover the different ways storage can be presented to ESXi and used for Exchange, while subsequent posts will cover each of the options in detail.

First, let's discuss Local Storage.

What I mean by Local Storage is SSDs/HDDs within a physical ESXi host that are not shared (i.e. not accessible by other hosts).

This is probably the most basic form of storage we can present to ESXi and, apart from the hypervisor layer, could be considered similar to a physical Exchange deployment.

[Figure: UseLocalStorage]

Next, let's discuss Raw Device Mappings.

Raw Device Mappings or “RDMs” are where shared storage from a SAN is presented through the hypervisor to the guest as a native SCSI device.

[Figure: RDMs]

For more information about Raw Device Mappings, see: About Raw Device Mappings

The next option is presenting storage directly to the Guest OS.

It is possible, and sometimes advantageous, to present SAN/NAS storage directly to the Guest OS via NFS, iSCSI or SMB 3.0, bypassing the hypervisor altogether.

[Figure: DirectInGuest]

The final option we will discuss is “Datastores”.

Datastores are probably the most common way to present storage to ESXi. Datastores can be Block or File based, and presented via iSCSI, NFS or FCP (FC / FCoE) as of vSphere 5.5.

Datastores are basically just LUNs or NFS mounts. If the datastore is backed by a LUN, it will be formatted with Virtual Machine File System (VMFS) whereas NFS datastores are simply NFS 3 mounts with no formatting done by ESXi.
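The block/file split described above can be captured in a few lines. This is only an illustrative sketch in Python: the function name `datastore_layout` and the protocol labels are assumptions made for this example and are not part of any VMware API.

```python
# Hypothetical lookup reflecting the text above: LUN-backed (block) datastores
# are formatted with VMFS, while NFS datastores are mounted without formatting.
BLOCK_PROTOCOLS = {"iSCSI", "FC", "FCoE"}
FILE_PROTOCOLS = {"NFS"}


def datastore_layout(protocol: str) -> tuple:
    """Return (access type, on-disk format) for a datastore protocol."""
    if protocol in BLOCK_PROTOCOLS:
        return ("block", "VMFS")
    if protocol in FILE_PROTOCOLS:
        return ("file", "NFS mount, not formatted by ESXi")
    raise ValueError(f"unknown datastore protocol: {protocol}")


for proto in ("iSCSI", "FCoE", "NFS"):
    print(proto, "->", datastore_layout(proto))
```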

[Figure: ViaDatastore]

For more information about VMFS see: Virtual Machine File System Technical Overview.

What do all the above options have in common?

Local storage, RDMs, storage presented to the Guest OS directly and Datastores can all be protected by RAID or be JBOD deployments with no data protection at the storage layer.

Importantly, none of the four options on its own guarantees data protection or integrity, that is, prevents data loss or corruption. Protecting against data loss or corruption is a separate topic which I will cover in a non-Exchange-specific post.

So regardless of the way you present your storage to ESXi or the VM, how you ensure data protection and integrity needs to be considered.

In summary, there are four main ways (listed below) to present storage to ESXi which can be used for Exchange, each with different considerations around Availability, Performance, Scalability, Cost, Complexity and Support.

1. Local Storage (Part 8)
2. Raw Device Mappings (Part 9)
3. Direct to the Guest OS (Part 10)
4. Datastores (Part 11)

In the next four parts, each of these storage options for MS Exchange will be discussed in detail.


Example Architectural Decision – Port Binding Setting for a dvPortGroup

Problem Statement

In a VMware vSphere environment using Virtual Distributed Switches (VDS) where all VMs, including vCenter, are hosted on the VDS, what is the most suitable Port Binding setting for dvPortGroups to ensure maximum performance and availability?

Assumptions

1. Enterprise Plus Licensing
2. vCenter is hosted on the VDS

Requirements

1. The environment must have central management of vNetworking
2. All VMs must be able to be powered on in the event of a vCenter outage
3. Network connectivity must not be impacted if vCenter is down.

Motivation

1. Reduce complexity where possible.
2. Maximize the availability of the infrastructure

Architectural Decision

Use the default dvPortGroup Port Binding setting of “Static Binding”

Justification

1. A dvPortGroup port is assigned to a VM and reserved when a VM is connected to the dvPortGroup. This ensures connectivity at all times including when vCenter is down.
2. Using “Static Binding” ensures the vCenter VM can be powered on and connected to the dvPortGroup even after a failure/outage.
3. “Static Binding” is the default setting and there is no reason to modify this setting.

Implications

1. The number of VMs supported on the dvPortGroup / VDS is limited to the number of ports on the VDS (no overcommitment of ports is possible).
2. The number of ports configured on a dvPortGroup should be greater than the maximum number of VMs required to be supported.
3. Port Allocation should be left at the default of “Elastic” to ensure the number of ports is automatically expanded if/when required.
4. New virtual machines cannot be powered on and connected to a dvPortGroup (VDS) while vCenter is down.
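The justification and implications above can be illustrated with a toy model. This is a minimal Python sketch, not a VMware API: the class `StaticPortGroup` and its methods are hypothetical, and it assumes a fixed port count (it ignores “Elastic” port allocation) to keep the behaviour easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class StaticPortGroup:
    """Toy model of a dvPortGroup using static binding (illustrative only)."""
    num_ports: int
    bound: set = field(default_factory=set)  # VMs that hold a reserved port

    def connect(self, vm: str, vcenter_up: bool) -> None:
        # Assigning a new port is modelled as an operation that needs vCenter.
        if not vcenter_up:
            raise RuntimeError("vCenter is down: cannot bind a new VM")
        if vm in self.bound:
            return  # already holds a reserved port
        if len(self.bound) >= self.num_ports:
            raise RuntimeError("no free ports: ports cannot be overcommitted")
        self.bound.add(vm)

    def can_power_on(self, vm: str) -> bool:
        # A VM that already holds a port keeps it, whatever the vCenter state.
        return vm in self.bound


pg = StaticPortGroup(num_ports=2)
pg.connect("vcenter", vcenter_up=True)
pg.connect("exchange01", vcenter_up=True)

# With vCenter down, existing VMs (including vCenter itself) can still power on.
print(pg.can_power_on("vcenter"))
# A VM that was never connected has no reserved port to use.
print(pg.can_power_on("new-vm"))
```

In this model the vCenter VM can always be restarted because its port was reserved when it was first connected, while connecting a brand new VM fails until vCenter is back.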

Alternatives

1. Set dvPortGroup Port Binding to “Dynamic binding”
2. Set dvPortGroup Port Binding to “Ephemeral binding”

Related Articles

1. Distributed vSwitches and vCenter outage, what’s the deal? – @duncanyb (VCDX #007)

2. Choosing a port binding type in ESX/ESXi (1022312)

Example Architectural Decision – Transparent Page Sharing (TPS) Configuration for VDI (1 of 2)

Problem Statement

In a VMware vSphere environment, with future releases of ESXi disabling Transparent Page Sharing by default, what is the most suitable TPS configuration for a Virtual Desktop environment?

Assumptions

1. TPS is disabled by default
2. Storage is expensive
3. Two Socket ESXi Hosts have been chosen to align with a scale out methodology.
4. HA Admission Control policy used is “Percentage of Cluster Resources reserved for HA”
5. vSphere 5.5 or earlier

Requirements

1. VDI environment must deliver consistent performance
2. VDI environment supports a high percentage of Power Users

Motivation

1. Reduce complexity where possible.
2. Maximize the efficiency of the infrastructure

Architectural Decision

Leave TPS disabled (default) and apply 100% Memory Reservations to VDI VMs and/or Golden Master Image.

Justification

1. Setting 100% memory reservations ensures consistent performance by eliminating the possibility of swapping.
2. The 100% memory reservation also eliminates the capacity consumed by the vSwap file, which saves space on the shared storage and reduces the impact on the storage in the event of swapping.
3. RAM is cheaper than Tier 1 storage (which is recommended for vSwap storage to ensure minimal performance impact during swapping) so the increased cost of memory in the hosts is easily offset by the saving in shared storage.
4. Simplicity. Leaving default settings is advantageous from both an architectural and operational perspective. Example: ESXi patching can cause settings to revert to default, which could negate TPS savings and put a sudden high demand on storage where TPS savings are expected.
5. TPS savings for desktops can be significant; however, with a high percentage of Power Users with desktops of 4GB or more and 2 vCPUs, the TPS savings are lower compared to Kiosk or Task users, who typically have 1-2GB per desktop.
6. The decision has been made to use 2 socket ESXi hosts and scale out so the TPS savings per host compared to a 4 socket server with double the RAM will be lower.
7. HA admission control (when using Percentage of cluster resources reserved for HA) will calculate fail-over requirements from the full RAM reservation of every VM, so performance will be approximately the same in the event of a fail-over, leading to more consistent performance under a wider range of circumstances.
8. Lower core count (and lower cost) CPUs will likely be viable as RAM will likely be the first constraint for further consolidation.
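The storage saving in point 2 follows from how the per-VM swap file is sized: configured memory minus the memory reservation. The Python sketch below works through that arithmetic. The function names and the figure of 500 desktops are hypothetical values chosen for illustration; the 4GB desktop size comes from the Power User profile above.

```python
def vswap_file_gb(configured_gb: float, reservation_pct: float) -> float:
    """Swap file size for one VM: configured memory minus reserved memory."""
    if not 0 <= reservation_pct <= 100:
        raise ValueError("reservation_pct must be between 0 and 100")
    return configured_gb * (1 - reservation_pct / 100)


def total_vswap_gb(desktops: int, configured_gb: float, reservation_pct: float) -> float:
    """Shared storage consumed by swap files across all desktops."""
    return desktops * vswap_file_gb(configured_gb, reservation_pct)


# Hypothetical environment: 500 desktops with 4 GB of RAM each.
for pct in (0, 50, 100):
    print(f"{pct}% reservation -> {total_vswap_gb(500, 4, pct):.0f} GB of swap files")
```

With these assumed numbers, no reservation needs 2000 GB of swap file capacity, a 50% reservation needs 1000 GB, and a 100% reservation needs none.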

Implications

1. Using 100% memory reservations requires ESXi hosts and the cluster be sized at a 1:1 ratio of vRAM to pRAM (Physical RAM) and should include N+1 so a host failure can be tolerated.
2. Increased RAM costs
3. No memory overcommitment can be achieved
4. Potential for lower CPU utilization / overcommitment as RAM may become the first constraint.
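The 1:1 vRAM to pRAM sizing with N+1 from the first implication can be estimated with simple arithmetic. The following Python sketch is a rough illustration only: the function names are made up for this example, the desktop count and per-host RAM are hypothetical, and "usable" RAM is assumed to already exclude hypervisor and per-VM memory overhead.

```python
import math


def hosts_required(desktops: int, vram_gb: float,
                   usable_pram_gb_per_host: float, spare_hosts: int = 1) -> int:
    """Hosts needed when every GB of vRAM is backed 1:1 by physical RAM."""
    total_vram_gb = desktops * vram_gb
    active_hosts = math.ceil(total_vram_gb / usable_pram_gb_per_host)
    return active_hosts + spare_hosts  # N + spare capacity for host failure


def ha_reserved_pct(total_hosts: int, spare_hosts: int = 1) -> float:
    """Share of cluster resources to reserve for HA to cover the spare hosts."""
    return spare_hosts / total_hosts * 100


# Hypothetical cluster: 500 desktops at 4 GB, hosts with 240 GB usable RAM.
hosts = hosts_required(500, 4, 240)
print(hosts, "hosts,", ha_reserved_pct(hosts), "% reserved for HA")
```

With these assumed inputs, 2000 GB of vRAM needs nine hosts to hold the reservations plus one spare, so ten hosts with 10% of cluster resources reserved for HA.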

Alternatives

1. Use 50% reservation and enable TPS
2. Use no reservation, enable TPS and disable large pages

Related Articles:

1. The Impact of Transparent Page Sharing (TPS) being disabled by default – @josh_odgers (VCDX #90)

2. Example Architectural Decision – Transparent Page Sharing (TPS) Configuration for VDI (2 of 2)

3. Future direction of disabling TPS by default and its impact on capacity planning – @FrankDenneman (VCDX #29)

4. Transparent Page Sharing Vulnerable, Yet Largely Irrelevant – @ChrisWahl (VCDX #104)