How to successfully Virtualize MS Exchange – Part 10 – Presenting Storage direct to the Guest OS

Let’s start by listing three common storage types which can be presented direct to a Windows OS:

1. iSCSI LUNs
2. SMB 3.0 shares
3. NFS mounts

Next let’s discuss these 3 options.

iSCSI LUNs are a common way of presenting storage direct to the Guest OS, even in vSphere environments, and can be useful for environments using storage array-level backup solutions (which will be discussed in detail in an upcoming post).

The use of iSCSI LUNs is fully supported by VMware and Microsoft, as iSCSI meets the technical requirements for Exchange, namely Write Ordering, Forced Unit Access (FUA) and SCSI abort/reset commands. iSCSI LUNs presented to Windows are then formatted with NTFS, a journalling file system which also protects against Torn I/O.

In vSphere environments nearing the configuration maximum of 256 datastores per ESXi host (and therefore per HA/DRS cluster), presenting iSCSI LUNs direct to applications such as Exchange can help maintain scalability even where vSphere limits have been reached.
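To illustrate the scale consideration, here is a rough back-of-the-envelope calculation. The 256-datastore maximum is the figure referenced above; the datastores-per-VM values are hypothetical examples only, not recommendations.

```python
# Rough sizing sketch: how quickly dedicated datastores per Exchange VM can
# consume the 256-datastore-per-host maximum. The datastores-per-VM values
# below are hypothetical examples, not recommendations.
MAX_DATASTORES_PER_HOST = 256

for datastores_per_exchange_vm in (1, 2, 4, 8):
    max_vms = MAX_DATASTORES_PER_HOST // datastores_per_exchange_vm
    print(f"{datastores_per_exchange_vm} dedicated datastore(s) per VM -> "
          f"at most {max_vms} Exchange VMs before the host limit is reached")
```

LUNs presented via in-guest iSCSI do not appear to ESXi as datastores, which is why this approach avoids the limit (see the Pros list later in this post).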

Note: I would recommend reviewing the storage design and trying to optimize the VMs-per-LUN ratio (among other things) before resorting to iSCSI LUNs presented direct to VMs.

The problem with iSCSI LUNs is that they add complexity compared to using VMDKs on datastores (discussed in Part 11). The complexity is not insignificant: typically multiple LUNs need to be created per Exchange VM, and iSCSI initiators and LUN masking need to be configured. Then, when the iSCSI initiator driver is updated (say via Windows Update), you may find your storage disconnected and need to troubleshoot iSCSI driver issues. You also need to consider the vNetworking implications, as the VM now needs IP connectivity to the storage network.
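As a small illustration of the extra in-guest moving parts, below is a trivial check an administrator might schedule inside the Guest OS to confirm that iSCSI-presented volumes are still accessible after something like a Windows Update cycle. The drive letters are made-up examples and this is only a sketch, not part of any product or official procedure.

```python
# Illustrative only: a trivial in-guest check that the iSCSI-presented volumes
# an Exchange VM depends on are still accessible, e.g. after a Windows Update
# cycle touches the iSCSI initiator driver. Drive letters are hypothetical.
import os

EXPECTED_VOLUMES = ["E:\\", "F:\\", "G:\\"]   # hypothetical DB/log volumes

missing = [vol for vol in EXPECTED_VOLUMES if not os.path.exists(vol)]
if missing:
    print("WARNING: iSCSI-presented volumes not accessible:", ", ".join(missing))
else:
    print("All expected in-guest iSCSI volumes are accessible.")
```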

I wrote this article (Example VMware vNetworking Design w/ 2 x 10GB NICs for IP Storage) a while ago showing an example vNetworking design that supports IP storage with 2 x 10GB NICs.

The above article shows NFS in the dvPortGroup name, but the same configuration is equally suitable for iSCSI. Each Exchange VM would then need a second vNIC connected to the iSCSI portgroup (or dvPortGroup), ideally with a static IP address.

IP addressing is another complexity added by presenting storage direct to VMs rather than using VMDKs on datastores.

Many system administrators, architects and engineers might scoff at the suggestion that iSCSI is complex. While I don’t find iSCSI difficult to design, install, configure or use, in my opinion it is significantly more complex, and has many more points of failure, than using a VMDK on a datastore.

One of the things I have learned, and have seen benefit countless customers over the years, is keeping things as simple as possible while meeting the business requirements. With that in mind, I recommend only considering the use of iSCSI direct to the Guest OS in the following situations (summarised in the short sketch after this list):

1. When using a backup solution which triggers a storage-level snapshot that is not VM or VMDK based, i.e.: where snapshots are only supported at the LUN level (older storage technologies).
2. Where ESXi scalability maximums are going to be reached and creating a separate cluster is not viable (technically and/or commercially) following a detailed review and optimization of storage for the vSphere environment.
3. When using legacy storage architecture where performance is constrained at a datastore level. e.g.: Where increasing the number of VMs per Datastore impacts performance due to latency created from queue depth or storage controller contention.
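The guidance above can be condensed into a trivial sketch. The function and parameter names are my own illustrative choices; this is only a summary of the decision logic, not an official rule set.

```python
# Illustrative only: a condensed version of the three situations above in which
# presenting iSCSI direct to the Guest OS is worth considering at all.
def consider_in_guest_iscsi(lun_level_snapshot_backups_only: bool,
                            esxi_maximums_reached: bool,
                            datastore_performance_constrained: bool) -> bool:
    """Return True if at least one of the three situations applies;
    otherwise stick with VMDKs on datastores (see Part 11)."""
    return (lun_level_snapshot_backups_only
            or esxi_maximums_reached
            or datastore_performance_constrained)

print(consider_in_guest_iscsi(False, False, False))  # False -> use VMDKs
```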

Next let’s discuss SMB 3.0 / CIFS shares.

SMB 3.0 or CIFS shares are commonly used to present storage for Hyper-V and also for file servers. However, presenting SMB 3.0 directly to Windows is not a supported configuration for MS Exchange, because SMB 3.0 presented to the Guest OS directly does not meet the technical requirements for Exchange, namely Write Ordering, Forced Unit Access (FUA) and SCSI abort/reset commands.

However, SMB 3.0 is supported for MS Exchange when presented to Hyper-V, with the Exchange database files residing within a VHD which emulates the SCSI commands over the SMB file protocol. This will be discussed in the upcoming Hyper-V series.

Below is a quote from Exchange 2013 storage configuration options outlining the storage support statement for MS Exchange:

All storage used by Exchange for storage of Exchange data must be block-level storage because Exchange 2013 doesn’t support the use of NAS volumes, other than in the SMB 3.0 scenario outlined in the topic Exchange 2013 virtualization. Also, in a virtualized environment, NAS storage that’s presented to the guest as block-level storage via the hypervisor isn’t supported.

The above statement is pretty confusing in my opinion, but what Microsoft mean is that SMB 3.0 is supported when presented to Hyper-V, with Exchange running in a VM and its databases housed within one or more VHDs. To be clear, presenting SMB 3.0 direct to Windows for Exchange files is not supported.

NFS mounts can be used to present storage to Windows, although this is not that common. It’s important to note that presenting NFS directly to Windows is not a supported configuration for MS Exchange and, as with SMB 3.0, it does not meet the technical requirements for Exchange, namely Write Ordering, Forced Unit Access (FUA) and SCSI abort/reset commands. In contrast, iSCSI LUNs presented to Windows are formatted with NTFS, a journalling file system which also protects against Torn I/O.

As such I recommend not presenting NFS mounts to Windows for Exchange storage.

Note: Do not confuse presenting NFS to Windows with presenting NFS datastores to ESXi, as these are different. NFS datastores will be discussed in Part 11.

Summary:

iSCSI is the only supported protocol for presenting storage direct to Windows for Exchange databases.

Let’s now discuss the Pros and Cons of presenting iSCSI storage direct to the Guest OS.

PROS

1. The ability to reduce the overheads of legacy LUN-based snapshot backup solutions (e.g.: NetApp SnapManager for Exchange) by having MS Exchange use dedicated LUN/s, thereby reducing the delta changes that need to be captured/stored.
2. Does not impact ESXi configuration maximums for LUNs per ESXi host, as storage is presented to the Guest OS and not the hypervisor.
3. Dedicated LUN/s per MS Exchange VM can potentially improve performance depending on the underlying storage capabilities and design.

CONS

1. Complexity, e.g.: having to create, present and manage LUN/s for each Exchange MBX/MSR VM.
2. Having to manage and potentially troubleshoot iSCSI drivers within a Guest OS
3. Having to design for IP storage traffic to access VMs directly, which requires additional vNetworking considerations relating to performance and availability.

Recommendations:

1. When choosing to present storage direct to the Guest OS, only iSCSI is supported.
2. Where no requirements or constraints exist that require the use of storage presented to the Guest OS directly, use the VMDKs on Datastores option, which is discussed in Part 11.
3. Use a dedicated vNIC on the Exchange VM for iSCSI traffic.
4. Use NIOC to ensure sufficient bandwidth for iSCSI traffic in the event of network congestion. Recommended share values along with justification can be found in Example Architectural Decision – Network I/O Control Shares/Limits for ESXi Host using IP Storage.
5. Use a dedicated VLAN for iSCSI traffic
6. Do NOT present SMB 3.0 or NFS direct to the Guest OS for use with Exchange databases!

Back to the Index of How to successfully Virtualize MS Exchange.

How to successfully Virtualize MS Exchange – Part 9 – Raw Device Mappings (RDMs)

A Raw Device Mapping or “RDM” allows a VM to access a volume (or LUN) on the physical storage via either Fibre Channel or iSCSI.

When discussing Raw Device Mappings, it is important to highlight that there are two types of RDM modes: Virtual Compatibility Mode and Physical Compatibility Mode.

See the following article for a detailed breakdown of the Difference between Physical compatibility RDMs and Virtual compatibility RDMs (2009226).

So how does an RDM compare to a VMDK on a Datastore?

VMware released a white paper called Performance Characterization of VMFS and RDM Using a SAN in 2008, which debunked the myth that RDMs gave significantly higher performance than VMDKs on datastores.

So RDMs have NO performance advantage over VMDKs on a Datastore.

With that in mind, what advantages (if any) do RDMs have today?

VMware released their Microsoft Exchange 2010 on VMware Best Practices Guide, which has the following table on page 14 showing the trade-offs between VMFS and RDMs.

[Table: VMFS vs RDM trade-offs, from page 14 of the Microsoft Exchange 2010 on VMware Best Practices Guide]

I have highlighted one advantage that RDMs still have over VMDKs on datastore-style deployments, which is the ability to migrate from a physical Exchange server using centralized SAN storage to a VM without data migration.

However, I find the most common way to migrate from a physical deployment to a virtual deployment is by performing Mailbox migrations to virtualized Exchange servers in an ESXi environment. This avoids the complexities of RDMs and ensures no capacity on the shared storage is wasted (i.e.: Siloed).

The table also lists one other advantage for RDMs: support for up to 64TB drives, whereas virtual disks were limited to 2TB on VMFS. This limitation has since been lifted, with VMDKs supporting up to 62TB as of vSphere 5.5.

Recommendation: Do not use RDMs for MS Exchange deployments.

As with Local Storage discussed in Part 7, RDM deployments have more downsides (mainly around inefficiency and complexity) than upsides and I would recommend considering other storage options for Virtualized Exchange deployments.

Other options along with my recommended options will be discussed in the next 2 parts of this series and in upcoming posts on Storage performance and resiliency.

Back to the Index of How to successfully Virtualize MS Exchange.

Integrity of I/O for VMs on NFS Datastores – Part 1 – Emulation of the SCSI Protocol

This is the first of a series of posts covering how the integrity of I/O is ensured for Virtual Machines when writing to VMDK/s (virtual SCSI hard drives) residing on NFS storage presented to VMware’s ESXi hypervisor as a datastore.

Note: To be crystal clear, this post is not talking about presenting NFS direct to Windows or any other guest operating system.

This process is patented (US7865663) by VMware and its inventors, and in the patent the process is called “SCSI Protocol Emulation”.

This series will first cover the topics in a vendor agnostic manner, meaning I am talking in general about VMware + any NFS storage on the VMware HCL with NFS support.

Following the vendor agnostic posts, I will follow with a series of posts focusing specifically on Nutanix, as the motivation for the series was to cover off this topic for existing or potential Nutanix customers, some of whom are less familiar with NFS and have asked for clarification, especially around virtualizing Business Critical Applications (vBCA) such as Microsoft SQL and Exchange.

The below diagram shows how storage can be presented to an ESXi host and what this series will focus on.

A VM accesses its .vmx and .vmdk file/s via a datastore the same way, regardless of the underlying storage protocol (DAS SCSI, iSCSI, NFS, FCP).

[Image: storage types presented to an ESXi host, from the vSphere documentation]

In the case of NFS datastores, SCSI protocol emulation is used to allow the Guest Operating System (OS) and application/s to read and write via SCSI even when the underlying storage (which is abstracted by the hypervisor) is served via NFS which does not natively support the same commands.

Image Source: https://pubs.vmware.com/vsphere-50/index.jsp?topic=%2Fcom.vmware.vsphere.introduction.doc_50%2FGUID-2E7DB290-2A07-4F54-9199-B68FCB210BBA.html

In the following section, and throughout this series, many images shown are from the patent (US7865663) and are the property of the patent owners, not the author of this article.

The areas I will be focusing on are those where there has been the most concern in the industry, especially for business critical applications such as Microsoft SQL and Microsoft Exchange: namely, how the VM’s operating system, application/s and data integrity are impacted when issuing commands while the storage is abstracted by the hypervisor and served via NFS, which does not have I/O commands equivalent to SCSI.

Some example areas of concern around the industry for VMs running on datastores backed by NFS are:

1. SCSI Aborts / Resets
2. Forced Unit Access (FUA) & Write Through
3. Write Ordering
4. Torn I/O (Writes + Reads)

In this first part, we will look at the SCSI Protocol Emulation process and discuss SCSI Aborts and Resets and how the SCSI protocol emulation process deals with these.

Below is a diagram showing the flow of an I/O request for a VM writing SCSI commands to a VMDK (formatted as NTFS) through the SCSI emulation process and through to the NFS storage.

[Patent figure: flow of a SCSI command from the VM through the SCSI emulation process to the NFS storage]

The first few steps are, in my opinion, fairly self-explanatory. Where it gets interesting for me, and one of the points of contention among I.T. professionals (SCSI aborts), is described in the box labelled “550”.

If the SCSI command is an abort (which has no equivalent in the NFS protocol), the SCSI emulation process removes the corresponding request from the virtual SCSI request list created in the previous step (box labelled “540“).

The same is true if the SCSI command is a reset (which also has no equivalent in the NFS protocol): the SCSI emulation process removes the corresponding request from the virtual SCSI request list. This process is shown below in the box labelled “560”.

[Patent figure: handling of SCSI abort/reset commands, boxes “550” and “560”]

Next, let’s look at what happens if the SCSI “abort” or “reset” command is issued after the SCSI emulation process has already passed the command on to the storage, and a reply then arrives for a command which the Guest OS / application has aborted.

It’s quite simple: the SCSI emulation process receives a reply from the NFS server, looks up the corresponding tag in the virtual SCSI request list and, because the tag no longer exists, drops the reply, thereby emulating a SCSI abort command.

The process is shown below from the box labelled “710” to “720”, finishing at “730”.

[Patent figure: handling of replies for aborted commands, boxes “710” to “730”]

In the patent, the above process is summed up nicely in the following paragraph.

Accordingly, a faithful emulation of SCSI aborts and resets, where the guest OS has total control over which commands are aborted and retried can be achieved by keeping a virtual SCSI request list of outstanding requests that have been sent to the NFS server. When the response to a request comes back, an attempt is made to find a matching request in the virtual SCSI request list. If successful, the matching request is removed from the list and the result of the response is returned to the virtual machine. If a matching request is not found in the virtual SCSI request list, the results are thrown away, dropped, ignored or the like.

So there we have it: that is how VMware’s patented SCSI protocol emulation allows SCSI commands not natively supported by NFS to be honoured, therefore allowing applications dependent on block-based storage to run successfully within a VM whose VMDK is backed by NFS storage.
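To make this mechanism more concrete, below is a minimal Python sketch of a virtual SCSI request list handling submits, aborts/resets and late replies. All class and method names are my own illustrative choices based on the patent description above; this is not VMware’s actual implementation.

```python
# Minimal, illustrative sketch of the "virtual SCSI request list" mechanism
# described in the patent text above. All names are invented for illustration;
# this is NOT VMware's implementation.

class VirtualScsiRequestList:
    def __init__(self):
        # Outstanding SCSI requests that have been sent to the NFS server,
        # keyed by their SCSI tag.
        self._outstanding = {}

    def submit(self, tag, scsi_command):
        """Record an outstanding request before it is translated into NFS calls."""
        self._outstanding[tag] = scsi_command

    def abort_or_reset(self, tag):
        """Emulate a SCSI abort/reset (boxes 550/560): simply forget the
        outstanding request, since NFS has no equivalent command to send."""
        self._outstanding.pop(tag, None)

    def handle_nfs_reply(self, tag, result):
        """Emulate completion (boxes 710-730): return the result to the VM only
        if the request is still outstanding; otherwise drop the reply."""
        if tag in self._outstanding:
            del self._outstanding[tag]
            return result        # delivered back to the Guest OS
        return None              # late reply for an aborted command is dropped


# Tiny usage example:
requests = VirtualScsiRequestList()
requests.submit(tag=42, scsi_command="WRITE(10)")
requests.abort_or_reset(tag=42)              # Guest OS aborts the command
print(requests.handle_nfs_reply(42, "OK"))   # prints None: the reply is dropped
```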

Let’s recap what we have learned so far.

1. The SCSI commands abort and reset have no equivalent in the NFS protocol.
2. The VMware SCSI Emulation process handles SCSI commands not supported natively by NFS thanks to the Virtual SCSI Request List.
3. Guest Operating Systems and Applications running in Virtual Machines on ESXi issue native SCSI commands to the NTFS volume, which is presented to the VM via a VMDK and housed on an NFS datastore.
4. The underlying NFS protocol is not exposed to the Guest OS, Application/s or Virtual Machine.
5. The SCSI commands abort and reset are emulated by the hypervisor by removing these requests from the virtual SCSI request list.

In part two, I will discuss Forced Unit Access (FUA) & Write Through.

Integrity of Write I/O for VMs on NFS Datastores Series

Part 1 – Emulation of the SCSI Protocol
Part 2 – Forced Unit Access (FUA) & Write Through
Part 3 – Write Ordering
Part 4 – Torn Writes
Part 5 – Data Corruption

Nutanix Specific Articles

Part 6 – Emulation of the SCSI Protocol (Coming soon)
Part 7 – Forced Unit Access (FUA) & Write Through (Coming soon)
Part 8 – Write Ordering (Coming soon)
Part 9 – Torn I/O Protection (Coming soon)
Part 10 – Data Corruption (Coming soon)

Related Articles

1. What does Exchange running in a VMDK on NFS datastore look like to the Guest OS?
2. Support for Exchange Databases running within VMDKs on NFS datastores (TechNet)
3. Microsoft Exchange Improvements Suggestions Forum – Exchange on NFS/SMB
4. Virtualizing Exchange on vSphere with NFS backed storage?