How to successfully Virtualize MS Exchange – Part 7 – Storage Options

When virtualizing Exchange, we not only have to consider the Compute (CPU/RAM) and Network, but also the storage to provide both the capacity and IOPS required.

However, before considering IOPS and capacity, we need to decide how we will provide storage for Exchange, as storage can be presented to a Virtual Machine in many ways.

This post will cover the different ways storage can be presented to ESXi and used for Exchange, while subsequent posts will cover each of the options discussed in detail.

First, let's discuss Local Storage.

What I mean by Local Storage is SSDs/HDDs within a physical ESXi host that are not shared (i.e. not accessible by other hosts).

This is probably the most basic form of storage we can present to ESXi and, apart from the Hypervisor layer, could be considered similar to a physical Exchange deployment.

[Image: UseLocalStorage]

Next, let's discuss Raw Device Mappings.

Raw Device Mappings, or “RDMs”, are where shared storage from a SAN is presented through the hypervisor to the guest as a native SCSI device.

[Image: RDMs]

For more information about Raw Device Mappings, see: About Raw Device Mappings

The next option is presenting storage directly to the Guest OS.

It is possible, and sometimes advantageous, to present SAN/NAS storage directly to the Guest OS via NFS, iSCSI or SMB 3.0, bypassing the hypervisor altogether.

[Image: DirectInGuest]

The final option we will discuss is “Datastores”.

Datastores are probably the most common way to present storage to ESXi. Datastores can be Block or File based, and presented via iSCSI, NFS or FCP (FC / FCoE) as of vSphere 5.5.

Datastores are basically just LUNs or NFS mounts. If the datastore is backed by a LUN, it will be formatted with Virtual Machine File System (VMFS) whereas NFS datastores are simply NFS 3 mounts with no formatting done by ESXi.

[Image: ViaDatastore]

For more information about VMFS see: Virtual Machine File System Technical Overview.

What do all the above options have in common?

Local storage, RDMs, storage presented to the Guest OS directly and Datastores can all be protected by RAID or be JBOD deployments with no data protection at the storage layer.

Importantly, none of the four options on its own guarantees data protection or integrity, that is, prevents data loss or corruption. Protecting against data loss or corruption is a separate topic which I will cover in a non-Exchange-specific post.

So regardless of the way you present your storage to ESXi or the VM, how you ensure data protection and integrity needs to be considered.

In summary, there are four main ways (listed below) to present storage to ESXi which can be used for Exchange, each with different considerations around Availability, Performance, Scalability, Cost, Complexity and Support.

1. Local Storage (Part 8)
2. Raw Device Mappings  (Part 9)
3. Direct to the Guest OS (Part 10)
4. Datastores (Part 11)

In the next four parts, each of these storage options for MS Exchange will be discussed in detail.


Integrity of I/O for VMs on NFS Datastores – Part 5 – Data Corruption

This is the fifth part of a series of posts covering how the Integrity of Write I/O is ensured for Virtual Machines when writing to VMDK/s (Virtual SCSI Hard Drives) running on NFS datastores presented via VMware’s ESXi hypervisor as a “Datastore”.

This part will focus on Data Corruption.

As a reminder from the first post, this post is not talking about presenting NFS direct to Windows.

So why am I covering data corruption? Simply because there is a misconception that SCSI commands are not properly supported for VMs running on NFS datastores, which leads to corruption. This was covered in Part 1, so Part 5 will focus on data corruption which is not specific to NFS but can affect all storage platforms, how it occurs, and how storage solutions can mitigate the risk of data corruption issues.

The following is a summary of the data provided in “An analysis of data corruption in the storage stack”.

NetApp conducted a large-scale study into data corruption, which covered >1 million HDDs across tens of thousands of NetApp systems over 41 months (2004 – 2007). Long story short, NetApp detected a level of data corruption which surprised me and seems to disprove many things, such as the advertised MTBF for HDDs.

The following shows a breakdown of the problems found.

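[Image: netappfailureanalysis (pie charts breaking down failure causes: Enterprise grade disks on the left, nearline disks on the right)]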

The first thing I noticed in the above pie charts is the vast difference between the percentage of failures in Enterprise grade disks (left) and nearline based disks (right).

It also shows physical interconnects to be a large percentage of failures, which highlights the need for simplicity in the storage solution. In addition, one of the more surprising results is the level of storage protocol and performance based failures being the cause of corruption.

Note: In this study, the majority of systems deployed were FC (block storage) based. This highlights that a storage protocol, regardless of being block or file based, can have issues if improperly implemented. So regardless of storage protocol, corruption can occur.

The below summary of corruption type and percentage of disks affected shows dramatically more issues (around 10x) with SATA drives compared to Enterprise grade drives.

[Image: NLvsEnterprise (corruption type and percentage of disks affected, nearline vs Enterprise grade)]

The above also shows bit corruptions or Torn Writes affect more disks compared to lost or misdirected writes, which highlights the importance of Torn I/O Protection (covered in Part 4).

The article summarizes its findings in the following points:

[Image: summary]

The main takeaways from my perspective are:

1. The requirement to have corruption handling mechanisms for any environment running workloads which require data integrity.
2. Data should be spread out (ideally across disks) to minimize the chance of issues.

The article went on to form these conclusions:

[Image: conclusion]

In Summary:

1. Data corruption can occur on JBOD, enterprise grade storage solutions and everything in between.
2. SATA drives have a much higher rate (~10x) of corruption.
3. Enterprise grade drives are much better from a data integrity perspective.
4. Corruption handling via sector and ideally block based checksums is essential on writes.
5. Using a checksum on Read helps detect corrupted data.
6. Corruption can occur even when no ECC errors are reported by a physical HDD.
7. Any storage protocol implementation can have bugs which can lead to corruption.
8. Backup / Recovery solutions are essential. Reliance solely on primary storage or application level backups using disks puts your data at risk.
9. Solutions solely dependent on application level data protection on disk are at risk of corrupted data being replicated to other active/passive or backup copies.
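Points 4 and 5 above can be illustrated with a minimal sketch. This is not how any particular storage product implements checksums; it is a toy in-memory block store, assuming CRC32 as the checksum, and the names `ChecksummedStore` and `corrupt` are hypothetical and exist only for this example. The checksum is computed on write and verified on read, so a bit flipped after the write is detected rather than silently returned.

```python
import zlib


class ChecksummedStore:
    """Toy in-memory block store that keeps a CRC32 alongside each block."""

    def __init__(self):
        # block number -> (data, checksum recorded at write time)
        self._blocks = {}

    def write(self, block_no: int, data: bytes) -> None:
        # Checksum on write: record what the data looked like when it was stored.
        self._blocks[block_no] = (bytes(data), zlib.crc32(data))

    def read(self, block_no: int) -> bytes:
        data, stored = self._blocks[block_no]
        # Checksum on read: a mismatch means the data changed after the write.
        if zlib.crc32(data) != stored:
            raise IOError(f"checksum mismatch on block {block_no}")
        return data

    def corrupt(self, block_no: int, bit: int = 0) -> None:
        """Simulate silent corruption: flip one bit, leave the checksum alone."""
        data, stored = self._blocks[block_no]
        damaged = bytearray(data)
        damaged[bit // 8] ^= 1 << (bit % 8)
        self._blocks[block_no] = (bytes(damaged), stored)


if __name__ == "__main__":
    store = ChecksummedStore()
    store.write(0, b"exchange database page")
    store.corrupt(0, bit=5)
    try:
        store.read(0)
    except IOError as err:
        print("detected:", err)
```

Note that a checksum stored next to its own block, as in this sketch, would not on its own catch a lost write, since the old data and old checksum still agree with each other.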

My final point: enterprise grade storage solutions which use checksums to verify data integrity on writes and reads have a much lower risk of data corruption, regardless of media type and storage protocol.

JBOD style deployments using SATA drives have a significantly higher risk of data corruption, which is contributed to by the SATA drives' 10x higher corruption rates and the lack of the enterprise grade checksum features found in some shared storage (SAN/NAS) solutions.

Integrity of Write I/O for VMs on NFS Datastores Series

Part 1 – Emulation of the SCSI Protocol
Part 2 – Forced Unit Access (FUA) & Write Through
Part 3 – Write Ordering
Part 4 – Torn Writes
Part 5 – Data Corruption

Nutanix Specific Articles

Part 6 – Emulation of the SCSI Protocol (Coming soon)
Part 7 – Forced Unit Access (FUA) & Write Through (Coming soon)
Part 8 – Write Ordering (Coming soon)
Part 9 – Torn I/O Protection (Coming soon)
Part 10 – Data Corruption (Coming soon)

Related Articles

1. What does Exchange running in a VMDK on NFS datastore look like to the Guest OS?
2. Support for Exchange Databases running within VMDKs on NFS datastores (TechNet)
3. Microsoft Exchange Improvements Suggestions Forum – Exchange on NFS/SMB
4. Virtualizing Exchange on vSphere with NFS backed storage

Integrity of I/O for VMs on NFS Datastores – Part 3 – Write Ordering

This is the third part of a series of posts covering how the Integrity of I/O is ensured for Virtual Machines when writing to VMDK/s (Virtual SCSI Hard Drives) running on NFS datastores presented via VMware’s ESXi hypervisor as a “Datastore”.
As a reminder from the first post, this post is not talking about presenting NFS direct to Windows.

 

Write Ordering / Order Preservation

Another common concern when running business critical applications such as MS SQL and MS Exchange is Write Ordering and if/how this is handled by the SCSI protocol emulation process.

This requirement is described by Microsoft as:

The order of the I/O operations associated with SQL Server must be maintained. The system must maintain write ordering or it breaks the WAL protocol as described in this paper. (The log records must be written out in correct order and the log records must always be written to stable media before the data pages that the log records represent are written.) After a transaction log record is successfully flushed, the associated data page can then be flushed as well. If the subsystem allows the data page to reach stable media before the log record does, data integrity is breached.

Source: Microsoft SQL Server I/O basics.
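The ordering rule in the quote above can be sketched in a few lines. This is a simplified model under stated assumptions, not SQL Server's or ESXi's actual implementation: `StableMedia`, `commit` and `ordering_preserved` are hypothetical names, and each transaction is reduced to one log record and one data page identified by the same sequence number.

```python
class StableMedia:
    """Toy stand-in for durable storage that records the order writes arrive in."""

    def __init__(self):
        self.writes = []  # (kind, sequence number) tuples in arrival order

    def write(self, kind: str, lsn: int) -> None:
        self.writes.append((kind, lsn))


def commit(media: StableMedia, lsn: int) -> None:
    """Write-ahead logging: the log record goes out before its data page."""
    media.write("log", lsn)
    media.write("page", lsn)


def ordering_preserved(writes) -> bool:
    """True if every data page reached the media after its log record."""
    logged = set()
    for kind, lsn in writes:
        if kind == "log":
            logged.add(lsn)
        elif kind == "page" and lsn not in logged:
            # The page landed before its log record: integrity is breached.
            return False
    return True


if __name__ == "__main__":
    media = StableMedia()
    commit(media, 1)
    commit(media, 2)
    print(ordering_preserved(media.writes))
    # A storage layer that reordered the two writes would fail the check.
    print(ordering_preserved([("page", 1), ("log", 1)]))
```

The check only passes if every layer between the application and the media delivers the writes in the order they were issued, which is the requirement the rest of this post is concerned with.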

VMware have released a Knowledge Base article specifically on this topic which states the following.

Write ordering and write-through integrity for NFS storage are both satisfied with NFS in an VMware ESX environment.
An NFS datastore, when mounted on an ESX host, goes through virtual SCSI emulation. A virtual machine disk (VMDK) file on an NFS datastore appears as a SCSI disk within the virtual machine’s guest operating system, which is no different than one residing on a VMFS volume over FCP or iSCSI protocol. Therefore, write ordering and write-through integrity are no different than those with block based storage (such as iSCSI or FCP protocol).
The above is the bulk of the article, but the full article can be found below.

Maintaining write ordering and write-through integrity using NFS in an ESX environment (KB1012143)

So as with Forced Unit Access (FUA) & Write-Through, Write Ordering is supported by VMware but even with this support, it is also a function of the underlying storage to honour the request and this process or even support may vary from storage vendor to storage vendor.

Again the point here is this process is delivered by the VMDK at the hypervisor level and passed onto the underlying storage, so regardless of the protocol being Block (iSCSI/FCP) or File based (NFS) it is the responsibility of the storage solution once the I/O is passed to it from the hypervisor.
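As a rough conceptual model of the point above (not VMware's code, and every class and function name here is hypothetical), the guest writes to the same virtual disk interface whether the datastore underneath is block or file based, and the hypervisor layer hands the I/O to the backend in the order it was issued. What the backend does with it from there is up to the storage solution.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical storage backend sitting below the hypervisor."""

    @abstractmethod
    def submit(self, offset: int, data: bytes) -> None:
        ...


class RecordingBackend(Backend):
    """Records the I/O it receives; the protocol label is informational only."""

    def __init__(self, protocol: str):
        self.protocol = protocol
        self.received = []

    def submit(self, offset: int, data: bytes) -> None:
        self.received.append((offset, data))


class VirtualDisk:
    """What the guest sees: the same interface whatever backs it."""

    def __init__(self, backend: Backend):
        self._backend = backend

    def write(self, offset: int, data: bytes) -> None:
        # Passed through in issue order; honouring it is the backend's job.
        self._backend.submit(offset, data)


def guest_workload(disk: VirtualDisk) -> None:
    """The guest issues identical writes regardless of the datastore type."""
    disk.write(0, b"log record")
    disk.write(4096, b"data page")


if __name__ == "__main__":
    block = RecordingBackend("iSCSI")
    file_based = RecordingBackend("NFS")
    guest_workload(VirtualDisk(block))
    guest_workload(VirtualDisk(file_based))
    print(block.received == file_based.received)
```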

In part four, I will discuss Torn I/O Protection.
