Write I/O Performance & High Availability in a scale-out Distributed File System

Following on from my recent post titled “Data Locality & Why is important for vSphere DRS clusters” I would like to discuss at a high level how Write I/O works in the Nutanix Distributed File System, how the solution ensures high availability in the event of a node failure and what impact a failure has on performance.

Lets start with a typical Write operation.

The below diagram shows a three (3) node Nutanix cluster with a Guest VM starting to perform write I/O, this is represented in a simplistic manor by the three (3) Diamonds (Red, Yellow and Purple)

NutanixWriteIOstart

The write I/O is written to the local SSD tier (as is every Write in a Nutanix environment) as shown below.

NutanixWriteDataWrittenLocal

Before acknowledging the write the Nutanix Controller VM (CVM) then replicates a copy of the data across the Nutanix Distributed File System.

The below diagram illustrates what this looks like in a three node cluster.

NutanixWriteSyncToOtherNodes

Once the data in successfully written to other nodes within the cluster, the Write acknowledgement is given. This ensures data is consistent and always protected.

In a Nutanix cluster, as Controllers (Nutanix CVMs) are scaled linearly with the ESXi hosts, Write I/O is then spread over more controllers, reducing the chance of contention in the environment at both a storage controller and network layer as each controller shares 2 x 10Gb connections per node.

In the event of a node failure, in a vSphere cluster, HA will restart the failed VM/s onto a surviving node in the cluster.

The VM will start-up and operate as normal and where data is not local to the node (as discussed in detail in my post  “Data Locality & Why is important for vSphere DRS clusters“) the data will initially be accessed over 10Gb before being replicated locally for future reads.

NutanixHAAfterWithDataAccess

All future writes for the VM/s which have been restarted by HA on different nodes will perform at a similar rate (if not the same rate) as they did before the failure depending on how many nodes are in the cluster. Where the Network is not a bottleneck, there should be minimal/no difference in write performance after a node failure.

The Nutanix cluster will also detect a node has failed, and ensure two copies of all data are available, and in the above example where only one copy of the data exists, the cluster will replicate the required data to ensure High Availability (“Replication Factor” of 2) is maintained.

As this replication is done across multiple controllers and nodes, it is much faster and lower impact than a traditional RAID rebuild which most of us will be familiar with.

The end state of this process looks like this.

NutanixHAEndState

So in conclusion using a “scale-out” storage controller solution like Nutanix ensures consistent high write performance even immediately following a node failure by eliminating the requirement for RAID style rebuilds which are disk intensive and can lead to “Double Disk Failures” and data loss.

The replication of data being distributed across all nodes in the cluster ensures minimal impact to each Nutanix controller, ESXi host and the network while ensuring the data is re-protected as soon as possible.

Related Articles

1. Data Locality & Why is important for vSphere DRS clusters

 

Storage DRS and Nutanix – To use, or not to use, that is the question?

Storage DRS (SDRS) is an excellent feature which was released with vSphere 5.0 in late 2011. For those of you who are not familiar with SDRS I recommend reading the following article prior to reading the rest of this post as SDRS knowledge is assumed from now on.

Understanding VMware vSphere 5.1 Storage DRS

This post also assumes basic knowledge of the Nutanix platform, for those of you who are not familiar with Nutanix please review the following links prior to reading the remainder of this post.

About Nutanix | How Nutanix Works | 8 Strategies for a Modern Datacenter

Storage DRS & Nutanix – To use, or not to use, that is the question?

With Storage DRS (SDRS), both capacity and performance can be managed, but what should SDRS manage in a Nutanix environment?

Lets start with performance. SDRS can help ensure optimal performance of virtual machines by enabling the I/O metric for SDRS recommendations as shown in the screen shot below.

SDRSsettingsIOmetricCircledSmall

Once this is done, SDRS will evaluate I/O every 8 hours (by default) and where the configured latency threshold is exceeded, perform a cost/benefit analysis before deciding to make a migration recommendation or do nothing.

So the question is, does SDRS add value in a Nutanix environment from a performance perspective?

The Nutanix solution adopts the “Scale-out” methodology by having one (1) Nutanix Controller VM (CVM) per Nutanix Node (ESXi Host) and then presents NFS datastore/s to the vSphere cluster which are serviced by all CVMs. The CVMs use intelligent auto-tiering to ensure optimal performance. The way this works at a high level, is as follows.

Data is written to an SSD tier (either PCIe SSD such as Fusion-io or SATA SSD) before being migrated off to a SATA tier once the blocks are determined to be “Cold” and if/when required, promoted back the an SSD tier when they become “Hot” again for improved read performance.

As with other vendor storage solutions with auto tiering technologies (such as FAST-VP , FlashPools etc) the same recommendation around SDRS and the I/O metric is true for Nutanix, leave it disabled.

So, at this point we have concluded the I/O metric will be “Disabled”, lets move onto Capacity management.

The Nutanix solution presents large NFS datastore/s to the ESXi hosts (Nutanix nodes) which are shared across all ESXi hosts in one or more vSphere clusters.

When using SDRS, it can manage initial placement of a new Virtual machine based on the configured “Utilized Space” metric (shown below) to ensure there is not a capacity imbalance between the datastores in a datastore cluster, as well as move virtual machines around when new machines are provisioned to ensure the balance is maintained.

UtilizedSpaceSDRS

So this is a really good feature which I have and do recommend in several scenarios, however the Nutanix solution presents typical a small number of large NFS datastores to the vSphere cluster (or clusters) which are serviced by all Controller VMs (CVMs) in the Nutanix cluster. Using SDRS for initial placement does not add much (if any) value as the initial placement will almost always be on the same large NFS datastore.

Where actual physical capacity becomes an issue, space saving technologies such as compression can be enabled, or the environment can be granularly scaled by adding just a single additional Nutanix node which linearly scales the solution from both a capacity and performance perspective.

The only real choice is when you choose to present two (or more) datastores where one datastore leverage’s the Nutanix compression technology. This is a very easy scenario for a vSphere admin to choose the placement of a VM and is the same amount of administrative effort as choosing a datastore cluster which would be a collection of datastores either using compression, or not depending on the workloads.

As a result there is no advantage to using SDRS to manage utilized space.

In conclusion, Storage DRS is a great feature when used with storage arrays where performance does not scale linearly or provide intelligent tiering to address I/O bottlenecks and/or where your environment has large numbers of datastores where you need to actively manage capacity.

As performance and capacity management are intelligently managed natively by the Nutanix solution, the requirement (or benefit) provided by SDRS is negated, as a result there is no requirement or benefit for using SDRS with a Nutanix solution.

Related Articles

1. Example Architectural Decision – VMware DRS automation level for a Nutanix environment