Nutanix AOS 5.5 delivers 1M read IOPS from a single VM, but what about 70/30 read/write?

I recently wrote Nutanix AOS 5.5 delivers 1M IOPS from a single VM, but what happens when you vMotion, which showed the impact of a vMotion was a drop of around 10% for a period of approximately 3 seconds before read performance returned to pre-migration levels.

In this post I address performance for a single VM with a more realistic 70% read / 30% write IO profile, again using an 8k IO size, and show the impact during and after a live migration.

While not surprising to Nutanix customers, the result shows a maximum starting baseline of 436k random read and 187k random write IOPS. Immediately following the migration, performance dipped to 359k read and 164k write IOPS before exceeding the original baseline at 446k read and 192k write IOPS within a few seconds.

So in comparison to 100% random read, which achieved just over 1 million 8k IOPS, the 70/30 mix achieves in the ballpark of 600k combined IOPS, which is very respectable. Not bad for a platform which Nutanix competitors continue to describe as only being good for VDI. Considering even the largest array from a leading all-flash SAN vendor only advertises random read performance in the hundreds of thousands of IOPS, this shows the unique Nutanix hyper-converged architecture can achieve higher performance than a monolithic all-flash array from a single VM.
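As a quick back-of-the-envelope check (simple arithmetic on the figures quoted above, not additional benchmark data), the combined number and the read/write split line up as follows:

```python
# Peak read/write IOPS observed shortly after the migration (figures above).
read_iops = 446_000
write_iops = 192_000

total_iops = read_iops + write_iops   # ~638,000 combined IOPS
read_share = read_iops / total_iops   # ~0.70, i.e. roughly a 70/30 mix

print(f"Combined IOPS: {total_iops:,}")
print(f"Read share of the mix: {read_share:.0%}")
```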

This shows that with the unique Nutanix Acropolis Distributed Storage Fabric, very high performance at low latency can be achieved with real world IO patterns even during and after live migrating the virtual machine across a distributed platform.

This result is further evidence of the efficiency of the Nutanix Acropolis Hypervisor, AHV (which is included at no additional charge with AOS), as well as of the IO path running in user space (not the much-hyped in-kernel approach). This is in part thanks to AHV Turbo Mode, announced at .NEXT 2017 in Washington, which optimised the IO path. In addition, these levels of performance can be sustained even when using data protection features such as snapshots, as shown in a recent post I wrote about the Nutanix X-Ray tool, where I used the Snapshot impact scenario to compare Nutanix AHV with a leading hypervisor and SDS product. If you don’t have time to read that post, in short: the competitor’s performance degraded as snapshots were taken, while Nutanix AHV’s performance remained consistent, which is essential for real world scenarios, especially with business critical applications.

With the unique Nutanix ability to scale out performance using storage-only nodes, even higher performance can be achieved without modification to the virtual machine or application, which gives Nutanix a further advantage over the competition.

Nutanix data locality ensures optimal performance by writing new data local to the VM, while cold data can remain remote indefinitely and only hot data is migrated locally if/when required, at 1MB granularity. This is intelligent data locality, not the brute-force locality it is frequently mistaken for.

Back to Part 1

Heterogeneous Nutanix Clusters Advantages & Considerations

Let’s start with a simple example. The below shows a 4-node cluster mixing 2 x NX-3060 nodes with 2 x NX-8035 nodes. Both node types share the same Haswell CPUs, but the NX-3060 has ~2TB usable capacity while the NX-8035 has ~8TB usable.

[Image: Mixed cluster of 2 x NX-3060 and 2 x NX-8035 nodes]

Assuming the cluster capacity was 50% utilised, the NDSF layer would look similar to this:

[Image: NX-3060/NX-8035 cluster with the storage pool 50% utilised]

The above shows the NDSF with a total Storage Pool capacity of 20TB, 50% of which is used (10TB). As we have a heterogeneous cluster, we have two different node types with vastly different usable capacities.

Nutanix Disk Balancing automatically balances the storage to ensure the utilisation percentage of all SSDs/HDDs within the cluster stays within ±15%. This means administrators do not have to worry about capacity management on a per-node basis; capacity management only needs to be performed at the storage pool (cluster) layer.
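To make the example concrete (an illustrative sketch using the approximate node capacities above, not Nutanix code), the pool capacity and the ±15% balancing target work out as follows:

```python
# Approximate usable capacity per node in TB, as per the example above.
nodes = {
    "NX-3060-1": 2, "NX-3060-2": 2,   # ~2TB usable each
    "NX-8035-1": 8, "NX-8035-2": 8,   # ~8TB usable each
}

pool_capacity_tb = sum(nodes.values())          # 20TB storage pool
pool_used_tb = pool_capacity_tb * 0.50          # 50% utilised = 10TB

# Disk balancing keeps each node's utilisation percentage within
# roughly +/-15% of the cluster-wide average.
average_util = pool_used_tb / pool_capacity_tb  # 0.50
target_range = (average_util - 0.15, average_util + 0.15)

print(f"Pool: {pool_used_tb:.0f}TB used of {pool_capacity_tb}TB")
print(f"Per-node utilisation target: {target_range[0]:.0%} to {target_range[1]:.0%}")
```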

Advantage 1: No silos of storage capacity in heterogeneous environments

Advantage 2: NDSF disk balancing ensures the data is evenly distributed throughout the cluster

Advantage 3: There is no requirement for hypervisor level storage capacity management such as Storage DRS (SDRS).

For more information on why Storage DRS is not required see: Storage DRS and Nutanix – To use, or not to use, that is the question?

In a heterogeneous environment, it is likely you will have multiple workloads with different capacity and performance requirements. The below diagram shows the same 4-node cluster with a single storage pool and four containers with different data protection and data reduction settings to suit a wide range of application requirements.

Note: The RF3 container shown below would only be possible in clusters of 5 nodes or more, but is shown to illustrate the flexibility/capabilities of NDSF.

[Image: Heterogeneous cluster capacity showing a single storage pool with four containers]

The storage pool itself has up to 20TB usable (assuming RF2 and excluding data reduction savings). In the Pool we can see four Containers, which can be thought of as policies applied to Virtual Machines or Virtual Disks.

Container01 is configured with RF2 and in-line compression and reports 10TB free space, as the underlying storage pool (where capacity is managed) is 50% utilised. The Container therefore reports as free space all the available capacity within the Storage Pool based on its configured RF.

Container02 has RF2, in-line compression and EC-X enabled, but you will note it also reports 10TB free space: capacity is not assigned to a container, it is shared between all containers within a Storage Pool.

Container03 is configured with RF3, which differs from Containers 01 and 02. As such, the container reports free space based on its configured RF of 3, showing 13.3TB usable and 6.66TB free, as that is the maximum data that can be supported in that container based on its storage policies.

Container04 reports the same free space as Containers 01 and 02, as it is configured with the same RF. While Container04 has all data reduction technologies enabled, the Container reports actual free space; as data reduction takes effect, the usable capacity will change.
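The free space each container reports can be reasoned about with some simple arithmetic (an illustrative sketch based on the figures above; the real calculation also accounts for things like reservations and data reduction savings):

```python
# Shared storage pool from the example: 20TB usable assuming RF2, 50% used.
pool_usable_rf2_tb = 20
raw_capacity_tb = pool_usable_rf2_tb * 2        # 40TB of physical capacity behind the pool
raw_used_tb = (pool_usable_rf2_tb * 0.50) * 2   # 10TB logical used at RF2 = 20TB raw
raw_free_tb = raw_capacity_tb - raw_used_tb     # 20TB of physical capacity free

def container_free_tb(rf: int) -> float:
    """Free space a container reports: the shared physical free space divided by its RF."""
    return raw_free_tb / rf

print(container_free_tb(2))   # Containers 01, 02 and 04 -> 10.0TB free
print(container_free_tb(3))   # Container 03             -> ~6.66TB free
print(raw_capacity_tb / 3)    # Container 03 usable      -> ~13.3TB
```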

It is possible to set capacity reservations on Containers where an application or tenant requires a guarantee of the usable capacity available, and it is also possible to set limits on containers to prevent workloads using more than a specified amount of capacity. However, for most use cases, I recommend not using reservations or limits and simply managing capacity at the Storage Pool layer.

Nutanix also supports VMs with more assigned/used capacity than the node they are running on, for more information see: What if my VMs storage exceeds the capacity of a Nutanix node?

Regardless of what node type/s reside within a Nutanix cluster, there are no advanced settings required to be configured, such as queue depths, VAAI or multi-pathing, which can be required when mixing legacy storage platforms in the same cluster. There is also no requirement for Storage DRS to manage either performance or capacity, as discussed earlier.

Advantage 4: No silos of storage capacity, all capacity is shared in the storage pool

Advantage 5: Storage policies such as RF and data reduction can be changed on the fly as required, and multiple policies are supported within the same cluster.

For more information about Nutanix data reduction technologies, see: Nutanix Implementation of Data Avoidance & Reduction Technologies

Regardless of the mixture of node types and their respective capacity/performance characteristics, there is no advanced configuration required to achieve optimal performance.

Nutanix automatically manages I/O pathing. Because data locality ensures most data is read locally, and writes always land local to the VM with replicas then distributed throughout the cluster, the chance of hot spots is minimised by default.

In the unlikely event one node's local SSD tier becomes saturated, NDSF will automatically write data across the shared SSD tier until the local node's SSD tier has sufficient capacity to resume local writes. This avoids the requirement for a storage admin to take any corrective action.
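Conceptually, the write placement behaviour described above looks something like the following (a simplified sketch for illustration only, not actual NDSF code; the Node type, field names and selection logic are assumptions):

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    ssd_free_gb: float   # free capacity in the node's local SSD tier

def place_write(write_size_gb: float, local_node: Node, cluster: list[Node], rf: int = 2):
    """Write the primary copy as close to the VM as possible, then distribute replicas."""
    if local_node.ssd_free_gb >= write_size_gb:
        primary = local_node   # normal case: new data is written locally
    else:
        # Local SSD tier saturated: redirect the write to the shared SSD tier
        # (another node) until ILM frees up local capacity.
        primary = max(cluster, key=lambda n: n.ssd_free_gb)

    # The remaining RF-1 replicas are distributed across other nodes,
    # spreading the write load across the cluster and avoiding hotspots.
    others = [n for n in cluster if n is not primary]
    replicas = sorted(others, key=lambda n: n.ssd_free_gb, reverse=True)[: rf - 1]
    return [primary] + replicas
```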

Advantage 6: In the unlikely event of saturation of a node's SSD tier, NDSF automatically redirects new I/O until ILM (tiering) can free up capacity within the local tier.

NDSF natively distributes writes throughout all nodes within the cluster. This means all nodes within heterogeneous clusters increase the capacity, performance and resiliency of the entire cluster.

To increase the performance of a single VM you have numerous options; the simplest is to migrate it (vMotion for ESXi, Live Migration for Hyper-V or Migrate for AHV) to a node with higher-spec physical processors, more SSD drives and/or more SATA spindles.

There is no requirement to Storage vMotion, or relocate the VM to a new Datastore/Container. NDSF manages the storage layer automatically and will localize hot data if/when required.

Advantage 7: All nodes contribute to the capacity, performance and resiliency of the cluster

Heterogeneous clusters are managed by a single HTML5 GUI called Prism. There is no need to access multiple management interfaces for different storage types.

Advantage 8: Heterogeneous clusters are managed via a single HTML5 GUI.

Nutanix also supports Pin to SSD which allows workloads requiring all flash to reside within a hybrid (SSD+SATA) cluster and be guaranteed all flash performance.

VMs or Virtual Disks can also be marked to be stored solely in Flash on the fly if/when required and vice versa.

Advantage 9: No silos required for workloads requiring All Flash performance

Nutanix eliminates the complexity of managing performance at a datastore layer. Nutanix supports up to the chosen hypervisor's limits, e.g. the vSphere HA limit of 2048 VMs per datastore. As all controllers within a cluster actively service all datastores (Containers), performance isn’t constrained at the datastore layer as it is with traditional storage products.

For more information see: Unlimited VMs per datastore? Its not a myth with Nutanix!

Advantage 10: No performance concerns/constraints at the datastore level

What about Considerations for Heterogeneous Clusters?

From a performance perspective, always ensure your N+x (e.g. N+1, N+2) node/s are sized >= the largest node in the cluster, so that in the event of a node failure, workloads benefiting from higher-performance nodes can fail over to equivalent nodes.

From a capacity perspective, for NDSF to be able to restore the configured RF (RF2 or RF3) in the event of a node failure, sufficient capacity must exist within the storage pool. As such, when using high-capacity nodes such as NX-8035s, NX-8150s or NX-6035C storage-only nodes, ensure the free capacity within the storage pool is >= the capacity of the largest node.
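A quick way to sanity-check this (an illustrative sketch using the example node capacities from earlier, not a Nutanix sizing tool):

```python
# Usable capacity per node in TB (example figures) and current pool usage.
node_usable_tb = {"NX-3060-1": 2, "NX-3060-2": 2, "NX-8035-1": 8, "NX-8035-2": 8}
pool_used_tb = 10

pool_capacity_tb = sum(node_usable_tb.values())
largest_node_tb = max(node_usable_tb.values())
pool_free_tb = pool_capacity_tb - pool_used_tb

# To restore the configured RF after losing any single node, the pool needs
# at least the largest node's worth of free capacity.
can_self_heal = pool_free_tb >= largest_node_tb
print(f"Free: {pool_free_tb}TB, largest node: {largest_node_tb}TB, "
      f"can restore RF after a node failure: {can_self_heal}")
```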

Advantage 11: Performance and availability sizing for heterogeneous clusters is simple.

Another consideration is for mission-critical or high-I/O applications: spread these evenly across the nodes and ideally ensure the active working set fits within the local SSD tier. Doing so will maximise performance, but in the event a very large workload cannot fit within the local SSD tier, its data will reside within the shared SSD tier and be actively serviced by multiple Controller VMs.
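The check itself is simple (hypothetical figures, shown purely to illustrate the sizing rule of thumb):

```python
# Does the application's active working set fit in the node's local SSD tier?
local_ssd_capacity_tb = 1.92   # e.g. 2 x 960GB SSDs in the local node (assumed)
active_working_set_tb = 1.50   # estimated hot data for the workload (assumed)

if active_working_set_tb <= local_ssd_capacity_tb:
    print("Working set fits locally: served from the local SSD tier")
else:
    print("Working set spills into the shared SSD tier and is serviced by multiple CVMs")
```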

For more information about sizing see:  Rule of Thumb: Sizing for Storage Performance in the new world.

Advantage 12: The NDSF shared SSD tier ensures that, in the event a workload exceeds the local SSD capacity, the application still enjoys all-flash performance as data is distributed intelligently across the cluster.

Over time, when adding new nodes, VMs can be quickly and easily migrated to newer, higher performance/capacity nodes without any preparation. The VMs will immediately benefit from the newer node's CPU, RAM and storage performance, even if most of their data is still stored on older node types.

Older nodes can be non-disruptively removed once they reach end of life, again without any preparation or administrator intervention.

Advantage 13: Workloads on NDSF benefit from newer generation nodes immediately without complex design/migration activities.

Summary:

  • Nutanix supports and recommends heterogeneous clusters
  • No complexity with multi-pathing, it’s optimal out of the box
  • No custom per datastore configuration
  • VAAI just works, no advanced configuration required due to mixed node types
  • No compromise required to mix node types
  • No silos of storage capacity, all capacity is shared in the storage pool
  • All nodes contribute to performance of the cluster
  • No balancing VMs across datastores/storage devices is required to improve performance/resiliency
  • NDSF disk balancing ensures the data is evenly distributed throughout the cluster helping avoid hotspots
  • The distribution of RF traffic throughout the cluster also helps avoid hotspots
  • No silos required for workloads requiring all flash performance
  • NDSF ensures VMs can immediately benefit from the addition of newer generation node types
  • Nodes can be added/removed without system administrator performing data migrations

Enterprise Architecture & Avoiding Tunnel Vision

Recently I have read a number of articles and had several conversations with architects and engineers across various specialities in the industry, and I’m finding there is a growing trend of SMEs (Subject Matter Experts) having tunnel vision when it comes to architecting solutions for their customers.

What I mean by “tunnel vision” is that the architect only looks at what is right in front of him/her (e.g. the current task/project) and does not consider how the decisions being made for this task may impact the wider IT infrastructure and the customer from a commercial/operational perspective.

In my previous role I saw this all too often, and it was frustrating to know that the solutions being designed and delivered to customers were in some cases quite well designed when considered in isolation, but when taking into account the “big picture” (or what I would describe as the customer’s overall requirements), the solutions were adding unnecessary complexity, adding risk and increasing costs, when new solutions should be doing the exact opposite.

Let’s start with an example:

Customer “ACME” needs an enterprise messaging solution, has chosen Microsoft Exchange 2013, and requires that there be no single point of failure in the environment.

The customer engages an Exchange SME, who looks at the requirements for Exchange, points to a vendor best practice or reference architecture document and says, “We’ll deploy Exchange on physical hardware, with JBOD and no shared storage, and use Exchange Database Availability Groups for HA.”

The SME then attempts to justify his recommendation with “because it’s Microsoft’s best practice”, which most people still seem to blindly accept, but that is a story for another post.

In fairness to the SME, in isolation the decision/recommendation meets the customer’s messaging requirements, so what’s the problem?

If the customer had no existing IT, the messaging system was going to be the only IT infrastructure, and there were no plans to run any other workloads, I would say the proposed solution could be an excellent one. But how many customers only run messaging? In my experience, none.

So let’s consider a customer that has an existing virtual environment running Test/Dev, Production and Business Critical applications, and adheres to a “Virtual First” policy.

The customer has already invested in virtualization and some form of shared storage (SAN/NAS/Web Scale) and has operational procedures and expertise in supporting and maintaining this environment.

If we were to add a new “silo” of physical servers, there are many disadvantages to the customer, including but not limited to:

1. Additional operational documentation for new Physical environment.

2. New Backup & Disaster Recovery strategy / documentation.

3. Additional complexity managing / supporting a new Silo of infrastructure.

4. Reduced flexibility / scalability with physical servers vs virtual machines.

5. Increased downtime and/or impact in the event of hardware failures.

6. Increased CAPEX due to having to size for future requirements up front, given the scaling challenges of physical servers.

So what am I getting at?

The cost of deploying the MS Exchange solution on physical hardware could potentially be cheaper (CAPEX) on Day 1 than virtualizing the new workload on the existing infrastructure (which likely needs to be scaled, e.g. disk shelves / nodes), BUT it would likely result in a higher overall TCO (Total Cost of Ownership) due to the increased complexity and operational costs of creating a new silo of resources.

Both a physical and a virtual solution would likely meet or exceed the customer’s basic requirement to serve MS Exchange, but they may have vastly different results in terms of the big picture.

Another example would be a customer with a legacy SAN which needs to be replaced and is causing issues for a large portion of the customer’s workloads, where the project being proposed is only to address the new enterprise messaging requirements. In my opinion, a good architect should consider the big picture and try to identify where projects can be combined (or a project’s scope increased) to ensure a more cost-effective and better overall result for the customer.

If the architect only looked at Exchange and went with physical servers and JBOD, there would be zero chance of improvement for the rest of the infrastructure, and the physical equipment for Exchange would likely be oversized and underutilized.

In many cases it will be much more economical to combine two or more projects to enable the purchase of new technology or infrastructure components and consolidate the workloads onto shared infrastructure, rather than building two or more silos which add complexity to the environment and will likely result in underutilized infrastructure and a solution inferior to what could have been achieved by combining the projects.

In conclusion, I hope that after reading this article, the next time you or your customers embark on a new project, you as the Architect, Project Manager or Engineer consider the big picture and not just the new requirement, ensuring your customer/s get the best technical and business outcomes while avoiding silos wherever possible.