Nutanix Resiliency – Part 7 – Read & Write I/O during Hypervisor upgrades

If you haven’t already reviewed Parts 1 through 4, please do so as they cover critical resiliency factors around speed of recovery from failures, increasing resiliency on the fly by converting from RF2 to RF3, as well as using Erasure Coding (EC-X) to save capacity while providing the same resiliency level.

Parts 5 and 6 covered how Read and Write I/O function during CVM maintenance or failure and in this Part 7 of the series, we look at what impact hypervisor (ESXi, Hyper-V, XenServer and AHV) upgrades have on Read and Write I/O.

This post will refer to Parts 5 and 6 heavily so they are required reading to fully understand this post.

As covered in Part 5 and 6, no matter what the situation with the CVM, read and write I/O continues to be served and data remains in compliance with the configured Resiliency Factor.

In the event of a hypervisor upgrade, Virtual Machines are first migrated off the node and continue normal operations. In the event of a hypervisor failure, the Virtual Machines would be restarted by HA and then resume normal operations.

Whether it be a hypervisor (or node) failure or hypervisor upgrade, ultimately both scenarios result in the VM/s running on a new node and the original node (Node 1 in the diagram below) being offline with the data on its local drives unavailable for a period of time.

HostFailureHypervisorUpgradeWriteIO

Now how does Read I/O work in this scenario? The same way as described in Part 5: reads are serviced remotely or, if the second replica happens to be on the node the Virtual Machine migrated to (or was restarted on by HA), the read is serviced locally. If a remote read occurs, the 1MB extent is localised to ensure subsequent reads are local.
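The read behaviour described above can be sketched as a toy model. This is an illustration only, not Nutanix code: the `Cluster` class, the `read_extent` method and the node and extent names are all hypothetical, and localisation is simplified to adding the VM's node to the set of replica holders.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    # extent id -> set of ONLINE nodes holding a replica (hypothetical model)
    replicas: dict = field(default_factory=dict)

    def read_extent(self, extent_id, vm_node):
        """Return 'local' or 'remote'; localise the extent after a remote read."""
        holders = self.replicas[extent_id]
        if vm_node in holders:
            return "local"
        if not holders:
            raise RuntimeError("no online replica available")
        # Remote read: served by another node, then a copy is kept where the VM runs
        # (simplification of the 1MB extent localisation described in the post).
        holders.add(vm_node)
        return "remote"


# Node 1 is offline for its hypervisor upgrade, so only surviving replicas are listed.
cluster = Cluster(replicas={"extent-a": {"node2"}, "extent-b": {"node3"}})

# The VM has migrated from node1 to node2.
first = cluster.read_extent("extent-b", vm_node="node2")   # replica is only on node3
second = cluster.read_extent("extent-b", vm_node="node2")  # extent is now local
print(first, second)
```

In this sketch the first read of `extent-b` is remote and the repeat read is local, while `extent-a` is local from the start because its surviving replica already sits on the node the VM moved to.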

How about Write I/O? Again as per Part 6, all writes remain in compliance with the configured Resiliency Factor, whether the event is a hypervisor upgrade or a CVM, hypervisor, node, network, disk or SSD failure. One replica is written to the local node and the subsequent one or two replica/s are distributed throughout the cluster based on the cluster’s current performance and capacity per node.
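As a rough sketch of that placement rule (one replica local, the rest spread across the cluster), the following toy function picks replica targets. The `place_replicas` function, the `online`, `free_gb` and `load` fields and the ranking by load then free capacity are assumptions made for illustration; the actual ADSF selection logic is not described in this post.

```python
def place_replicas(local_node, nodes, rf):
    """Pick rf nodes: the VM's local node first, then the best-ranked online peers.

    `nodes` maps node name -> dict with 'online', 'free_gb' and 'load' keys
    (hypothetical fields for this sketch).
    """
    if not nodes[local_node]["online"]:
        raise ValueError("the VM must be running on an online node")
    peers = [n for n, info in nodes.items() if info["online"] and n != local_node]
    if len(peers) < rf - 1:
        raise RuntimeError("not enough online nodes to satisfy the Resiliency Factor")
    # Assumed ranking: lower load first, then more free capacity.
    peers.sort(key=lambda n: (nodes[n]["load"], -nodes[n]["free_gb"]))
    return [local_node] + peers[: rf - 1]


nodes = {
    "node1": {"online": False, "free_gb": 900, "load": 0.0},  # being upgraded
    "node2": {"online": True, "free_gb": 400, "load": 0.6},   # VM now runs here
    "node3": {"online": True, "free_gb": 700, "load": 0.2},
    "node4": {"online": True, "free_gb": 500, "load": 0.4},
}

rf2 = place_replicas("node2", nodes, rf=2)
rf3 = place_replicas("node2", nodes, rf=3)
print(rf2, rf3)
```

The offline node is never a candidate, so every write still lands on the full number of replicas while Node 1 is down.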

It really is that simple, and this level of resiliency is achieved only thanks to the Acropolis Distributed Storage Fabric.

Summary:

  1. A hypervisor failure never impacts the write path of ADSF
  2. Data integrity is ALWAYS maintained even in the event of a hypervisor (node) failure
  3. A hypervisor upgrade is completed without disruption to the read/write path
  4. Reads continue to be served either locally or remotely regardless of upgrades, maintenance or failure
  5. During hypervisor failures, Data Locality is maintained with writes always keeping one copy locally where the VM resides for optimal read/write performance during upgrades/failure scenarios.

Index:
Part 1 – Node failure rebuild performance
Part 2 – Converting from RF2 to RF3
Part 3 – Node failure rebuild performance with RF3
Part 4 – Converting RF3 to Erasure Coding (EC-X)
Part 5 – Read I/O during CVM maintenance or failures
Part 6 – Write I/O during CVM maintenance or failures
Part 7 – Read & Write I/O during Hypervisor upgrades
Part 8 – Node failure rebuild performance with RF3 & Erasure Coding (EC-X)
Part 9 – Self healing
Part 10 – Disk Scrubbing / Checksums

Nutanix X-Ray Benchmarking tool – Introduction

I’ve been excited to write about X-ray for a while now, but I’ve not had the time. But the opportunity has presented itself where I could kill two birds with one stone and do some performance comparisons between Nutanix AHV Turbo Mode and other platforms on the same underlying hardware, so what better time to review X-ray as part of this process.

So for those of you who have not heard of X-Ray, it wouldn’t be unreasonable to assume it’s just another benchmarking tool to further muddy the waters when comparing different platforms.

However X-Ray takes a different approach, to quote Paul Updike who is part of Nutanix Technical Marketing Engineering:

Normally performance is your test variable and you measure the effect on the system. X-ray is upside down, performance of an app in a VM is the control and our test variable is the system. We measure the effect on the control.

So if all you want is “hero numbers” you’ve come to the wrong place. Although X-Ray does have a peak performance micro-benchmark test built in, it’s far from real world in comparison to the other tests within X-Ray.

It is recommended to run the X-Ray virtual appliance on a cluster which is not the target for the testing, such as a management cluster. But for those environments where this additional hardware may not be available, it can also be deployed on VirtualBox or VMware Workstation on your PC or laptop.

Also if you have an Intel NUC, you could deploy Nutanix Community Edition (CE) and run X-Ray on CE which is based on AHV.

In addition to the different approach X-ray takes to benchmarking, I like that X-ray performs fully automated testing across multiple hypervisors including ESXi, AHV as well as different underlying storage. This helps ensure consistent and fair comparisons between platforms, or even comparisons between Nutanix node types if you decide to compare model types before making a purchasing decision.

X-ray has several built in tests which are focused not just on outright performance, but on how a system functions and performs during node failure/s, with snapshots as well as during rolling upgrades.

The reason Nutanix took this approach is because it is much more real world than simply firing up Iometer with lots of outstanding I/O and a 100% random 4k read workload. In the real world, customers perform upgrades (hopefully regularly to take advantage of new functionality and performance!), hardware does fail when we can least afford it, and using space efficient snapshots as part of an overall backup strategy makes a lot of sense.

Now let’s take a look at the X-Ray interface starting with an overview:

XrayOverview

X-Ray is designed to be similar to PRISM to keep that great Nutanix look and feel. The tool is very simple to use with three sections being Tests, Analyses and Targets.

Getting started is quick and easy: just open the “Targets” view (shown below) and select “New Target”.

XrayTargets

In the “Create Target” popup, you simply provide a name for the target, e.g.: “Nutanix NX-3460 Cluster AHV”, and select the Manager type, being either vCenter for ESXi environments or PRISM for AHV.

Then select the cluster type, being Nutanix (i.e.: A Nutanix NX, Dell XC, Lenovo HX or HPE/Cisco software only) OR “Non-Nutanix” which is for comparisons with platforms not running Nutanix AOS such as VMware vSAN.

XrayCreateTarget

For VMware environments, you then provide the vCenter details and regardless of the hardware type or platform, you supply the out of band management (e.g.: IPMI) details. The out of band management details allow X-ray to perform simulated hardware failure tests which are critical to any product evaluation and pre-production operational verification testing.

X-Ray then allows you to select the cluster, container (or datastore) and networking (e.g.: Port Group) to be used for the testing.

XrayCreateTarget_Cluster

X-ray then discovers the nodes (e.g.: ESXi Hosts) and allows you to add nodes and confirm the IPMI type to ensure maximum compatibility.

XrayCreateTarget_Node

Now hit “Save” and you’re good to go! Pretty simple right?

Now to run a test, simply click the test you want to run and select “Add to Queue”.

Xray_RunTestVDISim

The beauty of this is X-ray allows you to queue as many tests as you want and leave the system to run the tests, say overnight or over a weekend without requiring you to monitor them and start tests one by one.

In between tests the target systems are cleaned up (i.e.: data and VMs deleted) to ensure consistent / fair results even when running test packages one after another.
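The queue-then-clean-up behaviour can be illustrated with a short sketch. This is not X-Ray's implementation: `run_queue`, `fake_run` and `fake_cleanup` are hypothetical names, and the two scenario names are simply taken from this series of posts.

```python
from collections import deque


def run_queue(tests, run_test, cleanup):
    """Run queued tests in order, cleaning the target after each one."""
    queue = deque(tests)
    results = {}
    while queue:
        name = queue.popleft()
        try:
            results[name] = run_test(name)
        finally:
            # Always remove test VMs and data so the next test starts clean.
            cleanup(name)
    return results


log = []


def fake_run(name):
    # Stand-in for actually executing a scenario against the target.
    log.append(f"run:{name}")
    return "passed"


def fake_cleanup(name):
    # Stand-in for deleting the VMs and data a scenario created.
    log.append(f"cleanup:{name}")


results = run_queue(["VDI Simulator", "Snapshot Impact"], fake_run, fake_cleanup)
print(log)
```

Putting the clean-up in a `finally` block means a failed test still leaves the target clean for whatever is next in the queue.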

Once a test has been run, you can view the results in the X-Ray GUI (as shown below):

XrayTestsOverview

You can also generate a PDF report for individual tests or perform analysis between two tests, including tests of different platforms:

XrayAnalyses

The above results show an overlay between two platforms, the first being AHV (although it’s incorrectly named Turbo mode when it was run using non Turbo mode AOS version 5.1.1). As we can see, AHV even without turbo mode was more consistent than the other platform.

To create a PDF report, simply use the “Actions” drop down menu and select “Create Report”.

XrayCreateReport

The report covers details about X-Ray, the Target cluster/s, the scenario being tested and the test results.

XrayTOCReport

It will show simple results such as if the test passed (i.e.: Completed the required tasks) and things like test duration as shown below:

XrayReportTargetOverview

X-Ray also provides built-in tests for mixed workloads, which is much more realistic than testing peak performance for point (or siloed) solutions, which are becoming more and more rare these days.

XrayMixedWorkloads

X-Ray’s built in tests are also auto scaling based on the cluster size of the target and allow tuning of the scenario. For example, in the VDI simulator scenario, Task, Knowledge or Power Users can be selected.

XRayVDISimulator

Summary:

X-Ray provides a tool which is free of charge, multi-hypervisor, multi-platform (including non-HCI) which is easy to use for proof of concepts, product comparisons as well as real world, operational verification.

I am working with the X-ray team to develop new built in test scenarios to simulate real world scenarios for business critical applications as well as to allow customers and 3rd parties to validate the benefits of functionality such as data locality.

The following is a series of posts covering Nutanix AHV Turbo Mode performance/functionality comparisons with other products.

Nutanix X-Ray Benchmarking tool Part 2 – Snapshot Impact Scenario

Nutanix X-Ray Benchmarking tool Part 3 – Extended Node Failure Scenario

NOS & Hypervisor Upgrade Resiliency in PRISM

I have had several prospective and existing customers say how much they like the One Click upgrade PRISM provides for NOS, Hypervisors, Firmware and NCC. These customers typically also ask what happens if they perform a One Click upgrade while the cluster is for any reason degraded, such as from a drive, node or block failure.

Before starting a One Click upgrade, NOS always performs Pre-Upgrade checks to ensure the cluster is healthy. In the event the cluster is not fully resilient the upgrade process will be aborted as shown below:

AcropolisUpgrade

In the above case, the cause of the cluster being “under-replicated” (meaning the configured Resiliency Factor of 2 or 3 was not in compliance) was that NOS had just been upgraded on the cluster and one of the nodes had not yet come back online when the One Click Upgrade for the Acropolis hypervisor (AHV) was started.

Other situations where the cluster may be under-replicated are following a HDD, SSD, Node or Block failure. In all these cases, the Nutanix Distributed File System (NDFS) will restore resiliency assuming sufficient rebuild capacity is available in the Storage Pool. This is why Nutanix always recommends clusters be designed with at least N+1 available capacity to ensure rebuild capacity exists and the cluster can automatically self heal.

As a general rule it is recommended to wait for approx 10 mins between NOS and Hypervisor upgrades to avoid these kinds of issues, or you can simply check the Home screen of PRISM and ensure the Health status is Good as shown below:
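The N+1 recommendation can be checked with simple arithmetic: after losing a node, the data already stored (including replicas) must still fit on the remaining nodes. The sketch below is a simplified assumption-laden check, not a Nutanix sizing tool; `can_self_heal` is a hypothetical helper, capacities are in TB, and replica placement constraints and reserved space are ignored.

```python
def can_self_heal(node_capacity_tb, used_tb):
    """True if the stored data still fits after losing the largest node.

    `used_tb` is physical usage, i.e. it already includes all replicas.
    """
    if len(node_capacity_tb) < 2:
        return False
    surviving = sum(node_capacity_tb) - max(node_capacity_tb)
    return used_tb <= surviving


four_nodes = [10, 10, 10, 10]   # 40 TB raw, 30 TB left after one node fails
ok = can_self_heal(four_nodes, used_tb=28)        # 28 <= 30
too_full = can_self_heal(four_nodes, used_tb=33)  # 33 > 30
print(ok, too_full)
```

Sizing against the largest node is the conservative choice for clusters with mixed node types.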

HealthGood

and that the Data Resiliency Status is “OK” as shown below.

DataResiliencyOk

Both the Health and Data Resiliency status are Hypervisor agnostic and appear on the Home screen of all Nutanix deployments.

If both the Health Status and Data Resiliency are good then you can go ahead and start the upgrade and it should complete successfully.
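The manual check described above amounts to a simple gate, sketched here. This is purely illustrative and does not call any Nutanix API: `pre_upgrade_check` and `UpgradeAborted` are hypothetical names, and the status strings are the values shown on the PRISM Home screen in this post.

```python
class UpgradeAborted(Exception):
    """Raised when the hypothetical pre-upgrade gate refuses to proceed."""


def pre_upgrade_check(health_status, data_resiliency_status):
    """Allow the upgrade only when both Home screen indicators are healthy."""
    problems = []
    if health_status != "Good":
        problems.append(f"Health status is {health_status!r}, expected 'Good'")
    if data_resiliency_status != "OK":
        problems.append(
            f"Data Resiliency is {data_resiliency_status!r}, expected 'OK'"
        )
    if problems:
        raise UpgradeAborted("; ".join(problems))
    return True


ready = pre_upgrade_check("Good", "OK")

try:
    pre_upgrade_check("Good", "Under-replicated")
    aborted = False
except UpgradeAborted as exc:
    aborted = True
    print(f"Upgrade aborted: {exc}")
```

Collecting every failed indicator before raising gives the operator the full picture in one message rather than one problem at a time.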

Summary:

PRISM will not start an upgrade of NOS or the Hypervisor if the cluster is degraded, so you can rest assured that even if you attempt an upgrade by accident when the cluster is degraded, NOS will protect you.

Related Posts:

1. Scaling Hyper-converged solutions – Compute only.

2. Acropolis Hypervisor (AHV) I/O Failover & Load Balancing

3. Advanced Storage Performance Monitoring with Nutanix

4. Nutanix – Improving Resiliency of Large Clusters with Erasure Coding (EC-X)

5. Nutanix – Erasure Coding (EC-X) Deep Dive

6. Acropolis: VM High Availability (HA)

7. Acropolis: Scalability