Expanding Capacity on a Nutanix environment – Design Decisions

I recently saw an article about design decisions around expanding capacity for an HCI platform, which went through the various considerations and made some recommendations on how to proceed in different situations.

While reading the article, it really made me think how much simpler this process is with Nutanix and how these types of areas are commonly overlooked when choosing a platform.

Let’s start with a few basics:

The Nutanix Acropolis Distributed Storage Fabric (ADSF) is made up of all the drives (SSD/SAS/SATA etc) in all nodes in the cluster. Data is written locally where the VM performing the write resides, and replicas are distributed throughout the cluster based on numerous factors, i.e. there is no pairing, no HA pairs, no preferred nodes etc.

In the event of a drive failure, regardless of what drive (SSD,SAS,SATA) fails, only that drive is impacted, not a disk group or RAID pack.

This is key as it limits the impact of the failure.

It is important to note that ADSF does not store large objects, nor does the file system require tuning to stripe data across multiple drives/nodes. ADSF by default distributes the data (at a 1MB granularity) in the most efficient manner throughout the cluster while maintaining the hottest data locally to ensure the lowest overheads and highest performance read I/O.

Let’s go through a few scenarios, which apply to both All Flash and Hybrid environments.

  1. Expanding capacity

    When adding a node or nodes to an existing cluster, without moving any VMs, changing any configuration or making any design decisions, ADSF will proactively send replicas from write I/O to all nodes within the cluster, therefore improving performance, while reactively performing disk balancing where a significant imbalance exists within the cluster.

    This might sound odd, but with other HCI products new nodes are not used unless you change the stripe configuration or create new objects (e.g. VMDKs), which means you can have lots of spare capacity in your cluster but still experience an out of space condition.

    This is a great example of why ADSF has a major advantage especially when considering environments with large IO and/or capacity requirements.

    The node addition process only requires the administrator to enter the IP addresses, and it’s basically a one-click operation: capacity is available immediately and there is no mass movement of data. There is also no need to move data off and recreate disk groups or similar, as these legacy concepts & complexities do not exist in ADSF.

    Nutanix is also the only platform to allow expanding of capacity via Storage Only nodes and supports VMs which have larger capacity requirements than a single node can provide. Both are supported out of the box with zero configuration required.

    Interestingly, adding storage only nodes also increases performance and resiliency for the entire cluster, as well as for the management stack, including PRISM.

  2. Impact & implications to data reduction of adding new nodes

    With ADSF, there are no considerations or implications. Data reduction is truly global throughout the cluster, and regardless of hypervisor or whether you’re adding Compute+Storage or Storage Only nodes, the environment continues to benefit from data reduction, particularly deduplication.

    The net effect of adding more nodes is better performance, higher resiliency, faster rebuilds from drive/node failures and again with global deduplication, a higher chance of duplicate data being found and not stored unnecessarily on physical storage resulting in a better deduplication ratio.

    No matter what size node/s are added & no matter what Hypervisor, the benefits from data reduction features such as deduplication and compression work at a global level.

    What about Erasure Coding? Nutanix EC-X creates the most efficient stripe based on the cluster size. If you start with a small 4 node cluster, your stripe would be 2+1; if you expand the cluster to 5 nodes, the stripe will automatically become 3+1; and if you expand further to 6 nodes or more, the stripe will become 4+1, which is currently the largest stripe supported.

  3. Drive Failures

    In the event of a drive failure (SSD, SAS or SATA), as mentioned earlier, only that drive is impacted. Therefore, to restore resiliency, only the data on that drive needs to be repaired, as opposed to something like an entire disk group being marked as offline.

    It’s crazy to think a single commodity drive failure in an HCI product could bring down an entire group of drives, causing a significant impact to the environment.

    With Nutanix, a rebuild is performed in a distributed manner throughout all nodes in the cluster, so the larger the cluster, the lower the per node impact and the faster the cluster is restored to its configured resiliency factor.
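
As a rough illustration of the EC-X stripe sizes described in scenario 2 above, here is a minimal Python sketch. The function names (ecx_stripe, storage_overhead) are hypothetical and are not part of any Nutanix API. The mapping simply encodes the cluster sizes quoted in this post, and clusters smaller than 4 nodes are rejected because this post does not cover them.

```python
from typing import Tuple


def ecx_stripe(cluster_nodes: int) -> Tuple[int, int]:
    """Return (data_blocks, parity_blocks) for a cluster size.

    Hypothetical helper: it only encodes the sizes quoted in this post
    (4 nodes -> 2+1, 5 nodes -> 3+1, 6 or more nodes -> 4+1).
    """
    if cluster_nodes < 4:
        # Assumption: smaller clusters are outside what this post describes.
        raise ValueError("this sketch only covers clusters of 4 nodes or more")
    if cluster_nodes == 4:
        return (2, 1)
    if cluster_nodes == 5:
        return (3, 1)
    return (4, 1)  # largest stripe mentioned in the post


def storage_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Physical space consumed per unit of logical data for a stripe."""
    return (data_blocks + parity_blocks) / data_blocks


if __name__ == "__main__":
    for nodes in (4, 5, 6, 12):
        data, parity = ecx_stripe(nodes)
        print(f"{nodes} nodes: {data}+{parity} stripe, "
              f"{storage_overhead(data, parity):.2f}x overhead")
```

The overhead figure is plain arithmetic on the stripe width: a 2+1 stripe stores 3 blocks for every 2 blocks of data (1.5x), while a 4+1 stripe stores 5 for every 4 (1.25x), which is why the wider stripe on a larger cluster is more space efficient.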

At this point you’re probably asking: are there any decisions to make?

When adding any node, compute+storage or storage only, ensure you consider what the impact of a failure of that node will be.

For example, if you add one 15TB storage only node to a cluster of nodes which are only 2TB usable, then you would need to ensure 15TB of available space to allow the cluster to fully self heal from the loss of the 15TB node. As such, I recommend ensuring your N+1 (or N+2) node/s are equal to the size of the largest node in the cluster from a capacity, performance and CPU/RAM perspective.

So if your biggest node is an NX-8150 with 44c / 512GB RAM and 20TB usable, you should have an N+1 node of the same size to cover the worst case failure scenario of an NX-8150 failing OR have the equivalent resources available within the cluster.

By following this one simple rule, your cluster will always be able to fully self heal in the event of a failure, and VMs will fail over and be able to perform at comparable levels to before the failure.

Simple as that! No RAID, Disk group, deduplication, compression, failure, or rebuild considerations to worry about.
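
The sizing rule above can be sketched in a few lines of Python. This is a simplified illustration only: the function names are hypothetical, capacities are treated as usable TB per node, used_tb is assumed to be the total space consumed across the cluster (including replicas), and CPU/RAM and performance headroom are not modelled.

```python
from typing import List


def worst_case_reserve_tb(node_capacities_tb: List[float], failures: int = 1) -> float:
    """Free space to keep available so the cluster can heal after losing
    its largest `failures` nodes while they are full (N+1 or N+2)."""
    if not 0 < failures < len(node_capacities_tb):
        raise ValueError("failures must be at least 1 and smaller than the node count")
    largest_first = sorted(node_capacities_tb, reverse=True)
    return sum(largest_first[:failures])


def can_self_heal(node_capacities_tb: List[float], used_tb: float, failures: int = 1) -> bool:
    """True if the data currently stored still fits on the surviving nodes."""
    reserve = worst_case_reserve_tb(node_capacities_tb, failures)
    surviving_capacity = sum(node_capacities_tb) - reserve
    return used_tb <= surviving_capacity


if __name__ == "__main__":
    # The example from the text: 2TB usable nodes plus one 15TB storage only node.
    # Four 2TB nodes is an assumption; the post does not say how many there are.
    cluster = [2, 2, 2, 2, 15]
    print("Reserve for N+1:", worst_case_reserve_tb(cluster), "TB")
    print("Heals with 10TB used?", can_self_heal(cluster, used_tb=10))
    print("Heals with 8TB used?", can_self_heal(cluster, used_tb=8))
```

With this example cluster, the surviving nodes only provide 8TB between them, so anything beyond 8TB of consumed space cannot be fully re-protected after the 15TB node is lost. That is the situation the rule is designed to avoid.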

Summary:

The above are just a few examples of the advantages the Nutanix ADSF provides compared to other HCI products. The operational and architectural complexity of other products can lead to additional risk, inefficient use of infrastructure, misconfiguration and ultimately an environment which does not deliver the business outcome it was originally designed to deliver.

What’s .NEXT 2016 – All Flash Everywhere!

I am pleased to say Nutanix and our OEMs are now offering even more flexibility with our “Configure To Order” option (a.k.a CTO) by allowing any node type, yes ANY node type to be configured with all flash.

Why is this so cool? Well, Nutanix and our OEMs (Dell XC & Lenovo HX) have a wide range of models which customers can choose from, and for customers who require a large usable capacity of high performance storage, this is a simple way to get a pre-certified solution with all the flexibility of build your own without the risks.

With this increased level of flexibility, the argument for BYO/HCL is all but moot in my opinion.

So let’s think about what this means.

The NX-8150, a 1 node per 2RU product (which I was heavily involved in the design of) will now support 24 x SSDs!

Even with the currently supported SSDs (1.92TB each), this would mean >46TB of RAW SSD capacity along with dual Broadwell CPUs and up to 768GB RAM.

Note: Higher capacity SSDs are coming soon to provide even more capacity!
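
For anyone who wants to check the raw capacity figure, or plug in larger drives as they become available, the arithmetic is trivial. A minimal Python sketch follows; the function name is hypothetical, and this is raw capacity only, not usable capacity after replication or data reduction.

```python
def raw_capacity_tb(drive_count: int, drive_size_tb: float) -> float:
    """Raw (pre-replication, pre-data-reduction) capacity of one node in TB."""
    return drive_count * drive_size_tb


if __name__ == "__main__":
    # 24 x 1.92TB SSDs, as quoted for the all flash NX-8150.
    print(f"{raw_capacity_tb(24, 1.92):.2f} TB raw")
```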

Now with 24 x SSDs that is some serious power!

What’s also exciting is this doesn’t just mean higher flash capacity, it also means higher performance. This is because the Nutanix persistent write buffer (OpLog) is striped across all SSDs in a node, which means write performance can benefit from all SSDs in the node. In the case of the NX-8150, that’s 24 drives!

Combine this with the fact Nutanix now supports any node as storage only, and this gives customers near unlimited flexibility without the risk/complexity of BYO/HCL options.

After all, the hardware is commodity and all the value is in the software, so who cares what HW it runs on as long as it’s reliable.

Summary:

  • Configure to Order (CTO) now allows any node type to be configured with All Flash
  • All Flash nodes can also be Storage Only nodes
  • Write Performance takes advantage of all SSDs in a node
  • Nutanix Configure to Order (CTO) option makes the argument for BYO/HCL options all but moot.

Nutanix Acropolis Hypervisor (AHV) certified for 30k Microsoft Exchange Mailboxes

Last year Nutanix announced we had successfully completed the Microsoft Exchange Solution Reviewed Program (ESRP) certification for Hyper-V. Now I am pleased to announce we have continued our focus on giving customers the choice to deploy business critical applications on any hypervisor, and have achieved ESRP for our Acropolis Hypervisor (AHV).

I believe Acropolis Hypervisor (AHV) and the Nutanix platform are a great choice for business critical applications such as MS Exchange, as they give all the benefits of virtualization without the complexity of legacy hypervisors and management platforms.

For more information on the advantages of AHV specifically for MS Exchange see: MS Exchange on Nutanix Acropolis Hypervisor (AHV).

The Nutanix listing on the Microsoft Exchange Solution Reviewed Program, covering both Hyper-V and AHV, can be found at the following URL.

Exchange Solution Reviewed Program (ESRP) – Storage

The Nutanix Best Practice guide for MS Exchange on AHV is also due for release shortly, so stay tuned!

Related Articles:

1. Think HCI is not an ideal way to run your mission-critical x86 workloads? Think again!

2. Jetstress Testing with Intelligent Tiered Storage Platforms

3. Microsoft Exchange 2013/2016 Jetstress Performance Testing on Nutanix Acropolis Hypervisor (AHV)

4. Peak performance vs Real World – Exchange on Nutanix Acropolis Hypervisor (AHV)