What is the performance impact & overheads of Inline Compression on Nutanix?

I’m frequently getting asked about Nutanix data reduction capabilities such as Deduplication, Erasure Coding and Compression and one of the most common questions (especially in a competitive situation) is:

“What is the performance impact and the overhead of Inline Compression on Nutanix?”

The short answer is, the pros outweigh the cons and this has been true for as long as I can remember with the Nutanix platform.

I have been testing of various applications, node types, cluster sizes and configurations and thought I would share some data on the overheads and performance impact of in-line compression which is what Nutanix (and I) recommend for most deployments including for business critical applications such as Oracle, MS SQL and MS Exchange.

In this case I was testing storage performance for MS Exchange using Jetstress.

Now without going into the exact configuration of the environment (to avoid competitors FUD), the test was simple. I created a Windows 2012 VM and configured Jetstress. I then performed 3 x 15min runs each of which completed a database checksum at the completion.

Following the 3 runs, I enabled In-line compression and repeated the same 3 tests.

The below chart is a screenshot from the Nutanix PRISM HTML 5 UI showing the Cluster wide IOPS, latency and throughput along with the Controller VM CPU utilisation.

PerformanceSummary

As we can see, the 6 performance runs are very similar across all metrics including the CVM CPU utilisation. The below table shows each run including database read latency and log write latency which are the two key performance metrics for MS Exchange Jetstress testing.

JetstressPerfwandwocompression

Note: The performance numbers above are not the peak or best performance Nutanix can deliver, they are just one of the many test scenarios I ran.

We can see the delta between the No Compression and Inline compression is almost zero. This test shows that while we all know inline data reduction has overheads on the I/O path, that does not necessarily translate into slower performance for the application.

In this case, Nutanix in-line compression is so efficient, that customers can enjoy excellent data efficiencies for applications like MS Exchange, with virtually no impact on performance or additional CPU overheads on the CVM.

Oh and all of this performance on Acropolis Hypervisor (AHV)!

Expanding Capacity on a Nutanix environment – Design Decisions

I recently saw an article about design decisions around expanding capacity for a HCI platform which went through the various considerations and made some recommendations on how to proceed in different situations.

While reading the article, it really made me think how much simpler this process is with Nutanix and how these types of areas are commonly overlooked when choosing a platform.

Let’s start with a few basics:

The Nutanix Acropolis Distributed Storage Fabric (ADSF) is made up of all the drives (SSD/SAS/SATA etc) in all nodes in the cluster. Data is written locally where the VM performing the write resides and replica’s are distributed based on numerous factors throughout the cluster. i.e.: No Pairing, HA pairs, preferred nodes etc.

In the event of a drive failure, regardless of what drive (SSD,SAS,SATA) fails, only that drive is impacted, not a disk group or RAID pack.

This is key as it limited the impact of the failure.

It is importaint to note, ADSF does not store large objects nor does the file system require tuning to stripe data across multiple drives/nodes. ADSF by default distributes the data (at a 1MB granularity) in the most efficient manner throughout the cluster while maintaining the hottest data locally to ensure the lowest overheads and highest performance read I/O.

Let’s go through a few scenarios, which apply to both All Flash and Hybrid environments.

  1. Expanding capacityWhen adding a node or nodes to an existing cluster, without moving any VMs, changing any configuration or making any design decisions, ADSF will proactively send replicas from write I/O to all nodes within the cluster, therefore improving performance while reactively performing disk balancing where a significant imbalance exists within a cluster.

    This might sound odd but with other HCI products new nodes are not used unless you change the stripe configuration or create new objects e.g.: VMDKs which means you can have lots of spare capacity in your cluster, but still experience an out of space condition.

    This is a great example of why ADSF has a major advantage especially when considering environments with large IO and/or capacity requirements.

    The node addition process only requires the administrator to enter the IP addresses and its basically a one click, capacity is available immediately and there is no mass movement of data. There is also no need to move data off and recreate disk groups or similar as these legacy concepts & complexities do not exist in ADSF.

    Nutanix is also the only platform to allow expanding of capacity via Storage Only nodes and supports VMs which have larger capacity requirements than a single node can provide. Both are supported out of the box with zero configuration required.

    Interestingly, adding storage only nodes also increases performance, resiliency for the entire cluster as well as the management stack including PRISM.

  2. Impact & implications to data reduction of adding new nodesWith ADSF, there are no considerations or implications. Data reduction is truely global throughout the cluster and regardless of hypervisor or if you’re adding Compute+Storage or Storage Only nodes, the benefits particularly of deduplication continue to benefit the environment.

    The net effect of adding more nodes is better performance, higher resiliency, faster rebuilds from drive/node failures and again with global deduplication, a higher chance of duplicate data being found and not stored unnecessarily on physical storage resulting in a better deduplication ratio.

    No matter what size node/s are added & no matter what Hypervisor, the benefits from data reduction features such as deduplication and compression work at a global level.

    What about Erasure Coding? Nutanix EC-X creates the most efficient stripe based on the cluster size, so if you start with a small 4 node cluster your stripe would be 2+1 and if you expand the cluster to 5 nodes, the stripe will automatically become 3+1 and if you expand further to 6 nodes or more, the stripe will become 4+1 which is currently the largest stripe supported.

  3. Drive FailuresIn the event of a drive failure (SSD/SAS or SATA) as mentioned earlier, only that drive is impacted. Therefore to restore resiliency, only the data on that drive needs to be repaired as opposed to something like an entire disk group being marked as offline.

    It’s crazy to think a single commodity drive failure in a HCI product could bring down an entire group of drives, causing a significant impact to the environment.

    With Nutanix, a rebuild is performed in a distributed manner throughout all nodes in the cluster, so the larger the cluster, the lower the per node impact and the faster the configured resiliency factor is restored to a fully resilient state.

At this point you’re probably asking, Are there any decisions to make?

When adding any node, compute+storage or storage only, ensure you consider what the impact of a failure of that node will be.

For example, if you add one 15TB storage only node to a cluster of nodes which are only 2TB usable, then you would need to ensure 15TB of available space to allow the cluster to fully self heal from the loss of the 15TB node. As such, I recommend ensuring your N+1 (or N+2) node/s are equal to the size of the largest node in the cluster from both a capacity, performance and CPU/RAM perspective.

So if your biggest node is an NX-8150 with 44c / 512GB RAM and 20TB usable, you should have an N+1 node of the same size to cover the worst case failure scenario of an NX-8150 failing OR have the equivalent available resources available within the cluster.

By following this one, simple rule, your cluster will always be able to fully self heal in the event of a failure and VMs will failover and be able to perform at comparable levels to before the failure.

Simple as that! No RAID, Disk group, deduplication, compression, failure, or rebuild considerations to worry about.

Summary:

The above are just a few examples of the advantages the Nutanix ADSF provides compared to other HCI products. The operational and architectural complexity of other products can lead to additional risk, inefficient use of infrastructure, misconfiguration and ultimately an environment which does not deliver the business outcome it was originally design to.

Sizing infrastructure based on vendor Data Reduction assumptions – Part 2

In part 1, we discussed how data reduction ratios can, and do, vary significantly between customers and datasets and that making assumptions on data reduction ratios, even when vendors provide guarantees, does not protect you from potentially serious problems if the data reduction ratios are not achieved.

In Part 2 we will go through an example of how misleading data reduction guarantees can be.

One HCI manufacturer provides a guarantee promising 10:1 which sounds too good to be true, and that’s because it, quite frankly, isn’t true. The guarantee includes a significant caveat for the 10:1 data reduction:

The savings/efficiency are based on the assumption that you configure a backup policy to take at least one <redacted> backup per day of every virtual machine on every<redacted> system in a given VMware Datacenter with those backups retained for 30 days.

I have a number of issues with this limitation including:

  1. The use of the word “backup” referring directly/indirectly to data reduction (savings)
  2. The use of the word “backup” when referring to metadata copies within the same system
  3. No actual deduplication or compression is required to achieve the 10:1 data reduction because metadata copies (or what the vendor incorrectly calls “backups”) are counted towards deduplication.

It is important to note, I am not aware of any other vendor who makes the claim that metadata copies ( Snapshots / Point in time copies / Recovery points etc.) are deduplication. They simply are not.

I have previously written about what should be counted in deduplication ratios, and I encourage you to review this post and share your thoughts as it is still a hot topic and one where customers are being oversold/mislead regularly in my experience.

Now let’s do the math on my claim that no actual deduplication or compression is required to achieve the 10:1 ratio.

Let’s use a single 1 TB VM as a simple example. Note: The size doesn’t matter for the calculation.

Take 1 “backup” (even though we all know this is not a backup!!) per day for 30 days and count each copy as if it was a full backup to disk, Data logically stored now equals 31TB  (1 TB + 30 TB).

The actual Size on disk is only a tiny amount of metadata higher than the original 1TB as the metadata copies pointers don’t create any copies of the data which is another reason it’s not a backup.

Then because these metadata copies are counted as deduplication, the vendor reports a data efficiency of 31:1 in its GUI.

Therefore, Effective Capacity Savings = 96.8% (1TB / 31TB = 0.032) which is rigged to be >90% every time.

So the only significant capacity savings which are guaranteed come from “backups” not actual reduction of the customer’s data from capacity saving technologies.

As every modern storage platform I can think of has the capability to create metadata based point in time recovery points, this is not a new or even a unique feature.

So back to our topic, if you’re sizing your infrastructure based on the assumption of the 10:1 data efficiency, you are in for rude shock.

Dig a little deeper into the “guarantee” and we find the following:

It’s the ratio of storage capacity that would have been used on a comparable traditional storage solution to the physical storage that is actually used in the <redacted> hyperconverged infrastructure.  ‘Comparable traditional solutions’ are storage systems that provide VM-level synchronous replication for storage and backup and do not include any deduplication or compression capability.

So if you, for example, had a 5 year old NetApp FAS, and had deduplication and/or compression enabled, the guarantee only applies if you turned those features off, allowed the data to be rehydrated and then compared the results with this vendor’s data reduction ratio.

So to summarize, this “guarantee” lacks integrity because of how misleading it is. It  is worthless to any customer using any form of enterprise storage platform probably in the last 5 –  10 years as the capacity savings from metadata based copies are, and have been, table stakes for many, many years from multiple vendors.

So what guarantee does that vendor provide for actual compression and deduplication of the customers data? The answer is NONE as its all metadata copies or what I like to call “Smoke and Mirrors”.

Summary:

“No one will question your integrity if your integrity is not questionable.” In this case the guarantee and people promoting it have questionable integrity especially when many customers may not be aware of the difference between metadata copies and actual copies of data, and critically when it comes to backups. Many customers don’t (and shouldn’t have too) know the intricacies of data reduction, they just want an outcome and 10:1 data efficiency (saving) sounds to any reasonable person as they need 10x less than I have now… which is clearly not the case with this vendors guarantee or product.

Apart from a few exceptions which will not be applicable for most customers, 10:1 data reduction is way outside the ballpark of what is realistically achievable without using questionable measurement tactics such as counting metadata copies / snapshots / recovery points etc.

In my opinion the delta in the data reduction ratio between all major vendors in the storage industry for the same dataset, is not a significant factor when making a decision on a platform. This is because there are countless other substantially more critical factors to consider. When the topic of data reduction comes up in meetings I go out of my way to ensure the customer understands this and has covered off the other areas like availability, resiliency, recoverability, manageability, security and so on before I, quite frankly waste their time talking about table stakes capability like data reduction.

I encourage all customers to demand nothing less of vendors than honestly and integrity and in the event a vendor promises you something, hold them accountable to deliver the outcome they promised.