Nutanix Implementation of Data Avoidance & Reduction Technologies

While it’s not news that the Nutanix Distributed Storage Fabric (NDSF) supports numerous data avoidance & reduction technologies, what is less well known is how these technologies can be enabled, disabled and used.

Before we begin, let me cover off what technologies NDSF offers:

Data Avoidance:

  • VAAI-NAS Fast File Clone (for ESXi)
  • View Composer for Array Integration (VCAI) for Horizon View
  • Native NDSF Clones (ESXi, Hyper-V and AHV)
  • ODX Copy Offload (Hyper-V)
  • Crash and Application Consistent snapshots (ESXi, Hyper-V and AHV)

Data Reduction:

  • Compression (In-Line and Post-Process)
  • Deduplication (Fingerprint on Write/In-Line for Performance Tier and/or Capacity Tier)
  • Erasure Coding (EC-X)

Data avoidance is designed to prevent the creation of unnecessary data which removes the requirement to leverage data reduction technologies. This means less work for the storage layer which results in more available front end IO to service the virtual machines.

An example of data avoidance is using VCAI with Horizon View to create Linked Clones near instantly which not only reduces space but ensures faster deployment and recompose activities with greatly reduced impact to the environment.

Data avoidance is greatly underrated in my opinion, as it results in lower compression/deduplication ratios, because there is no additional data to dedupe or compress. If Nutanix turned off these data avoidance technologies, it would result in HIGHER compression and dedupe ratios, which sounds great on a marketing slide or in a tweet, but in reality, avoiding work for the storage is a much better way to do things.

Some vendors report data avoidance such as snapshots in deduplication ratios, and this in my opinion is very misleading and designed to artificially inflate dedupe ratios for competitive purposes. For more information see: Deduplication ratios – What should be included in the reported ratio?
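To make the reporting difference concrete, here is a minimal sketch using made-up numbers. The function name and every figure (one 40 GB master image, 100 clones, 20 GB physically stored) are illustrative assumptions, not measurements from any real cluster.

```python
def reduction_ratio(logical_gb: float, physical_gb: float) -> float:
    """Ratio of data presented to data physically stored."""
    return logical_gb / physical_gb


# Hypothetical figures, for illustration only.
golden_image_gb = 40.0   # one master image
clone_count = 100        # clones created via metadata, no data copied
stored_gb = 20.0         # physical space used after compression/dedupe

# Clones never wrote any data, so they are left out of the ratio.
honest = reduction_ratio(golden_image_gb, stored_gb)

# Every clone is counted as if it were a full copy of the image.
inflated = reduction_ratio(golden_image_gb * (1 + clone_count), stored_gb)

print(f"excluding clones: {honest:.1f}:1")
print(f"including clones: {inflated:.1f}:1")
```

The storage holds exactly the same bytes in both cases; only the way the ratio is reported changes.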

Data Reduction is still a valuable option to have, but in my opinion it’s overrated. The reason I think it’s overrated is that data reduction does not always work well. Whether you see a good data reduction ratio depends greatly on your data type, AND on whether the overheads (there is always an overhead) are worth it.

Let’s now focus on the NDSF implementation of Data Reduction technologies.

Compression:

Compression can be configured on new or existing containers and be set to In-Line or Post-Process. For Post-Process, enter a “Delay” value in minutes, e.g. 60 to delay compression for 1 hour, or 1440 for 1 day.

Compression

Compression can be reconfigured at any time, without the requirement to relocate VMs or reformat the storage. If compression is turned off, data which is already compressed is uncompressed by a low priority background task (known as Curator). This ensures there is low/no impact from changing Compression settings, giving customers maximum flexibility.

Because compression is configured per container, you can have VMs or even Virtual Disks running compression alongside VMs or Virtual Disks not running compression within the same NDSF cluster. This helps eliminate silos and ensures mixed workloads with different data types/profiles can co-exist efficiently.

Deduplication:

As with Compression, Deduplication can be configured on new or existing containers and be set to dedupe for the performance tier (SSD) and optionally for the Capacity (HDD) Tier. This means data reduction can be maximised for either or both tiers depending on customer requirements.

dedupeconfig

Again, as with Compression, Dedupe can be reconfigured at any time, without the requirement to relocate VMs or reformat the storage. For data which is already deduped, the same low priority background task (Curator) rehydrates the data, again ensuring there is low/no impact from changing dedupe settings and giving customers maximum flexibility.

Because dedupe is configured per container, you can have VMs or even Virtual Disks running dedupe alongside VMs or Virtual Disks not running dedupe within the same NDSF cluster. Deduplication is also complementary to Compression, meaning both can be run at the same time to maximise data reduction and further eliminate silos, ensuring mixed workloads can co-exist efficiently.

Erasure Coding (EC-X):

As with Compression & Dedupe, EC-X is enabled on a per container basis and is complementary to both Compression and Dedupe. EC-X is a post-process only form of data reduction designed to work on write cold data (meaning data which is not changing).

EC-X applies to data across the Performance Tier (SSD) and the Capacity Tier (SATA), which increases the effective SSD capacity. More data can therefore be serviced by SSD, thus increasing performance.

ecxonoff

As previously discussed, NDSF supports different containers using different combinations of data reduction all within the same NDSF cluster to maximise efficiencies and eliminate unnecessary silos.
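As a rough illustration of what “different combinations per container” looks like, here is a small sketch. The `ContainerPolicy` class, its field names and the container names are all hypothetical and exist only for this example; they are not a Nutanix API or CLI.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ContainerPolicy:
    """Hypothetical model of per-container data reduction settings."""
    name: str
    compression: Optional[str] = None        # None, "inline" or "post-process"
    dedupe_performance_tier: bool = False    # SSD
    dedupe_capacity_tier: bool = False       # HDD
    erasure_coding: bool = False             # EC-X

    def enabled_features(self) -> List[str]:
        features = []
        if self.compression:
            features.append(f"compression ({self.compression})")
        if self.dedupe_performance_tier:
            features.append("dedupe (performance tier)")
        if self.dedupe_capacity_tier:
            features.append("dedupe (capacity tier)")
        if self.erasure_coding:
            features.append("EC-X")
        return features


# Three containers with different settings living in the same cluster.
cluster = [
    ContainerPolicy("vdi", compression="inline", dedupe_performance_tier=True),
    ContainerPolicy("archive", compression="post-process",
                    dedupe_performance_tier=True, dedupe_capacity_tier=True,
                    erasure_coding=True),
    ContainerPolicy("database"),
]

for c in cluster:
    print(c.name, "->", ", ".join(c.enabled_features()) or "no data reduction")
```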

Summary:

Nutanix provides multiple technologies to minimise the data being stored on the distributed storage fabric while giving customers the flexibility to enable/disable and tune data reduction settings to suit different data profiles all within the same NDSF cluster.

Remember, “one size does not fit all”, so it is important for the storage layer to be able to treat your workloads differently based on their individual requirements.

Related Articles:

Fight the FUD: Nutanix Erasure Coding Efficiency

Every now and again you will see one vendor put out information/statements about other vendors’ technology. 9 times out of 10 it’s either outdated, incorrect or a deliberate attempt to spread Fear, Uncertainty and Doubt (FUD).

Today I discovered something on LinkedIn I thought I would respond to, especially as it was mostly by two sales guys (one Sales Engineer & one Sales Director) from one vendor and two other individuals from other vendors trying to spread FUD.

Two of these vendors according to Gartner, are niche players and the other vendor didn’t even make the quadrant shown below.

intgegrated_systems_magic_quadrant_august_2015

Had the sales director simply googled Nutanix Erasure Coding he would have found the following article, which covers all of his questions and provides links to further articles on the topic. But hey, doing that would prevent him being able to spread FUD.

Nutanix – Erasure Coding (EC-X) Deep Dive

The above article refers to the below article, which explains what data Nutanix EC-X will take effect on and discusses the performance impact.

What I/O will Nutanix Erasure coding (EC-X) take effect on?

But let’s quickly address each point and correct the misinformation:

The “problems” the sales director has with the technical implementation of Nutanix EC-X are as follows; I will respond in-line.

  • Nutanix gets to decide if the data is hot or cold.

Not sure how this is a problem; would he prefer customers have to manually select data to be considered cold? I think the distributed file system tracking what data hasn’t been written to is a very simple, accurate and totally automated way to decide what data to apply EC-X to. After all, Nutanix is making infrastructure invisible, so yes, we’ll put the engineering work in so the customers can just wear the Nutanix grin. (sorry that was cheesy!)

  • What happens when I need that data back in production…. I can’t read it natively, so I am going to have to completely rehydrate it to read it again?

EC-X does not remove the data from production! Data which has EC-X applied is not moved to a LUN (lol!). Data remains accessible in the same way it was prior to EC-X taking effect. On read I/O, data is not rehydrated; EC-X is simply a more space efficient method of storing data while providing resiliency of N+1 or N+2. EC-X and RF are applied on the same container, so the data is not moved when EC-X is applied.

  • I still have to buy enough storage to size my environment correctly the first time around, with no dedupe,no compression, no nothing… so I’m only making my storage last a bit longer to eke a little more life out of it. It is not solving the problem!

Firstly, without stating what “the problem” is, the statement has no context and is pointless FUD. However, I can confirm EC-X works in addition to compression & dedupe, both of which can be in-line or post process. All three data reduction technologies also apply to both the SSD and SATA tiers, just to get in front of any future FUD.

Nutanix recommends customers start small and scale as required since our platform scales so gracefully, but if a customer wants to size for 3-5 years up front (we would help them avoid this BTW) we make assumptions (like every vendor, BTW) as to typical data reduction savings based on the information we have about the customer workload, and size with suitable capacity for at least N+1 to enable fully automated self healing from a node failure.

  • I can only erasure code very certain, specific workloads. This could be a very small amount of data.

Nutanix EC-X can apply to ANY data stored on the Nutanix Distributed Storage Fabric. As per the Deep Dive post (which this guy clearly didn’t read), Nutanix chooses to apply EC-X to data which is write cold for 60 mins to avoid the inefficiencies of striping data across nodes then having to re-stripe it shortly after following a subsequent write I/O. RF2 (or RF3) is more efficient for write intensive workloads and because Nutanix understands this, we only apply EC-X to non write intensive I/O.
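The write cold rule can be sketched in a few lines. This is only an illustration of the idea: the 60 minute threshold comes from the text above, while the function name and the notion of a per-extent “last write” timestamp are my own simplifying assumptions, not how the file system actually tracks it.

```python
from datetime import datetime, timedelta

# Threshold taken from the text above: write cold for 60 minutes.
WRITE_COLD_THRESHOLD = timedelta(minutes=60)


def is_write_cold(last_write: datetime, now: datetime) -> bool:
    """True when the data has not been written to for at least the threshold."""
    return now - last_write >= WRITE_COLD_THRESHOLD


now = datetime(2015, 10, 1, 12, 0)
print(is_write_cold(datetime(2015, 10, 1, 10, 30), now))  # last write 90 minutes ago
print(is_write_cold(datetime(2015, 10, 1, 11, 45), now))  # last write 15 minutes ago
```

Data that keeps getting overwritten never passes the check, so it simply stays on RF2 or RF3.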

  • I have a known high overhead on Nutanix anyway, so by using erasure coding, post process, I am reducing even further the amount of resources available to VMs.

Another baseless statement, but let’s talk about the amount of resources available to VMs. The CVM size does not increase when EC-X is enabled, and because EC-X increases the effective capacity of the SSD tier, more data can be served out of SSD. This results in lower latency for a larger working set, which REDUCES the CPU WAIT for the CVM and for all VMs performing I/O. Less data being stored (up to 2x less with RF3) means less metadata needs to be maintained, so the overheads on the CVM are in many ways reduced.

If Erasure Coding is applied in-line (which BTW Nutanix can do with a simple toggle of a setting, but chooses not to), it means that for write intensive workloads, stripes need to be recalculated frequently, which is a high CPU overhead compared to, in Nutanix’s case, RF2 or RF3.

Oh, did I mention that with EC-X the parity data is stored in the SATA tier, freeing up the SSD tier for even more data to be served with flash performance? This is another example of the increased efficiencies when using EC-X.

  • I’m still only doing this on a local basis, not globally, those inefficiencies continue to abound.

Ah, just plain wrong! EC-X is applied globally across the entire cluster with only one part of any EC-X stripe per node, ensuring maximum efficiency & resiliency.
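The “only one part of any stripe per node” rule is easy to picture with a toy placement function. The real placement logic is certainly more involved; the function, node names and strip labels below are hypothetical and only show the constraint itself.

```python
from typing import Dict, List


def place_stripe(strips: List[str], nodes: List[str]) -> Dict[str, str]:
    """Assign each strip (data or parity) of one stripe to a different node."""
    if len(strips) > len(nodes):
        raise ValueError("a stripe needs at least as many nodes as strips")
    # zip pairs each strip with a distinct node, so no node holds two strips.
    return dict(zip(strips, nodes))


nodes = [f"node-{i}" for i in range(1, 7)]   # hypothetical six node cluster
stripe = ["A", "B", "C", "D", "P"]           # 4 data strips + 1 parity strip
placement = place_stripe(stripe, nodes)

# Every strip sits on its own node, so losing a node costs at most one strip.
assert len(set(placement.values())) == len(stripe)
print(placement)
```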

Now to reply to one of the funnier comments:

  • I agree with Alan. IMO, any HCI vendor that offers erasure coding is essentially saying they cannot do in-line deduplication and compression at-speed. So they have to give you an alternative to get storage efficiency using a post-process like erasure coding. However, they still take the storage performance “hit” of having to read-in all the data, perform the calculations, and write it all back out again. This reminds me of how NetApp did post-process deduplication. Customers didn’t like the performance hit, you could only run so many jobs at any given time, and dedup jobs would constantly run-over their schedule and impact the following morning’s performance. Many customers would simply forgo the deduplication process to avoid the resulting headaches. HCI vendors who can perform the data efficiency in-line & at-speed – thus bypassing the need for any kind of post-process – will have a clear advantage over their competition.

So this guy is also saying In-line is best for Erasure Coding as well as dedupe and compression. Well, since Nutanix can and does in many cases recommend In-Line dedupe and compression, it’s a bit of a moot point.

For Erasure Coding, on the other hand, I believe post process based on I/O profile is the more efficient way, as described in What I/O will Nutanix Erasure coding (EC-X) take effect on?

Sure, there is an overhead to doing post process, but there is also an overhead to doing in-line, which this guy seems to be forgetting. With in-line, 100% of the I/O suffers the overhead (since it’s in-line). With post-process, the overhead applies only to suitable data (being write cold data), which dramatically reduces the overheads because only the data most suitable for EC-X gets processed.

If a customer had 100% Write Once Read Many data, In-line would be more efficient, and Nutanix would configure EC-X in-line. If however data is write hot for the business day, then becomes cold and read only overnight, post process would be orders of magnitude more efficient as the stripes would only be calculated once, as opposed to “N” times depending on how write intensive the data was during the day.
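The “once versus N times” argument can be put into a deliberately simplified model. Both functions are my own illustration: they count parity calculations only and ignore the cost of reading the data back and rewriting it during post process, which is why in-line still wins for pure write once data.

```python
def parity_calculations_inline(writes_per_stripe: int) -> int:
    """In-line: every write to a stripe forces its parity to be recalculated."""
    return writes_per_stripe


def parity_calculations_post_process(writes_per_stripe: int) -> int:
    """Post-process: parity is calculated once, after the data has gone write cold."""
    return 1 if writes_per_stripe > 0 else 0


# Hypothetical numbers of writes hitting one stripe during the business day.
for n in (1, 10, 1000):
    print(n, parity_calculations_inline(n), parity_calculations_post_process(n))
```

With a single write the two approaches do the same amount of parity work; as the data gets hotter the gap grows linearly with the number of writes.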

Long story short, In-line and Post-Process both have their use cases. In my experience, most production workloads suit post process erasure coding, which is why the Nutanix default is post process for data that has been write cold for >60mins.

Comparing Nutanix, an HCI distributed platform, to NetApp, which is a centralised non-HCI filer, is a bit ridiculous, as what does/doesn’t work well for NetApp has nothing to do with Nutanix.

Summary: 

The methods the Sales Director is using to spread completely incorrect information in an attempt to create FUD are just a little bit __________ (insert here).

I’d recommend customers/prospects ignore any comments from any vendor being made about another vendor, period. If a vendor is spending their time talking about another vendor, politely ask them to leave and invite the vendor being spoken about to come and present, as that technology is probably pretty good if other vendors feel the need to talk about it!

For the record, as the LinkedIn thread may “disappear” as a result of this post, the screenshots are below:

FUD1  FUDcomments1

FUDcomment2

RF2 & RF3 Usable Capacity with Erasure Coding (EC-X)

Over the past few weeks, with the release of Acropolis base version 4.5 (formerly known as NOS) on the horizon, there has been a lot of interest in Erasure Coding (EC-X), which was announced at the Nutanix .NEXT conference in June this year.

The most common questions are how EC-X increases the effective SSD tier capacity and the overall cluster usable capacity. This post aims to cover these questions.

Resiliency Factor 2 (RF2) & Erasure Coding

Resiliency Factor 2 ensures that two copies of all data are written to persistent media prior to being acknowledged to the guest operating system. This ensures an N+1 level of redundancy, which translates to being able to tolerate a single failure.

RF2 provides a usable capacity of ~50% of RAW.

The below figure shows an example of RF2 where six blocks store three pieces of data in a redundant fashion. In this configuration a single SSD/HDD or node can be lost without impacting data availability.

RF2normal

Now let’s take a look at how the same 6 blocks will be utilized with Erasure Coding enabled:

RF2plusECX

As we can see, we are now able to store four pieces of data (A,B,C,D) with single parity to ensure data can be rebuilt in the event of a drive or node failure. As with standard RF2, an RF2 + EC-X configuration can tolerate the loss of a single SSD/HDD or node without impacting data availability. We also free up space to be used for another EC-X stripe.
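For readers who have not seen single parity in action, here is a minimal sketch of rebuilding a lost block. It uses plain XOR parity as a stand-in; this post does not describe the coding scheme EC-X actually uses, so treat the XOR choice, the function name and the four byte blocks as assumptions made purely for illustration.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]   # four data blocks
parity = xor_blocks(data)                      # one parity block

# Simulate losing block C and rebuild it from the survivors plus parity.
survivors = [data[0], data[1], data[3], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[2]
print("rebuilt block:", rebuilt)
```

Note that nothing is rebuilt while all four data blocks are healthy; they are read as-is, and the parity only comes into play after a failure.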

As a result, the usable capacity increases from approx. 50% usable up to 80% usable for clusters of six (6) nodes or larger.
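The 50% and 80% figures fall straight out of the block counts. The two helper functions below are hypothetical names for the arithmetic, and they deliberately ignore every other overhead in a real cluster.

```python
def usable_fraction_replication(copies: int) -> float:
    """Usable share of raw capacity when every block is stored `copies` times."""
    return 1 / copies


def usable_fraction_stripe(data_blocks: int, parity_blocks: int) -> float:
    """Usable share of raw capacity for a stripe of data plus parity blocks."""
    return data_blocks / (data_blocks + parity_blocks)


print(f"RF2:            {usable_fraction_replication(2):.0%}")
print(f"RF2 + EC-X 4/1: {usable_fraction_stripe(4, 1):.0%}")
```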

The following table shows the maximum usable capacity for RF2 + EC-X based on cluster size:

Note: Assumes 20TB RAW per node

RF2table

Resiliency Factor 3 (RF3) & Erasure Coding

Resiliency Factor 3 ensures that three copies of all data are written to persistent media prior to being acknowledged to the guest operating system. This ensures an N+2 level of redundancy, which translates to being able to tolerate two concurrent SSD/HDD or node failures.

RF3 provides a usable capacity of ~33% of RAW.

The below figure shows an example of RF3 where six blocks store two pieces of data in a redundant fashion. In this configuration the environment can tolerate two concurrent SSD/HDD or node failures without impacting data availability.

RF3normal

Now let’s take a look at how the same 6 blocks will be utilized with Erasure Coding enabled:

RF3ECX

Similar to the RF2 example, we can see we are now able to store more data with the same level of redundancy. In this case, four pieces of data (A,B,C,D) are stored with dual parity to ensure data can be rebuilt in the event of dual concurrent drive or node failures. As with standard RF3, an RF3 + EC-X configuration provides an N+2 level of availability while providing higher usable capacity.

The following table shows the usable capacity for RF3 + EC-X based on cluster size:

Note: Assumes 20TB RAW per node

RF3ECXtable
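To show how raw TB turns into usable TB, here is a small sketch using the 20TB RAW per node assumption from the note above. The function name and the eight node cluster are hypothetical, and the sketch assumes a full 4 data + 2 parity stripe; it does not model which stripe size applies at which cluster size, nor any other overhead, so treat the table as the reference for real figures.

```python
RAW_TB_PER_NODE = 20  # from the note above


def usable_tb(nodes: int, data_blocks: int, redundant_blocks: int) -> float:
    """Usable TB when every `data_blocks` of user data carry `redundant_blocks` of overhead."""
    raw_tb = nodes * RAW_TB_PER_NODE
    return raw_tb * data_blocks / (data_blocks + redundant_blocks)


nodes = 8  # hypothetical cluster size

# RF3: one piece of user data plus two additional copies.
print(f"RF3:            {usable_tb(nodes, 1, 2):.1f} TB")

# RF3 + EC-X with a 4/2 stripe: four pieces of user data plus two parity blocks.
print(f"RF3 + EC-X 4/2: {usable_tb(nodes, 4, 2):.1f} TB")
```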

EC-X Parity Placement

To further increase the effective capacity of the SSD tier, and therefore support larger working set sizes with all flash performance, the parity for containers with EC-X enabled is stored on the SATA tier.

The following figure shows a standard RF3 deployment:

RF3parityNormal

As we can see, 6 blocks of storage contain just 2 actual pieces of user data all of which reside in the SSD tier.

With RF3 + EC-X the same 6 blocks of storage contain 4 pieces of user data, increasing the effective capacity of the SSD tier by 100% because 4 pieces of data can be stored compared to 2 with RF3. In addition, the effective SSD capacity is further increased by moving the 2 parity blocks to SATA, freeing up a further 33% of SSD tier capacity.

RF3ECXparity
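The 100% and 33% figures for the SSD tier can be checked with simple arithmetic. The variable names are mine; the block counts come from the six block example above.

```python
SSD_BLOCKS = 6  # the six blocks from the example

# Standard RF3: three copies of each piece of data, all in the SSD tier.
rf3_data_pieces = SSD_BLOCKS // 3

# RF3 + EC-X: four pieces of data plus two parity blocks in the same six blocks.
ecx_data_pieces = 4
ecx_parity_blocks = 2

# Extra user data held in the same six blocks.
increase = (ecx_data_pieces - rf3_data_pieces) / rf3_data_pieces

# Share of the six blocks freed when the parity blocks move to SATA.
freed_by_parity_move = ecx_parity_blocks / SSD_BLOCKS

print(f"user data increase: {increase:.0%}")
print(f"SSD freed by moving parity to SATA: {freed_by_parity_move:.0%}")
```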

I hope that explains how EC-X works and why it’s such an advantage for Nutanix’s current and future customers.

Related Articles:

  1. Nutanix Erasure Coding Deep Dive
  2. Increasing resiliency of large clusters with Erasure Coding
  3. What I/O will EC-X take effect on?
  4. Sizing assumptions for solutions with Erasure Coding (EC-X)