What’s .NEXT? – Erasure Coding!

Up to now, Nutanix has used a concept known as “Replication Factor” or “RF” to provide storage layer data protection as opposed to older RAID technologies.

RF allows customers to configure either 2 or 3 copies of data depending on how critical the data is.

When using RF2, the usable capacity is 50% of raw (raw divided by 2).

When using RF3, the usable capacity is 33% of raw (raw divided by 3).

While these sound like large overheads, in reality they are comparable to traditional SAN/NAS deployments, as explained in the two-part post – Calculating Actual Usable capacity? It’s not as simple as you might think!

But enough on existing features, let’s talk about an exciting new feature: Erasure Coding!

Erasure coding (EC) is a technology which significantly increases the usable capacity in a Nutanix environment compared to RF2.

The overhead for EC depends on the cluster size, but for clusters of 6 nodes or more it results in only a 1.25x overhead, compared to 2x for RF2 and 3x for RF3.

For clusters of 3 to 4 nodes the overhead is 1.5x, and for clusters of 5 nodes it is 1.33x.

The following shows a comparison between RF2 and EC for various cluster sizes.

[Image: Erasure Coding vs RF2 usable capacity comparison]

As you can see, the usable capacity is significantly increased when using Erasure Coding.
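To make the numbers concrete, here is a minimal back-of-the-envelope sketch in Python (illustrative arithmetic only, not a Nutanix sizing tool) using the overhead figures quoted above:

```python
# Back-of-the-envelope comparison of usable capacity under RF2, RF3 and EC.
# Overheads are the figures quoted in this post; this is illustrative only.

def ec_overhead(nodes: int) -> float:
    """EC overhead by cluster size, per the figures above."""
    if nodes < 3:
        raise ValueError("EC requires at least 3 nodes")
    if nodes <= 4:
        return 1.5
    if nodes == 5:
        return 1.33
    return 1.25  # 6 nodes or more

def usable(raw_tb: float, overhead: float) -> float:
    """Usable capacity = raw capacity divided by the data protection overhead."""
    return raw_tb / overhead

raw_tb = 100.0  # example: 100TB raw in the cluster
for nodes in (4, 5, 6, 8):
    print(f"{nodes} nodes, {raw_tb:.0f}TB raw: "
          f"RF2 = {usable(raw_tb, 2.0):.1f}TB, "
          f"RF3 = {usable(raw_tb, 3.0):.1f}TB, "
          f"EC  = {usable(raw_tb, ec_overhead(nodes)):.1f}TB")
```

For a 6 node cluster with 100TB raw, that works out to 80TB usable with EC versus 50TB with RF2, which is where the “up to 60% more usable capacity” figure below comes from.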

Now for more good news: in line with the Nutanix Uncompromisingly Simple philosophy, Erasure Coding can be enabled on existing Nutanix containers on the fly, without downtime or the requirement to migrate data.

This means with a simple One-click upgrade to NOS 4.5, customers can get up to a 60% increase in usable capacity in addition to existing data reduction savings, e.g. compression.

So there you have it, more usable capacity for Nutanix customers with a non-disruptive, one-click software upgrade… (you’re welcome!).

For customers considering Nutanix, your cost per GB just dropped significantly!

Want more? Check out how to scale storage capacity separately from compute with Nutanix!

Related Articles:

1. Nutanix Erasure Coding (EC-X) Deep Dive

What’s .NEXT? – Scale Storage separately to Compute on Nutanix!

Since I joined Nutanix, I have heard from customers that they want to scale storage (capacity) separately from compute, as they have done in traditional SAN/NAS environments.

I wrote an article a while ago about Scaling problems with traditional shared storage which discusses why scaling storage capacity separately can be problematic. As such I still believe scaling capacity separately is more of a perceived advantage than a real one in most cases, especially with traditional SAN/NAS.

However, here at Nutanix we have locked ourselves away and brainstormed how we can scale capacity without degrading performance and without losing the benefits of a Nutanix Hyper-Converged platform, such as Data Locality and linear scalability.

At the same time, we wanted to ensure doing so didn’t add any unnecessary cost.

Introducing the NX-6035c, a new “Storage only” node!

What is it?

The NX-6035c is a 2 node per 2RU block; it has 2 single socket servers, each with 1 SSD, 5 x 3.5″ SATA HDDs and 2 x 10GbE NICs for network connectivity.

How does it work?

As with all Nutanix nodes, the NX-6035c runs the Nutanix Controller VM (CVM) which presents the local storage to the Nutanix Distributed File System (NDFS).

The main difference between the NX-6035c and other Nutanix nodes is that it is not a member of the hypervisor cluster and as a result does not run virtual machines, but it is a fully functional member of the NDFS cluster.

The below diagram shows a 3 node vSphere or Hyper-V cluster with storage presented by a 5 node NDFS cluster using 3 x NX-8150s as Compute+Storage and 2 x NX-6035C nodes as Storage only.

[Diagram: NX-6035c storage only nodes in an NDFS cluster]

Because the NX-6035c does not run VMs, it only receives data via write I/O replication from Resiliency Factor 2 or 3 and Disk Balancing.

This means for every NX-6035c in an NDFS cluster, the Write performance for the cluster increases because of the additional CVM. This is how Nutanix ensures we avoid the traditional capacity scaling issues of SAN/NAS.

Rule of thumb: Don’t scale capacity without scaling storage controllers!
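To illustrate the idea (a simplified conceptual sketch only, not NDFS code, with placement reduced to a random choice rather than real disk balancing logic), the toy model below shows why a storage only node still contributes to write performance: every node’s CVM is a valid replica target, even though VMs never run there.

```python
import random

# Conceptual sketch of RF2 replica placement with storage only nodes.
# NOT NDFS source code; placement is random here purely for illustration.

class Node:
    def __init__(self, name: str, storage_only: bool = False):
        self.name = name
        self.storage_only = storage_only   # True for NX-6035c style nodes
        self.extents = []                  # data replicas held by this node

def write_extent(cluster: list, local: "Node", extent: str) -> None:
    """Write locally (data locality), then replicate to one other node (RF2)."""
    local.extents.append(extent)
    # Any other node, including storage only nodes, can hold the second copy.
    remote = random.choice([n for n in cluster if n is not local])
    remote.extents.append(extent)

# 3 compute+storage nodes plus 2 storage only nodes, as in the diagram above.
cluster = [Node("NX-8150-A"), Node("NX-8150-B"), Node("NX-8150-C"),
           Node("NX-6035c-D", storage_only=True),
           Node("NX-6035c-E", storage_only=True)]

for i in range(1000):
    vm_host = random.choice(cluster[:3])   # VMs only run on the compute nodes
    write_extent(cluster, vm_host, f"extent-{i}")

for n in cluster:
    print(f"{n.name}: {len(n.extents)} extent replicas")
```

Running the sketch shows replica copies spread across all five nodes, i.e. the two NX-6035c CVMs absorb a share of the write replication traffic rather than just sitting there as passive capacity.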

The CVM running on the NX-6035c also provides data reduction capabilities just like other Nutanix nodes, so data reduction can occur with even lower impact on Virtual Machine I/O.

What about Hypervisor licensing?

The NX-6035c runs the CVM on a Nutanix optimized version of KVM which does not require any hypervisor licensing.

For customers using vSphere or Hyper-V, the NX-6035c provides storage performance and capacity to the NDFS cluster which serves the hypervisor.

This results in more storage capacity and performance with no additional hypervisor costs.

Want more? Check out how Nutanix is increasing usable capacity with Erasure Coding!

What’s .NEXT? – Acropolis! Part 1

By now many of you will probably have heard about Project “Acropolis”, the code name for the development effort in which Nutanix set out to create an Uncompromisingly Simple management platform for an optimized KVM hypervisor.

Drawing on Nutanix’s extensive skills and experience with products such as vSphere and Hyper-V, we took on the challenge of delivering similar enterprise grade features in a platform as scalable and performant as NDFS.

Acropolis therefore had to be built into PRISM. The below screenshot shows the Home screen for PRISM in a Nutanix Acropolis environment, which looks pretty much the same as any other Nutanix solution, right? Simple!

[Screenshot: PRISM Home screen in a Nutanix Acropolis environment]

So let’s talk about how you install Acropolis. Since it’s the management platform for your Nutanix infrastructure it is a critical component, so do I need a management cluster? No!

Acropolis is built into the Nutanix Controller VM (CVM), so it is installed by default when loading the KVM hypervisor (which is actually shipped by default).

Because it’s built into the CVM, Acropolis (and therefore all the management components) automatically scales with the Nutanix cluster, so there is no need to size the management infrastructure. There is also no need to license or maintain operating systems for management tools, further reducing cost and operational expense.

The following diagram shows a 4 node Nutanix NDFS cluster running Nutanix KVM Hypervisor using Acropolis. One CVM per cluster is elected the Acropolis Master and the rest of the CVMs are Acropolis Slaves.

[Diagram: 4 node NDFS cluster with one Acropolis Master CVM and three Acropolis Slave CVMs]

The Acropolis Master is responsible for the following tasks:

  1. Scheduler for HA
  2. Network Controller
  3. Task Executors
  4. Collector/Publisher of local stats from Hypervisor
  5. VNC Proxy for VM Console connections
  6. IP address management

Each Acropolis Slave is responsible for the following tasks:

  1. Collector/Publisher of local stats from Hypervisor
  2. VNC Proxy for VM Console connections
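To illustrate the division of responsibilities, here is a purely conceptual sketch. It is not Acropolis internals: the election is simplified to “lowest CVM id wins”, whereas Acropolis uses its own election mechanism. It simply shows that every CVM carries the per-node duties, one CVM additionally carries the cluster-wide duties, and a surviving CVM can take over the Master role.

```python
# Conceptual sketch of the Acropolis Master/Slave role split described above.
# NOT Acropolis code: the election here is simplified to "lowest CVM id wins".

MASTER_ONLY_TASKS = ["HA scheduler", "Network controller",
                     "Task executor", "IP address management"]
EVERY_CVM_TASKS = ["Local hypervisor stats collector/publisher",
                   "VNC proxy for VM consoles"]

class CVM:
    def __init__(self, cvm_id: int):
        self.cvm_id = cvm_id
        self.is_master = False

    def tasks(self):
        return (MASTER_ONLY_TASKS if self.is_master else []) + EVERY_CVM_TASKS

def elect_master(cvms):
    """Pick one CVM as Master; survivors re-elect if the Master's node fails."""
    master = min(cvms, key=lambda c: c.cvm_id)
    for c in cvms:
        c.is_master = (c is master)
    return master

cluster = [CVM(i) for i in range(1, 5)]      # 4 node cluster, one CVM per node
elect_master(cluster)

for c in cluster:
    role = "Master" if c.is_master else "Slave"
    print(f"CVM {c.cvm_id} ({role}): {', '.join(c.tasks())}")

# If the Master's node fails, the remaining CVMs simply re-elect,
# which is why management continues without interruption.
cluster.pop(0)
new_master = elect_master(cluster)
print("New Master after failure: CVM", new_master.cvm_id)
```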

Acropolis is a truly distributed management platform which has no dependency on external database servers and is fully resilient, with in-built self healing capabilities, so that in the event of node or CVM failures management continues without interruption.

What does Acropolis do? Well, put simply, the things 95% of customers need, including but not limited to:

  • High Availability (Think vSphere HA)
  • Load Balancing / Virtual Machine Migrations (Think DRS & vMotion)
  • Virtual machine templates
  • Cloning (Instant and space efficient like VAAI-NAS)
  • VM operations / snapshots / console access
  • Centralized configuration of nodes (think vSphere Host Profiles & vSphere Distributed Switch)
  • Centralized management of virtual networking (think vSphere Distributed Switch)
  • Performance monitoring of physical HW, Hypervisor & VMs (think vRealize Operations Manager)

Summary: Acropolis combines a best of breed hyperconverged platform with an enterprise grade KVM management solution which dramatically simplifies the design, deployment and ongoing management of datacenter infrastructure.

In the next few parts of this series I will explore the above features and the advantages of the Acropolis solution.