Acropolis: Scalability

One of the major focuses for Nutanix both for our Distributed Storage Fabric (part of the Nutanix Xtreme Computing Platform or XCP) has been scalability with consistent performance.

Predictable scalability is critical to any distributed platform as it predictable scalability for the management layer.

This is one of the many strengths of the Acropolis management layer.

All components which are required to Configure, Manage, Monitor, Scale and Automate are fully distributed across all nodes within the cluster.

As a result, there is no single point of failure with the Nutanix/Acropolis management layer.

Lets take a look at a typical four node cluster:

Below we see four Controller VMs (CVMs) which service one node each. In the cluster we have an Acropolis Master along with multiple Acropolis Slave instances.

Acropolis4nodecluster1

In the event the Acropolis Master becomes unavailable for any reason, an election will take place and one of the Acropolis Slaves will be promoted to Master.

This can be achieved because Acropolis data is stored in a fully distributed Cassandra database which is protected by the Distributed Storage Fabric.

When an additional Nutanix node is added to the cluster, an Acropolis Slave is also added which allows the workload of managing the cluster to be distributed, therefore ensuring management never becomes a point of contention.Acropolis5NodeCluster

Things like performance monitoring, stats collection, Virtual Machine console proxy connections are just a few of the management tasks which are serviced by Master and Slave instances.

Another advantage of Acropolis is that the management layer never needs to be sized or scaled manually. There is no vApp/s , Database Server/s, Windows instances to deploy, install, configure, manage or license, therefore reducing cost and simplifying management of the environment.

Summary:

Acropolis Management is automatically scaled as nodes are added to the cluster, therefore increasing consistency , resiliency, performance and eliminating potential for architectural (sizing) errors which may impact manageability.

Note: For non-Acropolis deployments, PRISM is also scaled in the same manner as described above, however the scalability of Hypervisor management layers such as vCenter or SCVMM will need to be considered separately when not using Acropolis.

Can I use my existing SAN/NAS storage with Nutanix?

I question I get regularly is, “Can I use my existing SAN/NAS storage with Nutanix?”.

The short answer is, as always “It depends”.

  • iSCSI, NFS & SMB 3.0 can be presented to Nutanix nodes just like existing non Nutanix nodes.
  • FC based storage cannot be used as Nutanix does not support FC HBAs

The below diagram shows a Nutanix NX-3460 block w/ 4 nodes having both Nutanix Containers presented to the nodes as well as iSCSI LUNs , SMB 3.0 or NFS Mount points connected from the centralized SAN/NAS.

Note: SMB 3 is not supported for ESXi hosts & NFS is not supported for Hyper-V.

Nutanix w External iSCSi NFS  SMB Storage

So what is the use cases for this style of deployment?

If you’re not ready to do an entire infrastructure refresh for whatever reason/s, you may wish to transition to Nutanix over time while maximizing ROI and lifespan of you’re existing storage.

Here is some examples of what I recommend customers do:

1. Migrate Business Critical Applications (BCAs) to Nutanix

There are many benefits of doing this including:

  • Improving resiliency / performance for vBCAs
  • Simplifying storage management for vBCAs
  • Freeing up capacity and reducing the workload on legacy SAN
  • Increasing ease of scalability for critical workloads
  • Use legacy SAN/NAS for high capacity low IOPS workloads which are better suited to centralized storage than vBCAs

Another great option is

2. Migrate Virtual Desktops (VDI) to Nutanix which shares similar benefits to migrating vBCAs including:

  • Separating non complimentary VDI workloads from Server & vBCAs as these workloads do not mix well in centralized storage deployments
  • Improving resiliency / performance for VDI
  • Simplifying storage management for VDI
  • Reducing the workload on legacy SAN/NAS which will give an effective increase in performance for workloads remaining on the SAN/NAS
  • Increasing linear scalability for VDI for if/when the environment scales
  • Use legacy SAN/NAS for high capacity low IOPS workloads which are better suited to centralized storage than VDI

The last example I wanted to point out is Management workloads.

1. Migrate Infrastructure Management workloads to Nutanix.

As has been recommended by many industry experts, separating Management VMs from customer (e.g.: vCAC / vCloud tenants) or production server/desktop workloads (at both the Compute & Storage layers) can dramatically simplify the datacenter and help improve performance, resiliency & recoverability.

Again doing this provides similar benefits to the previous two examples.

  • Separating Management workloads from Server / vBCAs / VDI as these workloads should be separate from a security, resiliency, performance and recoverability perspectives.
  • Improving resiliency / performance for all workloads in the datacenter
  • Simplifying storage management for Management
  • Reducing the workload on legacy SAN/NAS which will give an effective increase in performance for workloads remaining on the SAN/SAN
  • Increasing scalability for if/when the management demands increase.
  • Maximizes the life span / performance of the legacy SAN/NAS

In summary, where it is not possible for budgetary reasons to migrate all workloads to Nutanix, migrating some workloads such as VDI, vBCA or Management to Nutanix will help alleviate the impact of scalability, performance and/or resiliency issues with your existing centralized SAN/NAS.

Nutanix also provides a solution which can start (very) small and continue to be scaled in a granular fashion over time until the SAN/NAS goes End of Life and/or when budget exists. At this time all workloads can then be migrated to Nutanix!

Virtualizing Business Critical Applications – The Web-Scale Way!

Since joining Nutanix back in July 2013, I have been working on testing the performance and resiliency of a range of virtual workloads including Business Critical Applications on the Nutanix platform. At the time, Nutanix only offered a single form factor (4 nodes in 2RU) which was not always a perfect fit depending on customer requirements.

Fast forward to August 2014 and now Nutanix has a wide range of node types to meet most workload requirements which can be found here.

The only real gap in the node types was a node which would support applications with large capacity requirements and also have a very large active working set which requires consistent low latency and high performance regardless of tier.

So what do I mean when I say “Active working Set”. I would define this as a data being regularly accessed by the VM/s, for example a file server may have 10TB of data, but users only access 10% on a regular basis. This 10% I would classify as the Active Working Set.

Now back to the topic at hand, The reason I am writing this post is because this has been a project I have been working towards for some time, and I am very excited about this product being released to the market. I have no doubt it will further increase the already fast up take of the Web-scale solutions and provide significant value and opportunities to new and existing customers wanting to simplify their datacenter/s and standardize on Nutanix Web-scale architecture.

Along with many others at Nutanix, we proposed a new node type (being the NX-8150), which has been undergoing thorough testing in my team (Solutions & Performance Engineering) for some time and I am pleased to say is being officially released (very) soon!

nx8050

What is the NX-8150?

A 1 Node per 2RU platform with the following specifications:

* 2 CPU Sockets with two CPU options (E5-2690v2 [20 cores / 3.0 GHz] OR E5-2697v2 [24 cores / 2.7 GHz]
* 4 x Intel 3700 Series SSDs (ranging from 400GB to 1.6TB ea)
* 20 x 1TB SATA HDDs
* Up to 768GB RAM
* Up to 4 x 10GB NICs
* 4 x 1GB NICs
* 1 x IPMI (Out of band Management)

What is the use case for the NX-8150?

Simply put, Applications which have high CPU/RAM requirements with large active working sets and/or the requirement for consistent high performance over a large data set.

Some examples of these applications include:

* Microsoft Exchange including DAG deployments
* Microsoft SQL including Always on Availability Groups
* Oracle including RAC
* SAP
* Microsoft Sharepoint
* Mixed Production Server Workloads with varying Capacity & I/O requirements

The NX-8150 is a great platform for the above workloads as it not only has fast CPUs and up to a massive 768GB of RAM to provide substantial compute resources to VMs, but also up to a massive 6.4TB of RAW SSD capacity for Virtual machines with high IO requirements. For workloads where peak performance is not critical the NX-8150 also provides solid consistent performance across the “Cold Tier” provided by the 20 x 1TB HDDs.

As with all Nutanix nodes, Intelligent Life-cycle management (ILM) maximizes performance by dynamically migrating hot data to SSD and cold data to SATA to provide the best of both worlds being high IOPS and high capacity.

One of the many major advantages of Nutanix Web-Scale architecture is Simplicity and its ability to remove the requirement for application specific silos! Now with the addition of the NX-8150 the vast majority of workloads including Business Critical Applications can be ran successfully on Nutanix, meaning less silos are required, resulting in a simpler, more cost effective, scalable and resilient datacenter solution.

Now with a number of customers already placing advanced orders for NX-8150’s to deploy Business Critical Applications, it wont be long until the now common “Virtual 1st” policies within many organisations turns into a “Nutanix Web-Scale 1st” policy!

Stay tuned for upcoming case studies for NX-8150 based Web-Scale solutions!