Think HCI is not an ideal way to run your mission-critical x86 workloads? Think again! – Part 2

Continuing from Part 1, let’s look at another of VCE COO Todd Pavone’s statements from the COO: VCE converged infrastructure not affected by Dell-EMC article:

We believe that there was a major gap in the core data center for hyper-converged, where customers wanted hyper-converged architecture — they don’t want to invest in tier-one storage or tier-one servers. They want the intelligence in the software, but they also want massive scale. This is for globals, large service providers in a massive scale, like thousands of nodes. We have a large financial service company in New York that is using us for a platform-free application build-up. And they want to pilot it with 10,000 users, but it’s going to go to 10 million users. And so, can we give them an infrastructure for 10,000, but can scale simply and easily to 10 million — or 20 million?

You can’t do that on an appliance, right? But they want hyper-converged. When you get to 10 million users, you want an infrastructure that scales and is nonlinear, leading to a lower cost model. So, we said, “There’s a gap in that market,” and we created the rack.

Let’s again address these points:

  • Todd: “They don’t want to invest in tier-one storage or tier-one servers. They want the intelligence in the software, but they also want massive scale.”

If customers don’t want to invest in what I would call “traditional” tier-one storage and servers, then I’d have to agree with them: they need a very different solution, such as Nutanix, if they want to reach massive scale, especially if they also want easy management and deployment.

Nutanix has customers ranging from 3 nodes to thousands of nodes, and many of our largest customers run Acropolis Hypervisor. So any question about scalability for Nutanix is just laughable.

  • Todd: “And they want to pilot it with 10,000 users, but it’s going to go to 10 million users. And so, can we give them an infrastructure for 10,000, but can scale simply and easily to 10 million — or 20 million? You can’t do that on an appliance, right?”

Well, you can with Nutanix! In fact, that sounds like a common use case for Nutanix: we frequently design and pilot repeatable models and then scale them as required.

  • Todd: “But they want hyper-converged. When you get to 10 million users, you want an infrastructure that scales and is nonlinear, leading to a lower cost model. So, we said, “There’s a gap in that market,” and we created the rack.”

It’s no surprise to me at all that customers want hyperconverged infrastructure and the ability to scale both linearly and non-linearly. Nutanix can do this today and has been able to for a long time. Back in 2013, for example, you could mix NX3000 series nodes (compute heavy / storage light) with NX6000 nodes (compute light / storage heavy). This is an example of non-linear scaling, which reduces cost (e.g. cost/GB) over time.
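To make the cost/GB point concrete, here is a minimal sketch in Python. The prices and capacities are entirely made up for illustration (real node specs and pricing vary); the point is simply that blending storage-heavy nodes into a compute-heavy cluster drives the blended cost per GB down:

```python
# Illustrative only: hypothetical prices/capacities, not real Nutanix figures.
compute_heavy = {"price": 60_000, "capacity_gb": 4_800}   # NX3000-class: compute heavy
storage_heavy = {"price": 55_000, "capacity_gb": 20_000}  # NX6000-class: storage heavy

def cost_per_gb(nodes):
    """Blended cluster cost per GB of raw capacity."""
    return sum(n["price"] for n in nodes) / sum(n["capacity_gb"] for n in nodes)

cluster = [compute_heavy] * 4                  # start with a compute-heavy cluster
print(f"4 compute-heavy nodes:  ${cost_per_gb(cluster):.2f}/GB")

cluster += [storage_heavy] * 4                 # scale capacity non-linearly
print(f"+4 storage-heavy nodes: ${cost_per_gb(cluster):.2f}/GB")
```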

Then in 2014, an even wider range of nodes was released (NX1000, NX3000, NX6000 & NX8000), which enhanced Nutanix’s ability to scale both up and out, linearly and non-linearly.

In 2015, Nutanix launched the NX-6035C “storage only” node, which allows customers to scale storage separately from compute, enabling non-linear scaling of storage vs compute for customers with high capacity requirements. Importantly, no hypervisor licensing is required to scale storage, as storage-only nodes run Acropolis Hypervisor (AHV), which is fully interoperable with ESXi and Hyper-V environments.

Remember the rule of thumb: don’t scale capacity without scaling storage controllers!

Nutanix storage-only nodes run a lightweight Controller VM (CVM) to ensure management, monitoring and data services (e.g. disk balancing, compression, deduplication, erasure coding) do not degrade even when compute and storage are scaled in a vastly non-linear manner. Storage-only nodes also help improve performance by participating in cluster replication (RF2/RF3) and disk balancing.
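For readers unfamiliar with RF2/RF3: the replication factor (RF) is the number of copies of each piece of data kept across the cluster, so usable capacity is roughly raw capacity divided by RF, before any savings from compression, deduplication or erasure coding. A quick back-of-envelope sketch, using a made-up raw capacity figure:

```python
# Rough usable-capacity arithmetic for replication factors. Illustrative only:
# ignores CVM/metadata overhead and savings from compression/dedupe/erasure coding.
def usable_tb(raw_tb: float, rf: int) -> float:
    return raw_tb / rf  # RF copies of every piece of data are stored

raw = 200.0  # hypothetical raw TB across a cluster
print(f"RF2: ~{usable_tb(raw, 2):.0f} TB usable of {raw:.0f} TB raw")
print(f"RF3: ~{usable_tb(raw, 3):.0f} TB usable of {raw:.0f} TB raw")
```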

  • Todd: “So, we said, “There’s a gap in that market,” and we created the rack.”

There may have been a gap back in early 2013, but since then Nutanix has continued to innovate and lead the market with solutions that scale both linearly and non-linearly; I’d say the gap has long since been filled. Nutanix also scales management with a single HTML5 GUI called PRISM, with central management of multiple clusters/sites/geographical locations via PRISM Central.

Summary:

I’m sure it’s pretty obvious by now that VCE COO Todd Pavone and I have different opinions on what HCI is capable of. During my time at Nutanix I have seen countless successful small, medium and large scale mission-critical application deployments, and the percentage of Nutanix business from these workloads continues to increase thanks to our investment in a dedicated vBCA team, which I am fortunate to be a part of.

Next time you’re considering new infrastructure for mission-critical applications, reach out and I’ll happily work with you to see if Nutanix is a good fit for your use case.

Let me finish by saying this: in the unlikely event the workload(s) are not suitable for Nutanix, I guarantee I will be the first to tell you, and I will help you find an alternative solution.

Back to Part 1.

Fight the FUD: Nutanix scale limitations

I was reading COO: VCE converged infrastructure not affected by Dell-EMC on TechTarget this morning and came across the following quote from VCE COO Todd Pavone, which I found a little amusing.

One of the risks that we see in the marketplace for these appliance players is they’re trying to take that appliance that’s been architected for what I think are more single, simple, edge use cases, and they’re trying to put those into the core. We said, “Rather than trying to do that, we’re going to build an architecture for scale.” Because if you study Nutanix and <Redacted>, any of these companies that we know really well, they have scale limitations. They get to certain nodes sizes, and they break. And then, you have to cut another cluster, you have to cut another cluster.

That’s not ideal for a core data center, because now, you’re managing all of them individually — you can’t tie them into your other core systems. And so, now, you have proliferating silos, which for us is … we think that’s a big no-no. Your operational costs aren’t going to improve.

What doesn’t surprise me is how much focus Nutanix gets from other vendors, especially EMC/VCE. It’s a great validation of the success of the Nutanix platform and a great indication of what will be the dominant datacenter architecture (hyperconverged/HCI) and which platform will lead the market (Nutanix XCP) in the future.

In this post I will only speak about the Nutanix Xtreme Computing Platform (XCP) and not about the other vendor he mentioned, as I don’t see the value in talking about other vendors.

Below is my summary of the points Todd made, along with my thoughts:

  • Todd: Nutanix has scale limitations

Josh: Nutanix has no maximum cluster size (nodes per cluster). In fact, as the Nutanix Distributed Storage Fabric scales, write I/O is distributed ever more widely, meaning higher write performance.
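To illustrate why distribution helps, here is a toy model in Python (my illustration, not Nutanix internals). Under RF2, each write is persisted locally plus one replica placed elsewhere in the cluster; as the cluster grows, the replica traffic generated by any one busy node is spread across more peers, so no single peer becomes a write bottleneck:

```python
# Toy model of RF2 replica traffic; an illustration, not Nutanix internals.
def replica_load_per_peer(cluster_size: int, writes_per_node: int = 1000) -> float:
    """Average replica writes each peer receives from one busy node (RF2)."""
    peers = cluster_size - 1  # the second copy never lands on the writing node
    return writes_per_node / peers

for nodes in (4, 8, 16, 32):
    print(f"{nodes:>2} nodes: ~{replica_load_per_peer(nodes):.0f} replica writes per peer")
```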

In this article (Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 3 – Scalability) I cover all aspects of scalability, including management, performance, capacity, resiliency and how scaling affects operational aspects.

While the above post focuses on Acropolis Hypervisor (AHV), the same scalability applies when using other supported hypervisors such as ESXi and Hyper-V, within the limits of those hypervisors.

I wonder if Todd would say vSphere has “scale limitations” given it supports clusters of only 64 hosts? Probably not; he wouldn’t want to FUD VMware.

Update: a pretty timely claim by Todd, when Nutanix has just delivered a >100-node, 2PB solution used for mixed workloads such as legal eDiscovery, high-performance SQL, MS Exchange and more.

[Image: Nutanix >100-node / 2PB cluster]

  • Todd: They get to certain node sizes, and they break?

Josh: I believe Todd may have been referring to “cluster sizes” as opposed to “node sizes”, but as he is unfamiliar with Nutanix technology, he is using incorrect terminology.

The first point covered “cluster” sizing, so now I’ll cover node sizing. Nutanix, along with Dell and Lenovo, offers numerous node configurations ranging from one to four CPU sockets and up to 768GB RAM, with various SSD/HDD combinations including all-flash.

There is no maximum node size for the Acropolis Base Software (formerly known as NOS); it’s simply a matter of practicality. Nutanix is a distributed platform, not a legacy monolithic centralised platform. As such, scaling out is by design, to improve things like resiliency and performance.

Nutanix also recommends against scaling up, as doing so increases the impact of a single node failure. For example, a 3-node cluster loses 33% of its resources when one node fails, whereas an 8-node cluster loses only 12.5%.
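The arithmetic behind this is simply 1/N: the share of cluster resources lost when one of N equally sized nodes fails. A quick sketch:

```python
# Failure impact for a cluster of N equally sized nodes is simply 1/N.
for nodes in (3, 4, 8, 16, 32):
    print(f"{nodes:>2}-node cluster: one failure removes {1 / nodes:.1%} of resources")
```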

  • Todd: They get to certain nodes sizes, and they break. And then, you have to cut another cluster, you have to cut another cluster.

Josh: Apart from repeating himself and using the term “node” incorrectly (again), Todd is implying Nutanix forces you to create new clusters at a given scale (which he fails to specify). As I mentioned earlier, Nutanix has no maximum cluster size (nodes per cluster).

But as any good architect knows, there are considerations such as failure domains, security and other constraints where having multiple clusters may be required or simply advantageous. One of the many great things about Nutanix XCP is that multiple clusters (even with different hypervisors) can be managed centrally with PRISM Central.

That brings us nicely to Todd’s next point:

  • Todd: That’s not ideal for a core data center, because now, you’re managing all of them individually

Josh: This statement is the last part of the quoted section, and again Todd is talking about managing “nodes” as opposed to clusters. So, first point: Nutanix XCP requires 3 nodes to form a cluster, and that cluster is managed via PRISM Element. Where multiple clusters exist, PRISM Central is then used as a single pane of glass to manage all clusters.

Below is a video showing PRISM Element for two clusters, then joining them to a PRISM Central instance for central management. Note: this is a fairly old video (posted September 22, 2014), as Nutanix has been doing this for a long time; PRISM Element and PRISM Central have both been enhanced since it was created.

Here is an example of scaling Nutanix VDI from 20K to 200K+ power-user desktops. It is a good example of a real-world design with management clusters and VDI clusters that takes failure domains into consideration. It also follows well-proven and accepted best practices for VMware Horizon View deployments, where the scale limitations are at the vSphere/Horizon layer, not the Nutanix layer.

Summary:

This is yet another example of one vendor talking nonsense about a vendor they compete with. If it’s reliable information you’re after, speak to the vendor who makes the product(s) you’re interested in, get them to tell you about the product, and then ask to speak with reference customers to validate the information you have been given.

Competitive vendors will only focus on what they perceive to be the issues with a given competitor’s platform. A good vendor will focus on their own product and not discuss competitors, even when asked for comparisons by customers.

To quote a person I have learnt a lot from while at Nutanix: “While our competitors focus on us, we are focusing on our customers.” (Dheeraj Pandey, Nutanix Founder and CEO)

[Image: Focus on customers]

Fight the FUD!

Follow-up posts:

For more information about Nutanix XCP scalability see the following posts:

1. Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 3 – Scalability

2. Scaling Hyper-converged solutions – Compute only.

3. Scale Storage separately to Compute on Nutanix!