Solving Oracle & SQL Licensing challenges with Nutanix

The Nutanix platform has evolved, and will continue to evolve, to meet and exceed ever-increasing customer and application requirements while working within constraints such as licensing.

Oracle and Microsoft SQL Server are two of the most common workloads where I work with customers to design solutions around real or perceived licensing constraints.

In years gone by, Nutanix solutions were constrained to being built around a limited number of node types. When I joined in 2013 only one node type existed (the NX-3450), which limited customers’ flexibility and often led to paying more for licensing than a traditional 3-tier solution would require.

With that said, the ROI and TCO for Nutanix solutions back then were still more often than not favourable compared to 3-tier, and these days there is only more good news for prospective and existing customers.

Nutanix has now rounded out the portfolio with the introduction of “Compute Only” nodes to target a select few niche workloads with real or perceived licensing and/or political constraints.

Compute Only nodes complement the traditional HCI nodes (Compute+Storage) as well as our unique Storage Only nodes which were introduced in mid 2015.

So how do Compute Only nodes help solve these licensing challenges?

In short, Oracle leads the world in misleading and intimidating customers into paying more for licensing than they need to. One of the most ridiculous claims is “You must license every physical CPU core in your cluster because Oracle could run, or could have run, on it”.

The tweet below pokes fun at Oracle and shows how ridiculous their claim is that customers need to license every node in a cluster (a claim I’ve never seen referenced in any actual contract).

So let’s get to how you can design a Nutanix solution to meet a typical Oracle customer licensing constraint while ensuring excellent Scalability, Resiliency and Performance.

At this stage, let’s assume you’ve given your firstborn child and left leg to Oracle and have subsequently been granted, for example, 24 physical core licenses. What next?

If we were to use HCI nodes, some of the CPU would be utilised by the Nutanix Controller VM (CVM), and while the CVM does add a lot of value (see my post Cost vs Reward for the Nutanix Controller VM), you may be so constrained by licensing that you want to maximise the CPU power available for just the Oracle workloads.

Now in this example we have 24 licensed physical cores, so we could use two Compute Only nodes, each with dual Intel Xeon Gold 6128 processors [6 cores / 3.4 GHz], giving 12 cores per server and 24 physical cores in total.
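As a quick sanity check, the licensing arithmetic for this example can be sketched as follows. This is a minimal illustration only; the node count, sockets per node and Gold 6128 core count are simply the assumptions from the example above.

```python
# Minimal sketch: check a proposed Compute Only configuration fits within
# an Oracle physical core license entitlement (example figures only).

licensed_cores = 24          # physical core licenses granted (example above)
nodes = 2                    # Compute Only nodes
sockets_per_node = 2         # dual socket servers
cores_per_socket = 6         # Intel Xeon Gold 6128

total_cores = nodes * sockets_per_node * cores_per_socket
print(f"Physical cores able to run Oracle VMs: {total_cores}")

assert total_cores <= licensed_cores, "Configuration exceeds licensed cores"
```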

Next we would assess the storage capacity, resiliency and performance requirements and decide how many Storage Only nodes, and in what configuration, are required.

Because Virtual Machines cannot run on Storage Only nodes, the Oracle Virtual Machines can never run on any CPU cores other than those in the two Compute Only nodes, so you would be in compliance with your licensing.

The below is an example of what the environment could look like.

[Image: Two Compute Only nodes plus four Storage Only nodes]

Microsoft SQL Server has ever-changing licensing models which in some cases license by server or vCPU count. Compute Only nodes can be used in the same way I explained above to address SQL licensing constraints.

What about if I need to scale storage capacity and/or performance?

You’re in luck: without any modifications to the Oracle workloads, you can simply add one or more Storage Only nodes to the cluster and almost immediately increase capacity, performance and resiliency!

I’ve published an example of the performance improvement by adding storage only nodes to a cluster in an article titled Scale out performance testing with Nutanix Storage Only Nodes which I wrote back in 2016.

In short, the results show that by doubling the number of nodes from 4 to 8, the performance almost exactly doubled while delivering low read and write latency.

What if you’ve already invested in Nutanix HCI nodes (example below) and are running Oracle/SQL or any other workloads on the cluster?

[Image: Typical HCI cluster]

Nutanix provides the ability to convert an HCI node into a Storage Only node, which prevents Virtual Machines from running on that node. All you need to do is add two or more Compute Only nodes to the cluster and mark the existing HCI nodes as Storage Only; the result is shown below.

[Image: Compute Only nodes plus HCI nodes converted to Storage Only]

This is in fact the minimum supported configuration for Compute Only environments to ensure minimum levels of resiliency and performance. For more information, check out my post “Nutanix Compute Only Minimum requirements”.

Now we have two nodes (Compute Only) which can run Virtual Machines and four nodes (HCI nodes converted to Storage Only) which are servicing the storage I/O. In this scenario, if the HCI nodes have unused CPU and/or RAM the Nutanix Controller VM (CVM) can also be scaled up to drive higher performance & lower latency.

Compute Only is currently available with the Nutanix Next Generation Hypervisor “AHV”.

Now let’s cover off a few of the benefits of running applications like Oracle & SQL on Nutanix:

  1. No additional Virtualization licensing (AHV is included when purchasing Nutanix AOS)
  2. No rip and replace for existing HCI investment
  3. Unique scale out distributed storage fabric (ADSF) which can be easily scaled as required
  4. Storage Only nodes add capacity, performance and resiliency to your mission critical workloads without incurring additional hypervisor or application licensing costs
  5. Compute Only allows scale up and scale out of CPU/RAM resources where applications are constrained only by CPU/RAM and/or application software licensing.
  6. Storage Only nodes can also provide functions such as Nutanix Files (previously known as Acropolis File Services or AFS)

As a result of Nutanix now having HCI, Storage Only and Compute Only nodes, we’re entering the time where Nutanix can truly be the standard platform for almost any workload, including those with non-technical constraints such as politics or application licensing which have traditionally been (at least perceived as) an advantage for legacy SAN products.

The beauty of the Nutanix examples above is while they look like a traditional 3-tier, we avoid the legacy SAN problems including:

1. Rip and Replace / High Impact / High Risk Controller upgrades/scalability
2. Difficulty in scaling performance with capacity
3. Inability to increase resiliency without adding additional Silos of storage (i.e.: Another dual controller SAN)

With Compute Only being supported by AHV, we also help customers avoid the unnecessary complexity and related operational costs of managing ESXi deployments, which have become increasingly complex over time without significantly improving value for the average customer who simply wants a high-performance, resilient and easy-to-manage virtualisation solution.

But what about VMware ESXi customers?

Obviously moving to AHV would be ideal, but those who cannot, for whatever reason, can still benefit from Storage Only nodes which provide increased storage performance and resiliency to the Virtual Machines running on ESXi.

Customers can run ESXi on Nutanix (or OEM / Software Only) HCI nodes and then scale the cluster’s performance/capacity with AHV based Storage Only nodes, eliminating the need to license ESXi or Oracle/SQL on those nodes since no Virtual Machine will run on them.

How does Nutanix compare to a leading all flash array?

For those of you who would like to see an HCI only Nutanix solution deliver better TCO as well as better performance and capacity than a leading All Flash Array, check out A TCO Analysis of Pure FlashStack & Nutanix Enterprise Cloud, where even after giving every possible advantage to Pure Storage, Nutanix still comes out on top without data reduction assumptions.

Now consider that Nutanix achieved better TCO, performance and capacity than a leading All Flash Array using only HCI nodes, and imagine the increased efficiency and flexibility gained by being able to mix and match HCI, Storage Only and Compute Only nodes.

This is just another example of how Nutanix is eliminating even the corner use cases for traditional SAN/NAS.

For more information about Nutanix Scalability, Resiliency and Performance, check out this multi-part blog series.

Nutanix Scalability – Part 3 – Storage Performance for a single Virtual Machine

Continuing on from Part 1, where we discussed how Nutanix can scale storage capacity separately to compute using Storage Only nodes, we will now cover how Nutanix can scale the storage performance of a Virtual Machine, including beyond the capabilities of a single (scaled up) node.

Virtual machines, just like traditional physical servers, benefit from having multiple storage controllers (e.g.: RAID controllers) and multiple drives regardless of their type (e.g.: HDD/SSD etc).

The same is true for VMs on Nutanix ADSF, more storage controllers and more virtual disks increase the storage performance.

For traditional hypervisors such as ESXi and Hyper-V, ensure you have the maximum of four Paravirtual SCSI (PVSCSI) controllers assigned to VMs requiring the highest performance and lowest latency. Having multiple controllers means more queues are available to the virtual disks and therefore fewer bottlenecks which can cause latency and inefficiency for the vCPUs assigned to the VM.

Because the benefits of multiple virtual SCSI adapters are so significant, Nutanix decided to ensure this functionality is achieved by default when using Nutanix’ next generation hypervisor, AHV, with what is known as “Turbo Mode”.

This means virtual machines running on AHV are optimised by default at the virtual storage controller layer, removing the complexity for customers having to understand and configure virtual storage controllers.

Regarding virtual disks, for optimal performance you’ll need at least four (i.e.: one per SCSI controller), assuming you’re using the recommended four Paravirtual SCSI controllers. But since we’re talking about scaling performance, let’s talk about more extreme examples.

Let’s say we have an MS Exchange server with 20 databases. The performance requirements for each database are typically in the range of hundreds of IOPS, in which case I would recommend one virtual disk (e.g.: VMDK) per database and another for the logs.

In the case of a large MS SQL server which may require tens or hundreds of thousands of IOPS to a single database, I recommend using multiple vDisks per database which involves Splitting SQL datafiles across multiple VMDKs to optimise VM performance.

In both examples, the virtual disks would be spread evenly across the four PVSCSI controllers when using ESXi or Hyper-V, whereas AHV customers would just create the vDisks and each vDisk would by default enjoy a dedicated path directly to Nutanix ADSF’s I/O engine, called “stargate”.
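To make the “spread evenly” point concrete, here is a minimal sketch of a round-robin mapping of virtual disks to the four PVSCSI controllers. It is purely illustrative; the vDisk names and layout are made up, and AHV customers don’t need to do this at all thanks to Turbo Mode.

```python
# Illustrative only: evenly distribute virtual disks across the recommended
# four Paravirtual SCSI (PVSCSI) controllers on ESXi or Hyper-V.

NUM_CONTROLLERS = 4

def assign_vdisks(vdisk_names):
    """Return a mapping of PVSCSI controller index -> list of vDisk names."""
    layout = {c: [] for c in range(NUM_CONTROLLERS)}
    for i, name in enumerate(vdisk_names):
        layout[i % NUM_CONTROLLERS].append(name)
    return layout

# Example: the Exchange server above with 20 database vDisks plus one for logs.
vdisks = [f"DB{i:02d}" for i in range(1, 21)] + ["Logs"]

for controller, disks in assign_vdisks(vdisks).items():
    print(f"PVSCSI {controller}: {', '.join(disks)}")
```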

For more information about configuring virtual storage controllers and multiple virtual disks, see: SQL & Exchange performance in a Virtual Machine which goes through the process step by step.

At this stage we’ve learned that to get optimal performance we need to use multiple virtual disks regardless of hypervisor, and for traditional hypervisors (ESXi & Hyper-V) we need to assign multiple PVSCSI adapters and spread the virtual disks evenly across them.

Let’s say you have a VM running a monster SQL workload, and the nodes/cluster has been sized correctly and the active working set (data) resides 100% in the SSD tier or it’s an all flash cluster.

The VM is also running on AHV enjoying Turbo Mode (or ESXi/Hyper-V with four PVSCSI controllers) and you’ve added 16 vDisks and spanned your database across the vDisks, BUT you still need more performance. What can we do next?

The good news is, Nutanix has lots of ways to scale performance, so let’s look at a few of them:

  • Increase the vCPU of the Nutanix Controller VM (CVM)

This is rarely required, but it’s important to understand that Nutanix is just software running inside a VM, so simply increasing the vCPUs assigned to the CVM gives more available power to drive front end I/O as well as background cluster functionality.

The CVM automatically allocates N-2 of its vCPUs to stargate (the I/O engine), which means if you add more vCPUs to the CVM, you get more potential front end and back end I/O.

If your application performance is being impacted by the local CVM being saturated (firstly, well done as this is very rare), adding say 2 more vCPUs to the CVM may be enough to alleviate the bottleneck and give you much improved performance. I’ve seen this situation before, and for the relatively low “cost” of 2 vCPUs it can be well worth it.

It’s important to note you can increase the vCPUs of just a single CVM, multiple CVMs or all CVMs within the cluster depending on your requirements and cluster design. For example, in a mixed cluster of nodes with 22c processors and 10c processors, you may move the critical VMs to the nodes with 22c processors and increase those CVMs by 2 vCPUs while leaving the nodes with the smaller 10c processors at the default CVM size. This would deliver increased performance for the entire cluster, while the greatest benefit would be felt on the 22c nodes.
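The N-2 relationship mentioned above is simple, but a small sketch makes the effect of scaling up the CVM obvious. The CVM sizes below are just example figures; always validate CVM sizing for your environment with Nutanix support.

```python
# Illustrative only: under the N-2 rule described above, adding CVM vCPUs
# increases the vCPUs available to stargate (the I/O engine).

def stargate_vcpus(cvm_vcpus: int) -> int:
    return max(cvm_vcpus - 2, 0)

for cvm in (8, 10, 12):   # e.g. default size vs +2 vCPUs on the larger nodes
    print(f"CVM vCPUs: {cvm} -> vCPUs available to stargate: {stargate_vcpus(cvm)}")
```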

For those interested in the pros and cons of the CVM and its use of host resources, please review: Cost vs Reward for the Nutanix Controller VM (CVM)

  • Increase the vRAM of the Nutanix Controller VM (CVM)

Increasing the CVM’s RAM is another quick and easy way to improve performance. There are two main reasons adding RAM can improve performance. The first is that part of the CVM’s RAM acts as a read cache, so depending on your application and dataset size, the additional read cache can make a huge difference.

The second is that additional CVM RAM allows a larger medusa (metadata) cache, which helps minimise read latency.

If you look at http://CVM_IP:2009/cache_stats (example below) and your “Range Cache Hit %” is 50%, then you’re getting very good cache hits, whereas if it were just 5%, more RAM may result in significantly better read performance depending on the working set size.

The other critical factor for performance is the medusa cache. We want to see as close to 100% as possible for the “VDisk block map Cache Hit %” & “Extent group id map Cache Hit %”.

[Image: Stargate cache_stats page]

The above is an example of a system which has an optimal CVM RAM size for the working set, as the Range Cache Hit % is 50% and the “VDisk block map Cache Hit %” & “Extent group id map Cache Hit %” are both sitting consistently at 100%.

The above cache and medusa hit rates are from a test cluster and it was achieving the following performance for a database checksum task (100% read).

[Image: Example performance for a database checksum task with 100% medusa hit rate]

The key here is the very low read latency which peaked at 0.35ms and sustained around 0.18ms over the course of several hours.

A sign of insufficient CVM RAM can be inconsistent read latency, so if you’re observing this issue, review http://CVM_IP:2009/cache_stats and contact support for advice on CVM RAM sizing.
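If you want to keep an eye on those hit rates over time, something along the lines of the sketch below can help. Note the 2009 page is a diagnostic page, not a stable API, so its layout can change between AOS versions; the parsing and thresholds here are purely illustrative (the thresholds are just the rules of thumb discussed above).

```python
# Rough, illustrative check of the stargate cache_stats hit rates.
# The page layout is not a stable API, so the regex may need adjusting.
import re
import urllib.request

CVM_IP = "192.168.1.10"   # placeholder: use one of your CVM IPs
page = urllib.request.urlopen(f"http://{CVM_IP}:2009/cache_stats", timeout=5).read().decode()

def hit_pct(label):
    """Best-effort: find a percentage value following the given label."""
    match = re.search(rf"{re.escape(label)}\D*([\d.]+)", page)
    return float(match.group(1)) if match else None

# Rough targets: medusa caches close to 100%, range cache ~50% is already good.
targets = {"VDisk block map Cache Hit %": 95.0,
           "Extent group id map Cache Hit %": 95.0,
           "Range Cache Hit %": 50.0}

for label, target in targets.items():
    value = hit_pct(label)
    status = "OK" if value is not None and value >= target else "review CVM RAM sizing"
    print(f"{label}: {value} (target ~{target}%) -> {status}")
```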

Note: There is no “harm” in adding more CVM RAM as long as the CVM is sized within a NUMA node to ensure memory performance remains optimal; the only impact is less available RAM for other Virtual Machines.

Let’s recap where we’re at:

The VM is on AHV enjoying Turbo Mode (or ESXi/Hyper-V with four PVSCSI controllers) with 16 vDisks and the database spanned across them. We’ve increased the CVM vCPUs and verified we have 100% hit rates for medusa and respectable 50% read cache hit rates, but we still need more performance. What else can we do?

  • Add storage only nodes

The benefits of adding Storage Only nodes, especially to a busy cluster, are not only immediate but obvious when we look at the total IOPS and the read and write latency.

If you’ve not read my post titled “Scale out performance testing with Nutanix Storage Only Nodes” I will quickly recap it for you, but I recommend reading the full article.

In short, I ran an MS Exchange Jetstress workload on 4 VMs on an optimally configured 4 node hybrid (SSD+SATA) cluster and achieved the following results.

[Image: Jetstress results summary, 4 node cluster]

Observations from the baseline test:

  1. We achieved the desired >1000 IOPS per VM
  2. Performance was consistent across all Jetstress instances
  3. Log writes were in the 1ms range as they were serviced by the ADSF Oplog (persistent write buffer)
  4. Database reads were on average just under 10ms which is well below the Microsoft recommended 20ms
  5. The Database creation time averaged 2hrs 24mins
  6. The duplication of 3 databases averaged 4hrs 17mins
  7. The database checksum took on average around 38mins

I then added 4 more nodes to the cluster, and without making any changes to Jetstress, the virtual machines or the cluster configuration, the IOPS jumped by 2x!!

The results for each of the four Jetstress VMs are shown below, including the average across the VMs for each of the different metrics.

[Image: Jetstress results summary, 8 node cluster]

In summary, adding the 4 Storage Only nodes:

  • Achieved IOPS jumped by almost 2x
  • Log writes average latency was lower by 13%
  • Database write latency dropped by >20%
  • Database read latency dropped by almost 2x
  • The Database creation time was just under 15 mins faster
  • The duplication of 3 databases improved by almost 35 mins
  • The database checksum was 40 seconds faster.

As we can see from these results, adding storage only nodes can significantly increase the performance without any tuning. Had I tuned the Jetstress configuration, much higher performance and potentially lower read/write latency could have also been achieved.

In short, adding storage only nodes is a quick win for performance with the added advantage of increasing the resiliency and capacity of the cluster.

So we’ve now achieved much higher performance for our workload thanks to a combination of optimally configured VM, CVM and the addition of storage only nodes.

If at this stage you’re still not achieving the performance you require, you’re in the 1% where we may need to utilise Acropolis Block Services (ABS) to further improve performance.

  • Acropolis Block Services (ABS)

ABS was announced in 2016 to address edge use cases for customers who wanted to make Nutanix the standard platform for their datacenters but had not been able to realise this vision for a number of reasons, including:

  • The desire/requirement to re-use existing servers
  • Applications which are not virtual (for many reasons, mostly political)
  • Performance / Scalability of externally connected servers
  • Complexity including operational considerations of external iSCSI

For more detailed information about the release please review: What’s .NEXT 2016 – Acropolis Block Services (ABS)

ABS works by using In-guest iSCSI to present vDisks direct to the Guest OS. The vDisks are then automatically load balanced across the entire Nutanix cluster to provide optimal performance.

The below tweet answers the FAQ around how distributed the workload is when using ABS. As we see below, a 4 node cluster uses 4 paths, and when the cluster is expanded to 8 nodes, ABS automatically (and almost instantly) expands to use 8 paths (or CVMs).

The downside of ABS is the loss of data locality, but if we can’t have data locality, the next best thing is a highly scalable, resilient and dynamic distributed storage fabric.

ABS can scale performance in a linear manner which is only limited by the network bandwidth and the number of nodes available to drive the I/O, so a physical server with say 100Gb NICs and a cluster of 32 nodes would produce ridiculous levels of performance in the multi-million IOPS range.
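To put a rough number on that claim, here is some back-of-the-envelope arithmetic. The 8KB I/O size is my assumption, and it treats the NIC as the only bottleneck, so it is an upper bound rather than a benchmark result.

```python
# Back-of-the-envelope only: network-bound IOPS ceiling for an ABS client,
# assuming a fixed I/O size and ignoring CPU, protocol and storage overheads.

nic_gbit_per_sec = 100                 # e.g. a single 100Gb NIC
io_size_bytes = 8 * 1024               # assumed 8KB I/O size

bytes_per_sec = nic_gbit_per_sec * 1e9 / 8
iops_ceiling = bytes_per_sec / io_size_bytes

print(f"~{iops_ceiling / 1e6:.2f} million IOPS ceiling per 100Gb NIC at 8KB I/Os")
```

With multiple NICs in the server and 32 nodes’ worth of CVMs to service the I/O, multi-million IOPS figures are plausible from a pure bandwidth perspective.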

The in-guest iSCSI setup is also very simple: just set the iSCSI target to the Nutanix cluster IP and the load balancing is dynamically calculated. When the cluster size increases, the vDisks are automatically balanced across the new nodes without user intervention, and the same is true for node removals, maintenance, upgrades, failures etc. Everything is managed automatically, so ABS is a very simple iSCSI implementation for admins.
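For a Linux guest, that setup really is just a discovery against the cluster IP followed by a login. The sketch below wraps the standard open-iscsi commands; the IP is a placeholder and your environment may additionally require CHAP or client whitelisting.

```python
# Illustrative in-guest iSCSI setup for ABS on a Linux guest using open-iscsi.
import subprocess

CLUSTER_IP = "192.168.1.200"   # placeholder: Nutanix cluster / data services IP

# Discover the iSCSI targets presented by the Nutanix cluster.
subprocess.run(["iscsiadm", "-m", "discovery", "-t", "sendtargets",
                "-p", f"{CLUSTER_IP}:3260"], check=True)

# Log in to the discovered targets; ABS balances the vDisks across CVMs.
subprocess.run(["iscsiadm", "-m", "node", "--login"], check=True)
```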

Summary:

Nutanix provides excellent scalability for Virtual Machines and provides ABS for niche workloads which may require more performance than a single node can offer.

Up next, Part 4 where we cover the latest and most exciting development for scaling storage Performance for Monster VMs.

Back to the Scalability, Resiliency and Performance Index.

Nutanix Scalability – Part 2 – Compute (CPU/RAM)

Following on from Part 1 of the Scalability series, where we discussed how Nutanix can scale storage capacity separately to compute, the next obvious topic is scaling CPU and memory resources at both the workload and cluster level.

Let’s first recap the problems with scaling compute with traditional shared storage.

[Image: Compute only nodes accessing storage across the network, resembling traditional 3-tier]

Yuk! That looks like old school 3-tier stuff to me!

Non HCI workloads on compute only nodes would therefore:

  • Be running in the same setup as traditional 3-tier infrastructure
  • Have different performance than HCI based workloads
  • Lose the advantage of having compute + storage close together
  • Increase dependency on Network
  • Impact network utilization of HCI node/s
  • Impact benefits of HCI for the native HCI workloads and much more.

The industry has accepted HCI as the way of the future, and while adding compute only nodes might sound nice at a high level, it just re-introduces the classic 3-tier complexity and problems of the past. If we review the actual requirements, it’s very rare to see a correctly sized and configured Nutanix node with insufficient resources.

Customers are often surprised when they show me their workloads and I don’t seem surprised by the CPU/RAM, storage I/O or capacity requirements. I can’t tell you how many times I’ve made statements like “Your application’s requirements are not that high, I’ve seen much worse!”.

Examples of scaling compute with Nutanix

Example 1: Scaling up a Virtual Machine’s compute resources:

SQL/Oracle DBA: Our application is growing/running slowly, we need more CPU/RAM!

Nutanix: You have several options:

a) Scale up the virtual machine’s vCPUs and vRAM to match the size of the NUMA node.
b) Scale up the virtual machine’s vCPUs and vRAM to the same number of pCores as the host minus the Nutanix CVM’s vCPUs, and do the same with the RAM.

The first option is optimal as it ensures maximum memory performance, because the CPU will be accessing memory within the NUMA boundary. The second option is still viable, however, and for applications such as SQL the impact of insufficient memory can be higher than the penalty of crossing a NUMA boundary.
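As a concrete illustration of options a) and b), the sizing arithmetic looks something like the sketch below. The host figures and CVM size are made up for the example; use your actual socket/core/RAM layout and CVM configuration.

```python
# Illustrative sizing for the two scale-up options above (example figures only).

host_pcores = 44        # e.g. dual 22-core sockets
host_ram_gb = 512
numa_nodes = 2          # one NUMA node per socket in this example
cvm_vcpus = 8
cvm_ram_gb = 32

# Option a) Size the VM within a single NUMA node for optimal memory locality.
print(f"Option a: up to {host_pcores // numa_nodes} vCPUs / "
      f"{host_ram_gb // numa_nodes}GB RAM (within one NUMA node)")

# Option b) Size the VM to the whole host minus the CVM (crosses the NUMA boundary).
print(f"Option b: up to {host_pcores - cvm_vcpus} vCPUs / "
      f"{host_ram_gb - cvm_ram_gb}GB RAM (host minus CVM)")
```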

BUT MY WORKLOAD IS UNIQUE, IT NEEDS A PHYSICAL SERVER!!

Despite hearing these types of statements from prospective and existing customers, very few workloads actually need more CPU/RAM than a modern Nutanix (or OEM/Software-only) node can provide, even after you subtract the resources for the Controller VM (CVM). I find that it’s usually only a perceived requirement for physical servers and in reality, a reasonably sized VM on a standard node will deliver the desired business outcome/s comfortably.

Currently Nutanix NX nodes support Intel Xeon Platinum 8180 processors which have 28 physical cores @ 2.5 GHz per socket, for a total of 56 physical cores (112 threads) per dual socket node.

If you had, say, an existing physical server using fairly modern dual Intel Broadwell E5-2699 v4 processors (22 physical cores each), you would have a total SPECint_rate of 1760, or 40 per core.

Compare that to the Intel Xeon Platinum 8180 and you have a total SPECint_rate of 2720, or 48.5 per core.

This is an increase per core of 21.25%.

So if you move that workload from the physical server with Intel Broadwell E5-2699 v4 CPUs (44 cores) to Nutanix with ZERO CPU overcommitment (vCPU:pCore ratio 1:1) using the Intel Xeon Platinum 8180 processors, and assuming we reserve 8 pCores for the CVM, we still have 48 pCores for the SQL VM.

That’s a SPECint_rate of 2328, which is higher than the physical server using all of its cores.

That’s over 32% more CPU performance for the Virtual Machine compared to the dedicated physical server.
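The arithmetic behind those percentages is straightforward; here it is spelled out using the SPECint_rate figures quoted above (the article rounds the 8180 to 48.5 per core, which is where the 21.25% and 2328 figures come from).

```python
# Worked version of the SPECint_rate comparison above.

# Existing physical server: dual Intel Broadwell E5-2699 v4 (2 x 22 cores).
broadwell_total, broadwell_cores = 1760, 44
broadwell_per_core = broadwell_total / broadwell_cores      # 40.0

# Nutanix node: dual Intel Xeon Platinum 8180 (2 x 28 cores).
platinum_total, platinum_cores = 2720, 56
platinum_per_core = platinum_total / platinum_cores         # ~48.6 (48.5 as rounded above)

print(f"Per-core uplift: {platinum_per_core / broadwell_per_core - 1:.1%}")

# VM sized 1:1 vCPU:pCore after reserving 8 pCores for the CVM.
vm_pcores = platinum_cores - 8                              # 48 pCores
vm_specint = vm_pcores * platinum_per_core
print(f"VM SPECint_rate: ~{vm_specint:.0f} vs physical server: {broadwell_total}")
print(f"VM advantage over the physical server: {vm_specint / broadwell_total - 1:.1%}")
```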

The reality is the Nutanix CVM and Acropolis Distributed Storage Fabric (ADSF) provides high performance, low latency storage which also drives further CPU efficiency by eliminating CPU WAIT (CPU cycles wasted waiting for I/O to complete).

As you can see from this simple example, a Virtual Machine on Nutanix can easily replace even a modern physical server and even provide better performance with only one generation newer CPU. Think about how your 3-5 year old physical servers will feel when they jump multiple generations of CPU and get scale out flash based storage.

Example 2: A VM (genuinely) needs more CPU/RAM than Nutanix nodes have.

SQL/Oracle DBA: Our application/s need more CPU/RAM than our biggest node/s can provide.

Nutanix: You have several options:

a) Purchase one or more larger nodes (e.g.: NX-8035-G6 with Intel Xeon Platinum or Gold processors), add them to the existing cluster and live migrate your VM/s to those nodes. Use affinity rules to keep critical VMs on the highest performance nodes.

Nutanix supports mixing different hardware types/generations in the same cluster and this can be a preferred option over creating a dedicated cluster for several reasons.

  • Larger clusters provide more targets for replication traffic (i.e.: RF2 or RF3) meaning lower average write latency
  • Larger clusters provide higher resiliency as they can potentially tolerate more failures and rebuild faster following a drive, node or multi-node failure.
  • Larger clusters help ensure the impact of a failure is lower as a lower percentage of cluster resources are lost

b) Purchase one or more larger nodes (e.g.: NX-8035-G6 with Intel Xeon Platinum or Gold processors), create a new cluster and migrate your VM/s to that cluster.

A dedicated cluster may sound attractive, but in most cases I recommend mix workload clusters as they ultimately provide higher performance, resiliency and flexibility.

c) Scale out your workloads

Applications like MS Exchange, MS SQL and Oracle RAC can (and arguably should) be scaled out rather than scaled up, as doing so provides increased performance and resiliency and reduces overall infrastructure costs (e.g.: more of the cheaper/smaller processors can be used as opposed to premium processors like the Intel Xeon Platinum series).

One large VM hosting dozens of databases is rarely a good idea, so scale out and run more VMs, distributed across your Nutanix cluster and spread the workload across all the VMs.

For 99% of workloads, I do not see the real world value of compute only nodes. But there are always exceptions to every rule.

Potential Exceptions:

Example 3: Re-using existing hardware

SQL DBA: I love my Nutanix gear (duh!) but I have some physical servers which won’t be end of life for 12 months. Can I continue using them with Nutanix?

Nutanix: We have several options:

a) If the hardware is on our Software-only hardware compatibility list (HCL), it’s possible you can purchase SW-only licenses and deploy Nutanix on your existing hardware.

b) Use Nutanix Acropolis Block Services (ABS) to provide highly available scale out storage to your physical server via iSCSI.

ABS was released in 2015 and supports SCSI-3 persistent reservations for shared storage-based Windows clusters, which are commonly used with Microsoft SQL Server and clustered file servers.

ABS supports several use cases, including:

  • iSCSI for Microsoft Exchange Server
  • Shared storage for Linux-based clusters
  • Windows Server Failover Clustering (WSFC)
  • SCSI-3 persistent reservations for shared storage-based Windows clusters
  • Shared storage for Oracle RAC environments
  • Bare-metal environments

Therefore ABS allows you to re-use your existing hardware to maximise your return on investment (ROI) while getting the benefits of ADSF. Once the hardware is end of life, the storage already on the Nutanix cluster can be quickly presented to a VM so the workload will benefit from the full Nutanix HCI experience.

Future Capabilities:

In late 2017, Nutanix announced Nutanix Acropolis Compute Cloud (AC2) which will provide the ability to have true compute-only nodes in a Nutanix cluster as shown below.

I reluctantly mention this upcoming capability because I do not want to see customers go back to a 3-tier model or think that HCI isn’t the way forward, because it is. That’s not what compute-only is about.

This capability is specifically designed to work around the niche circumstances where a software vendor, such as Oracle, is extorting customers from a licensing perspective and it’s desirable to maximise the CPU cores the application can use.

Let me have a quick rant and put an end to the nonsense before it gets out of hand:

IT IS NOT FOR GENERAL VM USE!

NO, IT’S NOT FOR PERFORMANCE REASONS.

NO NUTANIX IS NOT MOVING BACK TO A 3-TIER COMPUTE+STORAGE MODEL.

HCI WITH NUTANIX IS STILL THE WAY FORWARD

Summary:

Nutanix provides excellent scalability at the CPU/RAM level for both virtual and physical workloads. In rare circumstances where physical servers are a real (or likely just a perceived) requirement, ABS can be used, and Nutanix will soon also provide Compute Only for AHV customers to ensure licensing value is maximised in those rare cases.

Back to the Scalability, Resiliency and Performance Index.