Exchange 2013 & VMware – The latest round.

Recently there has been some back and forth on social media (Twitter & Blogs) around the following article published on the Exchange Team Blog:

Troubleshooting High CPU utilization issues in Exchange 2013
The article discusses several topics including Common Configuration Issues, Over-sizing and several performance metrics which can help Exchange administrators identify and hopefully resolve performance problems.
As a result of some of the recently revised recommendations, the Exchange 2013 Server Role Requirements calculator has been updated to match.

The calculator can be found here: Exchange 2013 Server Role Requirements Calculator.

One key recommendation from Microsoft is the maximum Exchange Mailbox (or Multi-Role) Server sizing now being:

Recommended Maximum CPU Core Count: 24

Recommended Maximum Memory: 96 GB

In reply to the above, VMware have released the following article.

A Stronger Case For Virtualizing Exchange Server 2013 – Think “Performance”

Personally, I don’t think the article has had (or will have) the effect VMware and Virtualization architects/admins would have liked, due to its negativity. As such I am going to highlight the important points in this post.

In response to VMware’s article, a well known member of the Exchange community and MVP, Tony Redmond wrote the below article.

VMware tells Microsoft that they don’t know anything about Exchange 2013 performance

In this case, even as a strong Virtualization (and VMware) evangelist, I think Tony has some valid points.

Tony mentions the following regarding the sizing tool:

Most experienced people take the output from any general-purpose sizing tool and cast a cold eye over its recommendations to put them into context with the operational and business requirements for a deployment. In other words, the recommendations are adjusted. And yes, sometimes those recommendations are adjusted to make sure that Exchange 2013 works well when deployed on virtualized servers, either Hyper-V or VMware.

I totally agree! Regardless of whether the sizing tool recommends more CPUs or not, experienced architects should base their decisions on many factors, the most important being their real world experience. The calculator should be a secondary (or tertiary) input, adjusted for the underlying infrastructure, whether that is physical or virtualized on one of many viable hypervisors.

I also agree with the following point:

Third, wouldn’t it be better use of VMware’s time to publish well-argued and pertinent observations on how you can take the output of Microsoft sizing tools and adjust them for their platform

VMware can be considered the experts in their own technology and, as Tony suggested, they should publish and continue to update documentation on how best to deploy Exchange on vSphere successfully. There is no advantage in an “I told you so” style blog post when Microsoft have actually published recommendations which, to the one point of VMware’s blog that I agree with, strengthen the case for Virtualization of Exchange.

As for companies like Nutanix, who have numerous Virtualization experts across multiple hypervisors including vSphere, Hyper-V and Acropolis (which is fully supported for Windows 2012 and Exchange), we should also post reference architectures and best practice guides on how to deploy Exchange successfully.

Tony also commented in reply to VMware’s post which was not favourable towards Multi-role deployments:

Combined role means a multi-role server, which I think that every reasonable expert working with Exchange has concluded is the only way to go because it increases server utilization and improves the overall resilience of any deployment. But it’s a bad thing in VMware’s world, which is a pity for them because Exchange 2016 only supports multi-role servers, so I guess they will just have to get used to that fact.

VMware’s point here is that scaling out (rather than up) generally means better overall consolidation and performance. To be fair this is true, but one cannot apply a blanket rule for all applications.

Deploying Exchange 2013 in a Multi-Role configuration is a good way to simplify as well as ensure consistent performance and resiliency in the event of a server failure as all servers can service all roles.

As Exchange is generally considered a Business Critical Application, Virtualization architects like myself should consider the recommendations for the application and the criticality of the solution to the customer, and make an informed recommendation for its deployment. For an app like Exchange, I think it’s fair to say the MS Exchange team have solid justification to recommend MSR deployments.

The impact of less scale out (i.e.: MBX + CAS) is simply a design consideration, and not an uncommon one so this in my opinion is not an issue for virtual deployments.

Key Takeaways:

1. Ensure you size within Microsoft’s revised vCPU / vRAM recommended configuration maximums (24 vCPUs / 96GB RAM)

2. When sizing Exchange on any Hypervisor, start small and scale up vCPU/vRAM requirements as required to avoid oversizing. (This is a major advantage of virtualization, so don’t be afraid to use it!)

3. Multi-Role Deployments are perfectly fine for Virtual environments. Exchange is a Business Critical Application, so treat it as such in your design phase.

Final thoughts:

With the ever increasing number of cores per socket, the case to virtualize Exchange is strengthened when considering 12-18c CPUs are not uncommon these days. As such an 18vCPU / 96GB RAM Exchange 2013 MSR VM could be virtualized with zero CPU/RAM overcommitment (as is generally recommended) while running other VMs on the same host.
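As a rough illustration of the sizing point above, the following minimal sketch checks a VM against the 24 vCPU / 96GB maximums and reports how much of a host is left over with no overcommitment. The host specification (2 sockets x 18 cores, 256GB RAM) and all names are hypothetical, and the sketch ignores hypervisor and CVM overheads, so treat it as a starting point rather than a sizing tool.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Microsoft's recommended maximums for an Exchange 2013 server, as quoted in this post.
MAX_EXCHANGE_VCPUS = 24
MAX_EXCHANGE_RAM_GB = 96


@dataclass
class VM:
    name: str
    vcpus: int
    ram_gb: int


def within_exchange_maximums(vm: VM) -> bool:
    """True if the VM stays inside the 24 vCPU / 96 GB recommendation."""
    return vm.vcpus <= MAX_EXCHANGE_VCPUS and vm.ram_gb <= MAX_EXCHANGE_RAM_GB


def host_headroom(host_cores: int, host_ram_gb: int, vms: List[VM]) -> Tuple[int, int]:
    """Return (spare cores, spare RAM in GB). A negative value means overcommitment."""
    used_vcpus = sum(vm.vcpus for vm in vms)
    used_ram_gb = sum(vm.ram_gb for vm in vms)
    return host_cores - used_vcpus, host_ram_gb - used_ram_gb


# Hypothetical host: 2 sockets x 18 cores, 256 GB RAM (assumption, not from the post).
exchange = VM("exchange-msr", vcpus=18, ram_gb=96)
spare_cores, spare_ram_gb = host_headroom(36, 256, [exchange])
print(within_exchange_maximums(exchange), spare_cores, spare_ram_gb)
```

In this hypothetical case the Exchange VM fits within the maximums and half of the host's cores remain available for other VMs without overcommitting CPU or RAM.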

This helps to remove silos within the datacenter as well as driving up utilization (without creating resource contention) which equates to lower cost/power/cooling/maintenance, the list goes on.

While Virtualization does add some complexity & cost, I would argue with newer technologies such as Nutanix which replace complex and costly SAN/NAS storage with simple to deploy and manage scale out storage these (storage) challenges are soon going to be things of the past.

Nutanix also allows customers to choose a premium feature rich hypervisor such as ESXi, or lower cost/feature solutions such as Hyper-V or Acropolis which allows customers to work with what they are comfortable with.

Acropolis, for example, is fully supported by Microsoft (see Microsoft SVVP), can be deployed in <30mins with just a few clicks, and it’s free with Nutanix. So the cost and complexity arguments against virtualization of Exchange just went out the window, and Exchange would have all the benefits of things like Virtual Machine High Availability and Migration (vMotion).

Pleased to hear everyone’s thoughts.

Advanced Storage Performance Monitoring with Nutanix

Nutanix provides excellent performance monitoring and analytic capabilities through our HTML 5 based PRISM UI, but what if you want to delve deeper into the performance of a specific business critical application?

Nutanix also provides advanced storage performance monitoring and workload profiling through port 2009 on any CVM which shows very granular details for Virtual disks.

By default, Nutanix secures our CVM and the http://CVM_IP:2009 page is not accessible, but for advanced troubleshooting this can be enabled by using the following command.

sudo iptables -t filter -A WORLDLIST -p tcp -m tcp --dport 2009 -j ACCEPT

When accessing the 2009 page (which is part of the Nutanix process called “Stargate”) you will see things like Extent (In Memory Read) cache usages and hits as well as much more.

On the main 2009 page you will see a section called “Hosted VDisks” (shown below) which lists all the VDisks (equivalent of a VMDK in ESXi) currently running on that node.

[Figure: Hosted VDisks]

The Hosted VDisks section shows high level details about each VDisk such as Outstanding Operations, capacity usage, Read/Write breakdown and how much data is in the OpLog (Persistent Write Cache).

If you need more information, you can click on the “VDisk Id” and you will get to a page titled “VDisk XXXXX Stats” where the XXXXX is the VDisk ID.

The below is some of the information which can be discovered in the VDisk Stats Page.

VDisk Working Set Size (WSS)

The working set size can be thought of as the data which you would ideally want to fit within the SSD tier of a Nutanix node, which would result in all-flash type performance.

In the below example, in the last 2mins, the VDisk had a combined (or Union) working set of 6.208GB and over the last 1hr over 111GB.

[Figure: VDisk working set size]

VDisk Read Source

The Read Source is simply which tier of storage is servicing the VDisk’s IO requests. In the below example, 41% was from Extent Cache (In Memory), 7% was from the SSD Extent Store and 52% was from the SATA Extent Store.

[Figure: VDisk read source]

In the above example, this was an Exchange 2013 workload where the total dataset was approx 5x the size of the SSD tier. The important point here is it’s not always possible to have all data in the SSD tier, but it’s critical to ensure consistent performance. If 90% was being served from SATA and performance was not acceptable, you could use this information to select a better node to migrate (vMotion) the VM to, or to help decide whether to purchase a new node.
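To make the read source reasoning concrete, here is a minimal sketch that summarizes a read source breakdown and flags a VDisk for investigation when most reads come from SATA. The tier names, the function, and the 90% alert threshold are illustrative assumptions (the threshold simply mirrors the example figure in the text); they are not fields or settings exposed by the 2009 page.

```python
from typing import Dict, Union


def summarize_read_source(percent_by_tier: Dict[str, float],
                          sata_alert_percent: float = 90.0) -> Dict[str, Union[float, bool]]:
    """Summarize where reads are served from and flag heavy SATA usage."""
    total = sum(percent_by_tier.values())
    if abs(total - 100.0) > 1.0:
        raise ValueError(f"percentages should sum to ~100, got {total}")
    sata = percent_by_tier.get("sata_extent_store", 0.0)
    return {
        "memory_or_ssd_percent": total - sata,
        "sata_percent": sata,
        # Worth a closer look only if performance is also unacceptable.
        "investigate": sata >= sata_alert_percent,
    }


# Percentages taken from the Exchange 2013 example in this post.
example = {"extent_cache": 41.0, "ssd_extent_store": 7.0, "sata_extent_store": 52.0}
summary = summarize_read_source(example)
print(summary)
```

With the figures from the example, 48% of reads are served from memory or SSD and 52% from SATA, so the VDisk is not flagged.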

VDisk Write Destination

The Write Destination is fairly self explanatory. If it’s OpLog, it means the IO is random and is being written to SSD. If it’s going straight to the extent store (SSD), it means the IO is either sequential or, in rare cases, the OpLog is being bypassed because the SSD tier has reached 95% full (which is generally prevented by the Nutanix ILM tiering process).

[Figure: VDisk write destination]

VDisk Write Size Distribution

The Write Size Distribution is key to determining things like the Windows Allocation Size when formatting drives as well as understanding the workload.

[Figure: VDisk write size distribution]

VDisk Read Size Distribution

The Read Size Distribution is similar to Write Size in that it’s key to determining things like the Windows Allocation Size when formatting drives as well as understanding the workload. In this case, a 64k allocation size would be ideal as both the Write (shown above) and the Read (below) are >32K and <64K 86% of the time (which is expected as this was an Exchange 2013 workload).

[Figure: VDisk read size distribution]
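The allocation size reasoning above can be sketched as a small helper: find the IO size bucket with the largest share, then pick the smallest allocation unit that covers it. This is only one possible heuristic, not a Nutanix or Microsoft method. The candidate unit sizes and every histogram value other than the 86% share for the 32K-64K bucket are made up for illustration.

```python
from typing import Dict, Tuple

# Candidate allocation unit sizes in KB (assumption: common power-of-two choices).
CANDIDATE_UNITS_KB = (4, 8, 16, 32, 64)


def suggest_allocation_unit_kb(histogram: Dict[Tuple[int, int], float]) -> int:
    """Pick the smallest candidate unit covering the dominant IO size bucket.

    `histogram` maps (lower_kb, upper_kb) buckets to their share of IO in percent.
    """
    (_, upper_kb), _ = max(histogram.items(), key=lambda item: item[1])
    for unit_kb in CANDIDATE_UNITS_KB:
        if unit_kb >= upper_kb:
            return unit_kb
    # Dominant bucket is larger than every candidate: fall back to the largest.
    return CANDIDATE_UNITS_KB[-1]


# Only the 86% figure comes from the post; the other shares are hypothetical.
io_sizes = {(0, 4): 2.0, (4, 8): 3.0, (8, 16): 4.0, (16, 32): 5.0, (32, 64): 86.0}
print(suggest_allocation_unit_kb(io_sizes))
```

For the Exchange 2013 example, the dominant bucket is 32K-64K, so the sketch suggests a 64K allocation unit, in line with the conclusion in the text.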

VDisk Write Latency

The Write Latency shows the percentage of write I/O serviced within each of the latency ranges shown. In this case, 52% of writes are sub-millisecond. It also shows that for this vDisk 1% of IO are outliers, served in 5-10ms. Outside of a lab, if the outliers were a significant percentage, this could be investigated to ensure the VM disk configuration (e.g.: PVSCSI and number of VMDKs) is optimal.

[Figure: VDisk write latency]

VDisk Ops and Randomness

Here we see the number of IOPS, the Read/Write split, MB/s and the split between Random and Sequential.

[Figure: VDisk ops and randomness]

Summary

For any enterprise grade storage solution, it is important that performance monitoring be easy, as it is with Nutanix via the PRISM UI, but also that you can quickly and easily dive deep into very granular details about a specific VM or VDisk. The above shows just a glimpse of the information which is tracked by default for all VDisks, allowing customers, partners and Nutanix support to quickly and easily monitor & profile workloads.

Importantly, these capabilities are hypervisor agnostic, giving customers the same capabilities no matter which hypervisor(s) they choose.

Nutanix – Improving Resiliency of Large Clusters with Erasure Coding (EC-X)

As cluster sizes increase, it is important to understand that the chance of multiple concurrent failures also increases, and to architect solutions to ensure resiliency is maintained.

Because scalability is one of many strengths of the Nutanix Distributed Storage Fabric, Nutanix supports multiple data protection levels (RF2 and RF3) to ensure resiliency can be scaled with cluster size.

However, using RF3 reduces the usable capacity to approximately 33% of the formatted capacity of the drives within the cluster, which means it is sometimes considered undesirable.

But because some customers require the ability to support multiple concurrent node failures without the chance of data loss or unavailability, RF3 has been required.

Enter Nutanix Erasure Coding (EC-X)!

Now let’s say you have a 32 node cluster where each node has 10TB RAW.

With RF3 we would have approx 3.33TB usable per node for a total of 106.56TB in the cluster.

With EC-X enabled (assuming EC-X has been applied to all data) the usable capacity would DOUBLE to 6.66TB per node and 213.12TB for the cluster.
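The capacity arithmetic above can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch that assumes usable capacity is a simple fraction of RAW (1/3 for RF3, 2/3 for RF3 with EC-X applied to all data, and roughly 1/2 for RF2 as a comparison) and ignores any other overheads. Because it uses exact fractions rather than the rounded 3.33TB per node, the totals come out marginally higher than the figures quoted above.

```python
NODES = 32
RAW_TB_PER_NODE = 10.0

# Usable fraction of RAW per protection scheme (simplified assumptions).
USABLE_FRACTION = {
    "RF2": 1 / 2,
    "RF3": 1 / 3,
    "RF3 + EC-X": 2 / 3,  # assumes EC-X has been applied to all data
}


def usable_tb(scheme: str, nodes: int = NODES, raw_tb_per_node: float = RAW_TB_PER_NODE) -> float:
    """Cluster-wide usable capacity in TB for the given protection scheme."""
    return nodes * raw_tb_per_node * USABLE_FRACTION[scheme]


for scheme in USABLE_FRACTION:
    print(f"{scheme}: {usable_tb(scheme):.2f} TB usable of {NODES * RAW_TB_PER_NODE:.0f} TB RAW")
```

Under these assumptions, RF3 + EC-X yields twice the usable capacity of RF3 alone and more than RF2, while still tolerating two concurrent node failures.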

Here’s how it works.

For RF3, the Nutanix Distributed Storage Fabric writes and maintains three copies of each piece of data. The below shows three copies of data “A” and “B”.

[Figure: RF3 data placement]

The below is a simplified example of what the Nutanix Distributed Storage Fabric looks like once EC-X is applied to RF3 data.

[Figure: RF3 plus EC-X data placement]

As you can see, we now support twice the amount of data as RF3 while still having dual parity. As a result, using RF3 + EC-X gives customers using large clusters MORE usable capacity than RF2 (~50% of RAW) while providing dual parity (which enables the loss of two nodes without data loss/unavailability).

Not bad for a software only upgrade!

So what do I recommend for customers who are running 32 node or larger clusters?

1. For customers running RF3 already, consider enabling EC-X.
2. For customers running RF2, consider enabling RF3 and EC-X.