vBrownbag – vForum Sydney 2013 – View Composer for Array Integration (VCAI)

Recently I presented a short vBrownbag on View Composer for Array Integration (VCAI), focusing on how VCAI benefits Horizon View environments, storage protocol choice for Horizon View deployments, and how Nutanix leverages VCAI to optimize Horizon View deployments.

The video is available here – View Composer for Array Integration (VCAI) & Nutanix

 

 

Scaling problems with traditional shared storage

At VMware vForum Sydney this week I presented “Taking vSphere to the next level with converged infrastructure”.

Firstly, I wanted to thank everyone who attended the session. It was a great turnout, and during the Q&A there were a ton of great questions.

One part of the presentation I got a lot of feedback on was when I spoke about Performance and Scaling and how this is a major issue with traditional shared storage.

So for those who couldn’t attend the session, I decided to create this post.

So let’s start with a traditional environment with two VMware ESXi hosts connected via FC or IP to a storage array. In this example the storage controllers have a combined capability of 100K IOPS.

[Diagram: two ESXi hosts, 50K IOPS per host]

As we have two (2) ESXi hosts, dividing the performance capability of the storage controllers evenly between them gives 50K IOPS per host.

This is an example of what I have typically seen at customer sites on day 1, and performance normally meets the customer’s requirements.

As environments tend to grow over time, the most common thing to expand is the compute layer. The diagram below shows what happens when a third ESXi host is added to the cluster and connected to the SAN.

[Diagram: three ESXi hosts, 33K IOPS per host]

The 100K IOPS is now divided by 3, and each ESXi host now has roughly 33K IOPS.

This isn’t really what customers expect when they add servers to an environment, but in reality the storage performance is further divided between ESXi hosts, resulting in fewer IOPS per host in the best case. In the worst case, the additional workloads on the third host create contention, and each host may have even fewer IOPS available to it.

But wait, there’s more!

What happens when we add a fourth host? We further reduce the storage performance per ESXi host to 25K IOPS as shown below, which is HALF the original performance.

[Diagram: four ESXi hosts, 25K IOPS per host]

At this stage, the customer’s performance is generally significantly impacted, and there is no easy or cost-effective resolution to the problem.

….. and when we add a fifth host? We continue to reduce the storage performance per ESXi host to 20K IOPS which is less than half its original performance.

[Diagram: five ESXi hosts, 20K IOPS per host]
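The arithmetic in the walkthrough above can be reproduced with a minimal Python sketch. It assumes the best case described in this post, where the controllers’ combined 100K IOPS is split evenly between hosts with no contention; the function name `iops_per_host` is purely illustrative.

```python
def iops_per_host(controller_iops: int, hosts: int) -> float:
    """Best-case even split of the controllers' IOPS across all hosts."""
    if hosts < 1:
        raise ValueError("hosts must be at least 1")
    return controller_iops / hosts


# Combined controller capability used in the example in this post.
CONTROLLER_IOPS = 100_000

# Walk from two hosts up to five, as in the diagrams.
for hosts in range(2, 6):
    share = iops_per_host(CONTROLLER_IOPS, hosts)
    print(f"{hosts} hosts: {share / 1000:.0f}K IOPS per host")
```

Real environments rarely divide performance this evenly, so treat the numbers as an upper bound per host rather than a prediction.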

So at this stage, some of you may be thinking, “yeah yeah, but I would also scale my storage by adding disk shelves.”

So let’s add a disk shelf and see what happens.

[Diagram: five ESXi hosts with an additional disk shelf, still 20K IOPS per host]

We still only have storage controllers capable of 100K IOPS, so we don’t get any additional IOPS to our ESXi hosts. The result of adding the additional disk shelf is REDUCED performance per GB!
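To put a number on “reduced performance per GB”, here is a small Python sketch. The 100K IOPS figure comes from the example above; the shelf capacity of 10,000 GB is a made-up value chosen only to make the arithmetic easy to follow.

```python
def iops_per_gb(controller_iops: int, capacity_gb: int) -> float:
    """IOPS available per GB when the controllers are the bottleneck."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return controller_iops / capacity_gb


CONTROLLER_IOPS = 100_000    # from the example in this post
SHELF_CAPACITY_GB = 10_000   # hypothetical capacity of one disk shelf

before = iops_per_gb(CONTROLLER_IOPS, SHELF_CAPACITY_GB)
after = iops_per_gb(CONTROLLER_IOPS, 2 * SHELF_CAPACITY_GB)

print(f"One shelf:  {before:.1f} IOPS per GB")
print(f"Two shelves: {after:.1f} IOPS per GB")
```

Under these assumptions, doubling capacity behind the same controllers halves the IOPS available per GB.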

Make sure when you’re looking at implementing, upgrading or replacing your storage solution that it can actually scale both performance (IOPS/throughput) AND capacity in a linear fashion, otherwise your environment will to some extent be impacted by what I have explained above. The only way to avoid this is to oversize your storage on day 1, but even if you do, over time your environment will appear to become slower (and your CAPEX will be very high).

Also, consider the scaling increments: a solution’s ability to scale should not require you to replace controllers or disks, or be limited by a maximum number of controllers in the cluster. It should also scale in small, medium or large increments depending on the requirements of the customer.

This is why I believe scale-out, shared-nothing architecture will be the architecture of the future. It has already been proven by the likes of Google, Facebook and Twitter, and has now been brought to market by Nutanix.

Traditional storage, no matter how intelligent, does not scale linearly or granularly enough. This results in complex storage architectures for environments which grow over time, and leads to customers spending more money up front when the investment may not be realised for 2-5 years.

I’d prefer to be able to start small with as few as 3 nodes, and scale one node at a time (regardless of node model, i.e. NX1000, NX3000 or NX6000) to meet my customers’ requirements, and never have to replace hardware just to get more performance or capacity.
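For contrast with the fixed-controller arithmetic earlier in this post, the sketch below models an idealised scale-out cluster in Python. It assumes every node brings its own storage controller and contributes the same IOPS; the 20K per-node figure is hypothetical and not a vendor specification, and the model ignores replication and network overhead.

```python
def scale_out_total_iops(nodes: int, iops_per_node: int) -> int:
    """Idealised linear scaling: each node adds its own controller IOPS."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return nodes * iops_per_node


IOPS_PER_NODE = 20_000  # hypothetical per-node figure, for illustration only

# Start at three nodes and add one node at a time.
for nodes in range(3, 7):
    total = scale_out_total_iops(nodes, IOPS_PER_NODE)
    per_node = total / nodes
    print(f"{nodes} nodes: {total / 1000:.0f}K total, "
          f"{per_node / 1000:.0f}K per node")
```

In this simplified model the per-node figure stays flat as the cluster grows, which is the behaviour I mean by scaling linearly.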

Here is a summary of the Nutanix scaling capabilities, where you can scale Compute heavy, storage heavy or a mix of both as required.

[Diagram: summary of Nutanix scaling options]

Competition Example Architectural Decision Entry 2 – Use of RDMs in Standard IaaS Clusters

Name: Chris Jones
Title: Virtualization Architect
Twitter: @cpjones44
Profile: VCP5 / VCAP5-DCD

Problem Statement

VMs require more than 1.9TB in a single disk. The existing virtual environment has LUNs provisioned that are 2TB in size. As these VMs have virtual data disks (VMDKs) that are > 1.9TB in size, alarms are being triggered by the infrastructure monitoring solution and raising Incident tickets to the Virtual Infrastructure support queue.

Assumptions

1. Data within the OSI must reside within the VM and not on some kind of IP based store (like a NAS share).

2. vSphere datastores are presented through FC and not IP based stores (ie. NFS).

3. vSphere Hypervisor is ESXi 4.1.

4. There is no requirement for the VMs to be performing SAN specific functionality or running SCSI target-based software.

Constraints

1. The implemented monitoring solution cannot be customised with triggers and monitoring policies for individual objects within the environment (ie. having one monitoring policy per individual or sub-group of datastores).

2. Maximum vSphere datastore size in version 4.1 is 2TB minus 512 bytes.

3. Unable to upgrade beyond ESXi 4.1 Update 3.
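To illustrate why these disks trip the monitoring alarms, the following Python sketch works out how much of a maximum-size vSphere 4.1 datastore a 1.9TB VMDK consumes. It assumes binary terabytes and a 90% alert threshold (the figure mentioned under Alternatives below); both are assumptions for illustration, not settings taken from the monitoring solution.

```python
TB = 1024 ** 4  # assumption: binary terabytes

# Constraint 2: maximum datastore size in vSphere 4.1 is 2TB minus 512 bytes.
MAX_DATASTORE_BYTES = 2 * TB - 512

# Assumption: alerts fire above 90% datastore utilisation.
ALERT_THRESHOLD = 0.90


def utilisation(vmdk_bytes: int, datastore_bytes: int = MAX_DATASTORE_BYTES) -> float:
    """Fraction of the datastore consumed by a single thick-provisioned VMDK."""
    return vmdk_bytes / datastore_bytes


vmdk_bytes = int(1.9 * TB)
used = utilisation(vmdk_bytes)

print(f"Datastore utilisation: {used:.1%}")
print(f"Alert raised: {used > ALERT_THRESHOLD}")
```

Under these assumptions a single 1.9TB disk already sits at about 95% of the largest datastore that can be created, so the alert cannot be cleared by resizing the datastore.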

Motivation

1. Reduce the number of incident tickets being raised, thus improving SLA posture.

2. Reduce the requirement to span single Windows logical volumes across multiple VMDKs.

Architectural Decision

Turn the disk into an RDM (Virtual Compatibility Mode) to remove the level of monitoring from the vSphere layer.

Alternatives

1. Create smaller VMDKs (ie. 1-1.5TB disks) and create a RAID0 volume within the guest OS.

2. Change the level of alerting so that tickets are not raised for alerts that trigger beyond 90%.

3. Turn the disk into an RDM to remove the level of monitoring from the vSphere layer.

4. Thin Provision the virtual disks

5. Store the data within the guest on some kind of IP based storage (NAS/iSCSI target).

Justification

1. Option 5 goes against the assumption that data must be local to the VM, so was ruled out.

2. Whilst thin provisioning (Option 4) is an attractive solution, this option is ruled out based on a wider infrastructure decision to thick provision all disks in the environment to reduce the risk of datastores filling up and critical business VMs stopping.

3. Option 1 via smaller VMDKs spread across multiple vSphere datastores will result in these alerts disappearing, however it will create issues when trying to execute a DR recovery for either the individual disks (Active/Passive) or the whole VM (Active/Cold). All that’s needed is for one VMDK not to be replicated and the whole Windows volume will be corrupted, or for the VMDKs to be mounted in the wrong order. Multiple VMDKs to one Windows volume also complicates the recovery of snapshot array-based backups (eg. via SMVI or NetBackup).

4. Option 2 goes against the constraint of the infrastructure monitoring solution not being able to create individual alerting policies for either a single datastore or a sub-group of datastores in the inventory. Should individualised policies be created, we would need to ensure that the affected VMDKs that consume 90-95% of a datastore remain on that datastore, as moving from one to another (ie. from Tier 2 to Tier 1) will require a change to the monitoring that has been configured. At this stage, the monitoring solution has no way to track these customised policies, which is most of the reason why global environment-wide policies exist.

5. Option 3 and the use of RDMs in Virtual Compatibility Mode will allow the VM to benefit from the features of VMFS, such as advanced file locking for data protection and vSphere snapshotting. The use of RDMs will also allow for VMs to be managed by DRS (ie. can be vMotion’ed) and protected by vSphere HA.

Implications

1. The RDM mapping will need to be recorded clearly to avoid the lengthy process of discovering from scratch which physical LUN is presented to the virtual machine.

One way to record these mappings is to:

A)    Record the name of the VM that has the RDM.

B)    Record the NAA number of the physical LUN(s) that are presented to the VM.

C)    Record the virtual device node on the virtual disk controller as to where the RDM is mounted.

D)    Record the Windows drive letter that this RDM is mounted to.

2. Additional paths will be consumed, reducing the total number of vSphere datastores that can be presented to the cluster.
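The mapping record described in implication 1 (items A to D) could be kept in a simple structured form. The Python sketch below is one possible layout, not part of the original submission; the class name `RdmMapping`, its field names and all example values (VM name, NAA identifier, device node, drive letter) are hypothetical placeholders.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class RdmMapping:
    """One RDM mapping record, mirroring items A to D above."""
    vm_name: str        # A) name of the VM that has the RDM
    naa_id: str         # B) NAA number of the physical LUN
    device_node: str    # C) virtual device node on the disk controller
    drive_letter: str   # D) Windows drive letter the RDM is mounted to


def to_csv(mappings: list[RdmMapping]) -> str:
    """Render the records as CSV so they can be kept with the VM documentation."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(RdmMapping)],
        lineterminator="\n",
    )
    writer.writeheader()
    for mapping in mappings:
        writer.writerow(asdict(mapping))
    return buffer.getvalue()


# Placeholder values for illustration only.
example = RdmMapping(
    vm_name="fileserver01",
    naa_id="naa.EXAMPLE000000000000000000000001",
    device_node="SCSI (1:0)",
    drive_letter="E:",
)

print(to_csv([example]))
```

Keeping one row per RDM makes it straightforward to look up which physical LUN backs a given Windows volume during a recovery.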
