Competition Example Architectural Decision Entry 3 – Scalable network architecture for VXLAN

Name: Prasenjit Sarkar
Title: Senior Member of Technical Staff
Company: VMware
Twitter: @stretchcloud
Profile: VCAP-DCD4/5, VCAP-DCA4/5, VCAP-CIA, vExpert 2012/2013

Problem Statement

You are moving towards a scalable network architecture for your large-scale virtualized datacenter and want to configure VXLAN in your environment. You want to make sure that the teaming policy for the VXLAN transport is configured optimally for better performance and reduced operational complexity.

Assumptions

1. vSphere 5.1 or greater
2. vCloud Networking & Security 5.1 or greater
3. Core & Edge Network topology is in place

Constraints

1. Requires switches that support Static EtherChannel or LACP (dynamic EtherChannel)
2. Must use the IP Hash load-balancing method if using vSphere 5.1
3. Cannot use Beacon Probing as the failure-detection mechanism

Motivation

1. Optimize performance for VXLAN

2. Reduce complexity where possible

3. Choose the best teaming policy for VXLAN traffic for future scalability

Architectural Decision

LACP – Passive Mode will be chosen as the teaming policy for the VXLAN Transport.

Two or more physical links will be aggregated using LACP on the upstream Edge switches.

The two Edge switches will be connected to each other.

Each ESXi host will be cross-connected to these two physical upstream switches to form a LACP group.

LACP will be configured in Passive mode on the Edge switches, so that the participating ports respond to the LACP packets they receive but do not initiate LACP negotiation.
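As an optional verification step, the LACP mode that ends up on the vDS uplink port group can be read back through the vSphere API. The sketch below is a minimal illustration using pyVmomi: the connection details are hypothetical placeholders, and it assumes the LACP settings are exposed via a lacpPolicy attribute on the port group's default port configuration, so treat it as an illustrative check rather than a definitive procedure.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Hypothetical connection details for illustration only
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

# Walk every distributed port group and report any LACP policy found on it
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.dvs.DistributedVirtualPortgroup], True)
for pg in view.view:
    port_config = pg.config.defaultPortConfig
    lacp = getattr(port_config, "lacpPolicy", None)  # assumed attribute name
    if lacp is not None and lacp.enable is not None and lacp.enable.value:
        print(f"{pg.name}: LACP enabled, mode = {lacp.mode.value}")

view.Destroy()
Disconnect(si)
```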

Alternatives

1. Use LACP – Active Mode and make sure you are using the IP Hash algorithm for load balancing in your vDS if using vSphere 5.1.

2. Use LACP – Active Mode and use any of the 22 available load-balancing algorithms in your vDS if using vSphere 5.5.

3. Use LACP – Active Mode with the Cisco Nexus 1000V virtual switch and use any of its 19 available load-balancing algorithms.

4. Use Static EtherChannel and make sure you are using *only* the IP Hash algorithm in your vDS.

5. If using Failover, then have at least one 10GbE NIC to handle the VXLAN traffic.

Justification

1. The Failover teaming policy for the VXLAN vmkernel NIC uses only one uplink for all VXLAN traffic. Although redundancy is available via the standby link, the available bandwidth is not fully used.

2. Static EtherChannel requires IP Hash load balancing to be configured on the switching infrastructure, which uses a hashing algorithm based on source and destination IP addresses to determine which host uplink egress traffic should be routed through.

3. Static EtherChannel with IP Hash load balancing is technically complex to implement and has a number of prerequisites and limitations; for example, you cannot use Beacon Probing and you cannot configure standby or unused links.

4. Static EtherChannel does not pre-check both terminating ends before forming the channel group, so if there is a mismatch between the two ends, traffic will never pass and vSphere will not see any acknowledgement back on its Distributed Switch.

5. Active LACP mode places a port into an active negotiating state, in which the port initiates negotiations with other ports by sending LACP packets. On vSphere releases prior to 5.5, where only the IP Hash algorithm is supported, LACP will not pass any traffic if the vDS uses any algorithm other than IP Hash (such as Route Based on Originating Virtual Port ID).

6. The operational complexity is reduced.

7. If using vSphere 5.5, 22 different load-balancing algorithms are available and Beacon Probing can be used for failure detection.

Implications

1. Initial setup has a small amount of additional complexity; however, this is a one-time task (set and forget).

2. Only the IP Hash algorithm is supported if using vSphere 5.1.

3. Only one LAG is supported across the entire vSphere Distributed Switch if using vSphere 5.1.

4. Unless the IP Hash calculation is worked out manually for the VMs' source and destination IP pairs against the available physical NICs, there is no guarantee that traffic will be balanced evenly across the physical links (see the sketch below).
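To illustrate that last point, here is a small Python sketch of how an IP-hash style uplink choice behaves. The exact hash ESXi uses is an implementation detail; this approximation (XOR of the two 32-bit addresses, modulo the number of uplinks) is only meant to show why certain source/destination pairs can end up sharing one uplink, and the IP addresses are made up for the example.

```python
import ipaddress

def ip_hash_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Approximate IP-hash uplink selection: XOR the two 32-bit addresses
    and take the result modulo the number of uplinks in the team."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % uplink_count

# Made-up flows between several VMs and a single destination, over a 2-uplink team
flows = [
    ("192.168.10.11", "192.168.20.5"),
    ("192.168.10.12", "192.168.20.5"),
    ("192.168.10.13", "192.168.20.5"),
    ("192.168.10.14", "192.168.20.5"),
]

for src, dst in flows:
    print(f"{src} -> {dst}: uplink {ip_hash_uplink(src, dst, 2)}")
```

Working through this kind of calculation for your actual source and destination pairs is the only way to be confident the hash will spread load evenly across the LAG members.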


Native NFS Snapshots (VAAI) w/ VMware View Composer (View 5.1)

Following my post on the Netapp Edge VSA and the Rapid Clone Utility, it seemed an obvious follow-up to write a piece on the new VAAI functionality in VMware View 5.1, which allows the use of Netapp native NFS snapshots (VAAI) for VMware View Composer linked-clone deployments.

This feature is really the missing piece of the puzzle, as the Rapid Clone Utility (RCU) could deploy large numbers of desktops very quickly; however, it could only create manual pools, which may have been a pain point for some customers.

So let's jump right in.

To take advantage of native NFS snapshot functionality within VAAI you need to install the NFS VAAI Plugin.

The official documentation on the plugin can be found here.

The easiest way, however, is to download the offline bundle from now.netapp.com and use the VSC plugin to complete the installation; see below for instructions.

The below screenshots are designed to be visual aids to support the above written instructions.

The below is the VSC plugin main screen.

Click the “Tools” option on the left hand side

Click the “Install on host” button, then select the hosts you want to install the plugin on and press “Install”

Select “Yes” to confirm the installation

The installation will begin as shown below. The installation was not super fast for me, so be patient.

After around 3 minutes (in my lab anyway) it should complete, after which you should reboot your host(s).

The easiest way to confirm if the installation was successful is to check the “Hardware Accelerated” column (on the far right) for your datastores. Ensure it is now showing “Supported” as per the below example.

If for some reason it still shows “Not Supported”, reboot your host, and if that doesn’t work, reinstall the plugin.
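If you would rather confirm the result programmatically than scroll through the client UI, the vSphere API exposes a per-host hardware-acceleration status for each datastore mount. The sketch below uses pyVmomi with placeholder connection details, and it assumes the vStorageSupport field on the host mount info reflects the same “Hardware Accelerated” state shown in the client, so take it as an illustrative check only.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholder connection details for illustration only
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

# List every datastore and, per host mounting it, the hardware-acceleration status
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.Datastore], True)
for ds in view.view:
    for mount in ds.host:
        # vStorageSupport is assumed to mirror the "Hardware Accelerated" column
        status = getattr(mount.mountInfo, "vStorageSupport", "unknown")
        print(f"{ds.name} on {mount.key.name}: {status}")

view.Destroy()
Disconnect(si)
```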

Now that we have the plugin installed, it's time to get into VMware View Administrator.

Launch the web interface to your connection broker, and log in.

You should see something similar to the below after logging in.

Note: My system health shows some errors due to not having signed certificates; this will not impact the functionality.

Now, this article assumes your environment is already configured with a vCenter and View Composer server, like the below. If you do not have vCenter and View Composer configured, this article does not cover these steps.

The below shows the VMware View Administrator console. To create a pool (with or without Native NFS Snapshots), we use the “Add” button shown below.

In the Pool definitions section, we start at the “Type” menu.

For this example, to use View Composer, we select the “Automated Pool” option and press “Next”.

The “User Assignment” screen gives us two (2) options; both can leverage the Native NFS Snapshots, but in this case I have selected “Floating”.

In the “vCenter Server” menu, select “View Composer linked clones” and press “Next”.

We are now in the “Settings” section of the Add Pool wizard. Here we set the ID and display name; for this example, both are set to “W7TestPool”. After you set your ID and display name, press “Next”.

In Pool Settings, I have chosen to leave everything at the default for this example. In the real world, each of these settings should be carefully considered.

In the “Provisioning Settings” menu, the first of the two (2) main things to do is set the naming pattern, which should be a logical name for your environment followed by {n:fixed=3}; this results in three (3) digits after your chosen name, so you can support VMs 001 through 999 (see the sketch below).
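To make the naming pattern concrete, the following tiny sketch shows roughly how a pattern ending in {n:fixed=3} expands; the “W7TestPool-” prefix is just a made-up example.

```python
# Illustrative expansion of a View naming pattern such as "W7TestPool-{n:fixed=3}":
# {n:fixed=3} becomes a zero-padded three-digit sequence number.
prefix = "W7TestPool-"            # hypothetical pattern prefix
for n in range(1, 6):             # first five desktops of a possible 001-999
    print(f"{prefix}{n:03d}")     # W7TestPool-001 ... W7TestPool-005
```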

Then we select the maximum number of desktops and the number of spare desktops.

In this example I want to provision all desktops up-front to demonstrate the speed of deployment.

In a production environment this would not generally be the most efficient setting.

The “View Composer Disks” menu allows us to configure “disposable disks”; for this example these are not required, as no users will be using the desktops I am deploying in this test lab. However, in a production environment this is an option you need to consider carefully.

The “Storage Optimization” menu allows both persistent and replica disks to be separated from OS disks. Again, this is something to consider carefully in your production environments, but it is not relevant to this example. As such, neither option is used.

Now we select the parent VM and snapshot. In this case, I am using a Windows 7 VM which I have prepared. There is nothing special about this image; it is just a bare Windows 7 installation patched using Windows Update, nothing more.

For this example, I am using my “MgmtCluster”, which is just a cluster containing my physical ESXi 5.0 host.

The datastores option is important: to make full use of the Native NFS Snapshots, the parent VM should be in the same NFS datastore as the clones.

I have selected “NetappEdge_Vol1” as this is where my parent VM resides. You have the option to set the “Storage Overcommitment” as shown below; however, this is not relevant, as we are using the Native NFS Snapshots option later in the wizard.

The below shows all options are completed; now we hit “Next”.

Here we see we have the option to “Use native NFS snapshots (VAAI)”. If this is greyed out, you may have an issue with the plugin install, or the datastore you have selected is not on your Netapp Edge/FAS or IBM N-Series controller.

We can also use host caching (CBRC), which will generally provide good performance, so I have left it enabled.

In the guest customization section, we can set an AD container where we want the linked clones to reside. In a production environment you should use this feature, but for this demonstration it's not relevant.

You can also use QuickPrep or Sysprep – Each has Pros & Cons, but both work with the Native NFS snapshots.

Now we’re done, so all we need to do is hit “Finish”.

Now, I have included the below screenshot of the datastores prior to the linked clones being deployed, as a baseline to show there is 31.50GB free on “NetappEdge_Vol1”, which will be used for this demonstration.

Having completed the “Add Pool” wizard, after a short delay the initial clone of the master VM will start, and you will see a task similar to the below appear.

We can also see from the above that the first two clones took just 20 seconds.

Now see the next screenshot, where the tenth VM is powering on, confirming the storage (cloning) part of the process is complete. Note the completion time of 20:53:39 against the start time of 20:50:12; this means all 10 VMs were cloned and registered in vCenter in just 3 minutes and 27 seconds (or 20.7 seconds per 10GB VM).

At this stage the VMs are all booting up and customizing before registering with the connection broker.

This step in the process is largely dependent on the amount of compute in your cluster and on storage performance (mostly read performance). As I have only a single host, with my servers, storage, and desktops all running on the same host, the time it takes to complete this step will be longer than in a production environment.

In conclusion, the new Native NFS Snapshots (VAAI) functionality is a significant step forward in improving desktop provisioning times with View Composer. It also largely removes the compute and I/O impact on your vSphere cluster and storage array.

The performance appears to be similar to the performance of the Rapid Clone Utility (RCU) without the restriction of having to use “Manual Pools”.

As such I would encourage anyone looking at VDI solutions to consider this technology, as it has a number of important benefits over traditional “dumb disk”.

Netapp Edge VSA – Rapid Cloning Utility (RCU)

At VMworld this year, Vaughn Stewart from Netapp presented SS1011 & SPO3339, where he spoke about the release of the Netapp Edge virtual storage appliance.

This got me quite excited, for many reasons, but the one I will discuss today is the use of the VSA for my test lab. Using the VSA will allow me to test enterprise features, without requiring expensive and noisy equipment in my lab. Sounds good hey!

In this article I will demonstrate Netapp's Rapid Clone Utility and how it can deploy large numbers of VMs very rapidly, without a storage (I/O and capacity) or compute (GHz and RAM) overhead.

So before we begin, if you're interested in what I have in my lab, please see my “My Lab” blog post.

The only difference is that I now have the Netapp Edge VSA serving NFS storage to my physical ESXi host, named “pesxi01”, as well as to a number of nested ESXi hosts, which are not relevant to this discussion.

So, to get started, I created a thin-provisioned Windows 7 VM with all up-to-date patches, which came out to (near as makes no difference) 10GB in size. (See below.)

This VM was hosted on an NFS datastore called “NetappEdge_Vol1”, which is a thin-provisioned volume from the Netapp Edge VSA.

Here is a screen shot of the datastores, the highlighted one being “NetappEdge_Vol1” which I will use for this test.

Note: The total capacity of the datastore is 47.50GB, and with my “W7 Test VM” at around 10GB, free space is 37.95GB.

Now I am going to demonstrate the Netapp RCU (Rapid Clone Utility), which is provided within the Netapp Virtual Storage Console (VSC). For this test I am using version 4.0, although this functionality has been available in earlier versions.

Here is a screen shot of the vSphere client home screen showing the Netapp VSC Plugin under “Solutions and Applications”.

So let's get started. I will now right-click the “W7 Test VM” in the vSphere Client, select the “Netapp” option (provided by VSC), go to “Provisioning and Cloning”, then select “Create Rapid Clones”, as shown in the screenshot below.

Now the “Create Rapid Clones” wizard appears (see below). All we need to do here is select the “Target Storage Controller”, which in my case is the Netapp Edge VSA (although this could be a physical FAS or IBM N-Series array), then hit “Next”.

We then choose which cluster (or host) the cloned VMs will be hosted on; in my case, I used my “MgmtCluster”, which contains my physical IBM x3850 M2 server, called “pesxi01”. Then we hit “Next”.

We now choose the disk format. In my lab all VMs are thin provisioned, so selecting “Same format as source” will make the clones thin provisioned as well.

Note: No matter which option you choose, it will not affect the speed or disk space usage of this rapid cloning process.

Now we have to select how many vCPUs, how much vRAM, and the number of clones we want, along with the clone name, starting number, etc.

In this case, I chose 1 vCPU, 1GB vRAM, 100 clones, and a clone name of “newclone” with a starting clone number of 001.

The “sample clone names” section demonstrates what the VM names will look like once deployed.

The RCU also allows you to import the VMs into a connection broker (such as VMware View, as well as other vendors' brokers), but for the sake of this post I won't do this, as I don't have the compute power to actually power on all 100 VMs in my lab.

(I will be doing a number of articles in future showing various Netapp integrations with VMware View, including RCU and VAAI NFS offloading for linked clones.)

Now we hit “Next”.

We now select which datastore we want to store the clones in.

Note: I have selected “NetappEdge_Vol1”, which has a total capacity of 47.50GB with only 37.95GB of free space.

At this point you may be thinking there isn't enough capacity in the datastore for the 100 clones. Never fear, I have this covered.

We are now at the end of the RCU wizard, so let's hit “Apply” and see what happens.

So the “Netapp rapid clone virtual machine” task has started; note the start time of 12:11:08.

In the below screenshot we can see a number of tasks have started. The first we have discussed above, so let's talk about the “Netapp initial copy of image” task. This task creates the first clone of the VM.

This task took just one (1) second!  Cool right!?!

Next, the “Netapp FlexClone virtual disk” task clones the VMDK of the master VM 100 times into a folder within the datastore which has “RCU” in the label. This task started at 12:11:20 and completed at 12:19:13, a total of 7 minutes and 53 seconds.

So that’s just 4.73 seconds per clone.

Next the “Create Virtual Machine” and “Clone Virtual Machine” tasks begin, which I will show more details of shortly.

The below is a screen shot showing what happens following the “Netapp FlexClone virtual disk”.

Note: We observed earlier this takes just 4.73 seconds per clone, even in a test lab!

What we now observe is that each clone is created in two (2) steps: the first is the “Clone Virtual Machine” task, which as shown below takes only a few seconds (<5 secs), followed by a “Reconfigure virtual machine” task, which renames the VM and also takes only a few seconds.

The above steps repeat until all 100 clones are complete.

So how long did it take from start to finish?

I started the task at 12:11:08 and it completed (with all 100 clones registered in vCenter, ready to power on) by 12:36:35 – a total of only 25 minutes and 27 seconds!

That's only 15.27 seconds per 10GB VM (average).

Now, at this point, most people don’t believe this is possible, so let me show you what the datastore looks like.

And the below shows the size of one of the clones, chosen at random.

Now for the really really cool part.

Here is the “NetappEdge_Vol1” datastore after the cloning process.

Note: The free space is now 36.72GB. If we compare that with the 37.95GB (shown earlier in this article) prior to the cloning process, our 100 cloned VMs of ~10GB each consumed only an additional 1.23GB of disk space (see the quick calculation below).
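Pulling the timing and capacity figures from this walkthrough together, here is a quick back-of-the-envelope calculation. It only uses the numbers quoted above; nothing beyond them was measured.

```python
from datetime import datetime

fmt = "%H:%M:%S"
start = datetime.strptime("12:11:08", fmt)   # "Netapp rapid clone virtual machine" task start
end = datetime.strptime("12:36:35", fmt)     # last clone registered in vCenter
clones = 100

elapsed = (end - start).total_seconds()
print(f"Total elapsed: {elapsed:.0f}s, or {elapsed / clones:.2f}s per ~10GB clone")

# Datastore free space before vs. after cloning (GB), from the screenshots above
free_before, free_after = 37.95, 36.72
consumed = free_before - free_after
print(f"Space consumed by {clones} clones: {consumed:.2f}GB "
      f"({consumed / clones * 1024:.1f}MB per clone)")
```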

It is important to note that there is no need to run deduplication over the underlying volume which hosts this NFS datastore, as the VMs have been cloned in an intelligent manner that is already deduplicated. This matters because deduplication processing requires a lot of CPU resources on the array (regardless of storage vendor).

By intelligently cloning using RCU, the array does not need to waste CPU resources running deduplication on this volume, and therefore has more resources to serve your production workloads, e.g. your vSphere clusters.

Below are some stats I collected to demonstrate that this process is not in any way disk intensive, and therefore does not impact your ESXi host (or cluster), the storage network (FC / iSCSI / NFS, etc.), or the underlying disk array.

So I observed a maximum latency of 1ms during the cloning process.

Next we see the disk performance for the duration of the process, with a peak of 35,000KBps.

It is important to note, both the above and below graphs show performance for the ESXi host (pESXi01) so this includes all VMs running on the host, including Netapp Edge VSA, vCenter, a domain controller and numerous other VMs.

So why am I blogging about a storage feature like Netapp's RCU? Simple: as a virtualisation architect I need to consider all elements of a solution (VMware, storage, network) to ensure I deliver a solution which meets or exceeds the customer's expectations.

One major problem in a lot of existing environments, and in designs I review, is that many architects fail to consider the impact of features and functionality (like cloning VMs) on the production environment. This can result in cloning tasks having to be scheduled out of business hours or on weekends, as they are too intensive to run during the day.

For example, if you have a VMware View environment and need to clone any number of VMs, then unless you use an intelligent feature like Rapid Clone you will be totally reliant on the speed of your array, ESXi hosts, and storage network (IP or FC) to get the cloning tasks done.

Even if you have the world's fastest array (insert your favorite vendor here), storage connectivity, and the biggest and most powerful ESXi hosts, the process of cloning a large number of virtual machines will still:

1. Take more time to complete than an intelligent cloning process like RCU

2. Impact the performance of your ESXi hosts and, more than likely, production VMs

3. Impact the performance of your storage network & array (and anything that uses it , physical or virtual).

I would also like to note that the Netapp Edge VSA runs at pretty much 100% CPU usage all the time in my home lab, and did so during this test, as I have other things using the VSA for shared storage.

I suspect that if it were running on newer processors (unlike my very old x3850 M2 CPUs) and I didn't have other VMs using the VSA for storage (including deduplication being enabled for all volumes), this test would have completed even faster. But I am very happy with the performance as is.

In conclusion, this test showed that the new Netapp Edge VSA is a fully featured Netapp FAS array in a VM, which in my opinion is awesome, and that the RCU can easily and rapidly deploy large numbers of VMs in minutes, even in a home test lab!

Fun fact: the cloning performance I observed during this test easily outperformed large production environments that I am aware of!

In closing, I am not aware of a faster or more disk-space-efficient method to clone and deploy virtual machines; if you know of one, please let me know.

For more information about the Netapp Edge VSA, click here to go to Vaughn's blog (VirtualStorageguy.com).