Netapp Edge VSA – Rapid Cloning Utility (RCU)

At VMworld this year Vaughn Stewart from Netapp presented SS1011 & SPO3339, where he spoke about the release of the Netapp Edge Virtual Storage Appliance (VSA).

This got me quite excited, for many reasons, but the one I will discuss today is the use of the VSA for my test lab. Using the VSA will allow me to test enterprise features, without requiring expensive and noisy equipment in my lab. Sounds good hey!

In this article I will demonstrate Netapp’s Rapid Cloning Utility (RCU) and how it can deploy large numbers of VMs very rapidly, without a storage (I/O and capacity) or compute (GHz & RAM) overhead.

So before we begin, if you’re interested in what I have in my lab, please see the “My Lab” blog post.

The only difference is I now have the Netapp Edge VSA serving NFS storage to my physical ESXi host, named “pesxi01”, as well as a number of nested ESXi hosts which are not relevant to this discussion.

So, to get started, I created a thin provisioned Windows 7 VM with all up-to-date patches etc., which came out to, near as makes no difference, 10GB in size. (See below)

This VM was hosted on a NFS datastore called “NetappEdge_Vol1” which is a thin provisioned volume from the Netapp Edge VSA.

Here is a screen shot of the datastores, the highlighted one being “NetappEdge_Vol1” which I will use for this test.

Note: The total capacity of the datastore is 47.50GB and, with my “W7 Test VM” at around 10GB, free space is 37.95GB.

Now I am going to demonstrate the Netapp RCU (Rapid Cloning Utility), which is provided within the Netapp Virtual Storage Console (VSC). For this test I am using version 4.0, although this functionality has been available in earlier versions.

Here is a screen shot of the vSphere client home screen showing the Netapp VSC Plugin under “Solutions and Applications”.

So let’s get started. I will now right click the “W7 Test VM” in vSphere Client, select the “Netapp” option (provided by VSC), go to “Provisioning and Cloning”, then select “Create Rapid Clones” as shown in the screen shot below.

Now the “Create Rapid Clones” Wizard appears (see below). All we need to do here is select the “Target Storage Controller” which in my case, is the Netapp Edge VSA (although this could be a physical FAS or IBM N-Series array) then hit “next”.

We then choose which cluster (or host) the cloned VMs will be hosted on. In my case, I used my “MgmtCluster”, which has my physical IBM x3850 M2 server, called “pesxi01”. Then we hit “next”.

We now choose the disk format. In my lab all VMs are thin provisioned, so selecting “Same format as source” will make the clones thin provisioned as well.

Note: No matter which option you choose, it will not affect the speed or disk space usage of this rapid cloning process.

Now we have to select how many vCPUs, how much vRAM and the number of clones we want, along with the clone name, starting number etc.

In this case, I chose 1 vCPU, 1GB vRAM, 100 clones and a clone name of “newclone” with a starting clone number of 001.

The “sample clone names” section demonstrates what the VM names will look like once deployed.

The RCU also allows you to import the VMs into a connection broker (like VMware View, as well as those from other vendors), but for the sake of this post I won’t do this, as I don’t have the compute power to actually power on all 100 VMs in my lab.

(I will be doing a number of articles showing various Netapp integration with VMware View in future including RCU and VAAI NFS offloading for Linked clones)

Now we hit “next”.

We now select what datastore we want to store the clones in.

Note: I have selected “NetappEdge_Vol1”, which has a total capacity of 47.50GB with only 37.95GB free space.

At this point, you may be thinking there isn’t enough capacity in the datastore for the 100 clones. Never fear, I have this covered.
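To put some numbers on that concern, here is a rough back-of-the-envelope check in Python. It assumes each clone would need the full ~10GB if it were a conventional full copy; the variable names are my own and purely illustrative.

```python
# Rough capacity check: would 100 conventional (full copy) clones fit?
# Figures are taken from the datastore screen shot; ~10GB per VM is approximate.
clone_count = 100
vm_size_gb = 10.0        # approximate size of the "W7 Test VM"
free_space_gb = 37.95    # free space on NetappEdge_Vol1 before cloning

required_gb = clone_count * vm_size_gb
shortfall_gb = required_gb - free_space_gb
full_copies_that_fit = int(free_space_gb // vm_size_gb)

print(f"Full copies would need about {required_gb:.0f}GB")
print(f"Shortfall against free space: {shortfall_gb:.2f}GB")
print(f"Full copies that would fit: {full_copies_that_fit}")
```

So with traditional cloning only a handful of copies would fit in this datastore, which is exactly why the result of this test is worth a look.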

We are now at the end of the RCU wizard, so let’s hit “Apply” and see what happens.

So the “Netapp rapid clone virtual machine” task has started. Note the start time of 12:11:08.

In the below screen shot, we can see a number of tasks have started. The first we have discussed above, so let’s talk about the “Netapp initial copy of image” task. This task creates the first clone of the VM.

This task took just one (1) second!  Cool right!?!

Next, the “Netapp FlexClone virtual disk” task clones the VMDK of the master VM 100 times into a folder within the datastore which has “RCU” in the label. This task started at 12:11:20 and completed at 12:19:13, a total of 7 mins and 53 seconds.

So that’s just 4.73 seconds per clone.

Next the “Create Virtual Machine” and “Clone Virtual Machine” tasks begin, which I will show more details of shortly.

The below is a screen shot showing what happens following the “Netapp FlexClone virtual disk”.

Note: We observed earlier this takes just 4.73 seconds per clone, even in a test lab!

What we now observe is that each clone is created in two (2) steps. The first is the “Clone Virtual Machine” task, which as shown below takes only a few seconds (<5 secs), followed by a “Reconfigure virtual machine” task, which renames the VM and also takes only a few seconds.

So the above steps continue until it completes for each of the 100 clones.

So how long did it take from start to finish?

I started the task at 12:11:08 and it completed (with all 100 clones registered in vCenter ready to power on) by 12:36:35, a total of only 25 mins & 27 seconds!

That’s only 15.27 seconds per 10GB VM (average).
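If you want to check my maths, here is a small Python sketch that works out the per-clone figures from the task timestamps quoted in this post. The helper name `elapsed_seconds` is just something I made up for this example.

```python
from datetime import datetime

def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two same-day HH:MM:SS timestamps."""
    fmt = "%H:%M:%S"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()

clone_count = 100

# "Netapp FlexClone virtual disk" task only
flexclone_secs = elapsed_seconds("12:11:20", "12:19:13")
# Whole job, from hitting "Apply" to all clones registered in vCenter
total_secs = elapsed_seconds("12:11:08", "12:36:35")

print(f"FlexClone phase: {flexclone_secs:.0f}s, {flexclone_secs / clone_count:.2f}s per clone")
print(f"End to end: {total_secs:.0f}s, {total_secs / clone_count:.2f}s per clone")
```

The gap between the two per-clone figures is the time vCenter spends on the “Clone Virtual Machine” and “Reconfigure virtual machine” tasks for each VM.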

Now, at this point, most people don’t believe this is possible, so let me show you what the datastore looks like.

And the below shows the size of one of the clones, chosen at random.

Now for the really really cool part.

Here is the “NetappEdge_Vol1” datastore after the cloning process.

Note: The free space is 36.72GB. So if we compare that with the 37.95GB (shown earlier in this article) prior to the cloning process, our 100 cloned VMs of ~10GB each only consumed an additional 1.23GB of disk space.
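Here is the same comparison as a quick Python calculation. It treats each VM as exactly 10GB (it is only approximately that) and uses 1GB = 1024MB, so take the per-clone and percentage figures as ballpark numbers only.

```python
# Space consumed by the 100 rapid clones, using the datastore figures above.
clone_count = 100
vm_size_gb = 10.0        # approximate logical size of each clone
free_before_gb = 37.95
free_after_gb = 36.72

consumed_gb = free_before_gb - free_after_gb
per_clone_mb = consumed_gb * 1024 / clone_count
logical_gb = clone_count * vm_size_gb          # what full copies would have used
saving_pct = (1 - consumed_gb / logical_gb) * 100

print(f"Consumed by all clones: {consumed_gb:.2f}GB")
print(f"Average per clone: {per_clone_mb:.1f}MB")
print(f"Saving versus full copies: {saving_pct:.2f}%")
```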

It is important to note there is no need to run deduplication over the underlying volume which hosts this NFS datastore, as the VMs have been cloned in an intelligent manner which is already deduplicated. This is important as deduplication processing requires a lot of CPU resources from the array (regardless of storage vendor).

By intelligently cloning using RCU, the array does not need to waste CPU resources doing deduplication on this volume, and therefore has more resources to serve your production workloads, e.g. your vSphere clusters.
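To illustrate the general idea of a clone that is “already deduplicated”, here is a toy Python model of block sharing. To be clear, this is my own conceptual sketch and not how Data ONTAP or FlexClone is actually implemented: the `ToyVolume` class, its methods and the block counts are all made up for illustration. The point is simply that a clone which copies block pointers rather than data uses no extra data blocks until something is overwritten.

```python
from collections import Counter

class ToyVolume:
    """Toy block store: a file is a list of block IDs, blocks are reference counted."""

    def __init__(self):
        self.files = {}
        self.refcount = Counter()
        self._next_block = 0

    def _new_block(self):
        block_id = self._next_block
        self._next_block += 1
        return block_id

    def write_new_file(self, name, n_blocks):
        blocks = [self._new_block() for _ in range(n_blocks)]
        self.files[name] = blocks
        self.refcount.update(blocks)

    def clone(self, source, target):
        # Copy only the pointers; no data blocks are duplicated.
        blocks = list(self.files[source])
        self.files[target] = blocks
        self.refcount.update(blocks)

    def overwrite_block(self, name, index):
        # A write to a shared block goes to a fresh block for this file only.
        old = self.files[name][index]
        self.refcount[old] -= 1
        if self.refcount[old] == 0:
            del self.refcount[old]
        new = self._new_block()
        self.files[name][index] = new
        self.refcount[new] += 1

    def physical_blocks(self):
        return len(self.refcount)

    def logical_blocks(self):
        return sum(len(blocks) for blocks in self.files.values())

vol = ToyVolume()
vol.write_new_file("master.vmdk", 1000)
for i in range(1, 101):
    vol.clone("master.vmdk", f"newclone{i:03d}.vmdk")

print(vol.logical_blocks(), vol.physical_blocks())   # logical vs physical after cloning
vol.overwrite_block("newclone001.vmdk", 0)
print(vol.physical_blocks())                         # one extra block after a write
```

In this toy model the 100 clones add nothing to the physical block count, and only a write inside a clone allocates a new block, which is the same behaviour we see in the datastore free space above.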

The below are some stats I collected to demonstrate this process is not in any way disk intensive, and therefore does not impact your ESXi host (or cluster), the storage area network (be it FC / iSCSI / NFS etc) or the underlying disk array.

So I observed a maximum latency of 1ms during the cloning process.

Next we see the disk performance for the duration of the process, where we have a peak of 35,000KBps.

It is important to note, both the above and below graphs show performance for the ESXi host (pesxi01), so this includes all VMs running on the host, including the Netapp Edge VSA, vCenter, a domain controller and numerous other VMs.

So why am I blogging about a storage feature like Netapp’s RCU? Simple, as a Virtualisation architect I need to consider all elements of a solution (like VMware / Storage / Network) to ensure I deliver a solution which meets/exceeds the customer’s expectations.

One major problem in a lot of existing environments, and designs I review, is that a lot of architects fail to consider the impact of features/functionality (like cloning VMs) on the production environment. This can result in cloning tasks having to be scheduled out of business hours, or on weekends, as they are too intensive to run during the day.

For example, if you have a VMware View environment and you need to clone whatever number of VMs, then if you don’t use an intelligent feature like RapidClone then you will be totally reliant on the speed of your array, ESXi hosts and storage network (IP or FC) to get the cloning task/s done.

Even if you have the world’s fastest array (insert your favorite vendor here), the fastest storage connectivity and the biggest and most powerful ESXi hosts, the process of cloning a large number of virtual machines will still:

1. Take more time to complete than an intelligent cloning process like RCU

2. Impact the performance of your ESXi hosts and more than likely production VMs

3. Impact the performance of your storage network & array (and anything that uses it, physical or virtual).

I would also like to note, the Netapp Edge VSA runs at pretty much 100% CPU usage all the time in my home lab, and did so during this test, as I have other things using the VSA for shared storage.

I suspect if I was running it on newer processors (unlike my very old x3850 M2 CPUs) and I didn’t have other VMs using the VSA for storage (including deduplication being enabled for all volumes) this test would have completed even faster. But I am very happy with the performance as is.

In conclusion, this test showed the new Netapp Edge VSA is a fully featured Netapp FAS array in a VM, which in my opinion is awesome, and that the RCU can easily and rapidly deploy large numbers of VMs in mins, even in a home test lab!

Fun Fact: The cloning performance I observed during this test easily outperformed large production environments that I am aware of!

In closing, I am not aware of a faster and more disk space efficient method to clone/deploy virtual machines. If you know of one, please let me know.

For more information about the Netapp Edge VSA, click here to go to Vaughn’s blog.