ESXi Host Isolation Response and custom isolation address configuration.

I was reviewing a vSphere design recently and I came across an interesting design choice which I thought I would share.

The architect selected the isolation response of “Leave Powered On” and disabled  “das.usedefaultisolationaddress”  (which is by default enabled) and configured multiple custom isolation addresses using the “das.isolationadressX” advanced setting.

The architect explained that this was done to minimize the chance of a false positive isolation event. In many environments such as ones using IP storage or where the ESXi Management VMKernel default gateway is not highly available, this can be a very good idea.

In this environment, the storage was provided via FC and the default gateway was highly available.

So was there a benefit in changing the default setting of “das.usedefaultisolationaddress” and configuring custom isolation addresses?

The short answer is No.

This is because the isolation response is configured with “Leave Powered On” so regardless of the host being isolated or not, the Virtual Machines will remain powered on.

So keep it simple, if your isolation response is “Leave Powered On” there is no need to change either of these advanced settings.

The below articles show examples of isolation response and custom isolation addresses configurations for IP Storage, FC storage and Hyper-converged environments.

Related Articles

1. Host Isolation Response for IP Storage
2. Host isolation response for FC based Storage
3. Host Isolation Response for a Nutanix Environment

Example Architectural Decision – Default Virtual Machine Compatibility Configuration

Problem Statement

In a VMware vSphere 5.5 environment, what is the most suitable configuration for Virtual Machine Compatibility setting at the Datacenter and Cluster layers?


1. vSphere Flash Read Cache is not required.
2. VMDKs of greater than 2TB minus 512b are not required.


1. Reduce complexity where possible.
2. Maximize supportability.

Architectural Decision

Configure the vSphere Datacenter level “Default VM Compatibility” as “ESXi 5.1 or later” and leave the vSphere Cluster level ”Default VM Compatibility” as “Use datacenter setting and host version” (default).


1. Avoid limiting management of the environment to the vSphere Web Client.
2. The Default VM Compatibility only needs to be set once at the datacenter layer and then all clusters within the datacenter will inherit the desired setting.
3. Reduce the dependency of the Web Client in the event of a disaster recovery.
4. As vFRC and >2TB VMDKs and vGPU are not required, there is no significant advantage to HW Version 10.
5. Ensuring a standard virtual machine compatibility level is maintained throughout the environment and reducing the chance of mismatched VM version types in the environment.
6. Simplicity.


1. Virtual Machine Hardware Compatibility automatic update must be DISABLED to prevent the VM hardware being automatically upgraded following a shutdown.
2. vSphere Flash Read Cache (vFRC) cannot be used.
3. VMDKs will remain limited at 2TB minus 512b.


1. Virtual Machine HW Version 10 (vSphere 5.5 onwards).
2. Virtual Machine HW Version 8 (vSphere 5.0 onwards).
3. Virtual Machine HW Version 7 (vSphere 4.1 onwards).
4. Older Virtual machine HW versions.


Virtual Machine Swap File Location & Capacity Usage on Nutanix

The Location of the Virtual Machine swap file can be critical when deploying vSphere with traditional centralized storage solutions, or legacy solutions which acknowledge “zeros” or “White-space” as the Virtual Machine swap file can be as large as the VMs configured vRAM where Memory Reservations are not used.

The below shows the default configuration.

If a VM resides on Tier 1 storage for example, and the VM does not have a memory reservation set (or a reservation of less than 100%), the Swap-file will take up valuable Tier 1 storage capacity.

This can be avoided by specifying a Swap-file datastore however this introduces complexity and in the event the Swap-file datastore is on a low tier of storage, performance in the event of swapping will degrade significantly.

Some platforms recommend having different datastores for VM swap files to minimize the overheads on de duplication or replication for environments using SRM as discussed in Example Architectural Decision – Virtual Machine Swap-file location for SRM Protected VMs.

The Nutanix Distributed File System does not write “White space” to disk, as a result the impact of Virtual Machine swap files is negligible which makes the issue of swap file placement much less of an issue.

The only time when Virtual machine swap files will use storage capacity in the Nutanix Distributed File System is when host memory utilization is >100% and swapping needs to occur.

As such, the default vSphere configuration of “Virtual Machine Directory” is ideal for Nutanix environments and valuable storage capacity is not unnecessarily wasted resulting in increased usable space, reduced complexity by removing the requirement for dedicated swap-file datastores without compromising the benefits of de-duplication and compression.

VCDX Defence Essentials – Part 3 – Preparing for the Troubleshooting Scenario

Following on from Part 1 - Preparing for the Design Defence & Part 2 - Preparing for the Design Scenario, Part 3 covers my tips for the final stage of the VCDX defence, the Troubleshooting Scenario.

After completing the 75min Design defence and the 30min Design Scenario, if your still standing and haven’t retreated at full speed, your final challenge is the 15min Troubleshooting Scenario.

As mentioned in the previous Parts of this series, I am not a official panellist and I do not know how the scoring works. The below is my advice based on conducting mock panels, the success rate of candidates I have conducted mock panels with and my successfully achieving VCDX on the 1st attempt.

If you have read Part 2, then you should notice several similarities in both the common mistakes and tips below.

Common Mistakes

1. Trying to guess the solution to the issue

Taking pot shot guesses at what the problem/s might be does not prove your expertise. If you don’t methodically work through the issue and just keep making guesses, your not doing yourself or the people trying to assess your expertise any good.

2. Not documenting the troubleshooting steps you have completed

Assuming you have not made Mistake #1, and you are methodically working through the troubleshooting scenario, a common mistake I see is a candidate getting confused about what they have or have not investigated.

When candidates repeat the same troubleshooting steps because they have lost track, it does nothing but waste time and does not increase your chance of passing.

15 mins goes by in a flash, you cannot afford to waste time!

3. Going down a rabbit hole

Same as in the design scenario, I have observed many candidates who are clearly very knowledgeable, who have spent the majority of the time troubleshooting one specific area of the environment. eg: Just the vSphere layer

Doing this may demonstrate your expertise in one area really well, but this does not help getting as many potential issues eliminated in the scenario as possible within the time constraint.

4. Being Mute!

Again, same as in the design scenario, I have seen candidates who stand starring at the troubleshooting scenario and the whiteboard for mins at a time.


Tips for the Troubleshooting Scenario

1. Do not try to guess the solution to the issue

If you happen to guess the solution (assuming there is one.. hint hint) what expertise have you demonstrated to the panel for them to score you on? The answer is “bugger all” (This is Australian for “none”).

Talk the panel through your troubleshooting methodology, for example, you might choose to go through OSI models layers, or you may choose to start with, Networking, then move onto Storage, then application, then vSphere etc.

The goal of this section of the defence is to demonstrate your troubleshooting skills, so make sure you explain what your trying to eliminate. eg: If a VM has lost connectivity you may ask the panel to perform a vMotion of VM1 from host A to host B. You could explain to the panel that if the ping begins to work following the vMotion, you plan to investigate the networking of Host A. If the ping does not start working, you will continue to investigate for a larger networking issue, such as a VLAN specific problem.

2. Documenting your troubleshooting steps & findings

Ensure you methodically address each of the key areas of a vSphere solution by writing on the whiteboard headings like the following:

a) Storage/SAN/Protocol

b) Networking/Firewall

c) Compute HW

d) Application/Guest OS

e) vSphere

Ensure you eliminate several (i’d suggest >=3) potential issues in each section, so you are covering off the entire environment and record what you have done & the result of the troubleshooting step.

Keep in mind, you only have 15 mins, so 1 item per min is required if you are to cover all areas off thoroughly.

3. Don’t go down a rabbit hole!

Same as in the design scenario, I have observed many candidates who are clearly very knowledgeable, who have spent the majority of the time troubleshooting one specific area of a vSphere environment. eg: Storage

Doing this may demonstrate your expertise in one area really well, but this does not help getting as many potential issues eliminated in the scenario as possible within the time constraint.

Once you have looked into 3 potential issues in storage, move onto Networking, or vSphere etc.

Do not spend more than 60-90 seconds on any one troubleshooting step as this is preventing you demonstrating broad expertise which is the purpose of VCDX.

4. Think out Loud!

Again, same as in the design scenario, I have seen candidates who stand starring at the troubleshooting scenario and the whiteboard totally silent for mins at a time.

Talk the panel through your thought process and expected outcomes for troubleshooting actions.

I cannot give you advise, if I don’t know what your thinking! Same with the panellists, they can’t score you if you don’t verbalize your thought process.

No matter what, keep thinking out loud, if your working through options in your mind, that’s what the panel want’s to hear, so let them hear it!


I hope the above tips help you prepare for the VCDX design scenario and best of luck with your VCDX journey. For those who are interested, you can read about My VCDX Journey.

If you have any questions on the VCDX process or the advise given in this series please leave your comments and I will compile a list of questions and do a Q&A post.

VCDX Defence Essentials – Part 2 – Preparing for the Design Scenario

Following on from Part 1 - Preparing for the Design Defence, Part 2 covers my tips for the Design Scenario part of the VCDX defence.

After a short break following your 75min Design defence, your neck deep in the Design Scenario. You are presented with a scenario which you need to demonstrate your abilities to gather requirements and while you will not be able to complete a design in 30mins, you should be able to demonstrate the methodology you use to start the process.

As mentioned in Part 1, I am not a official panellist and I do not know how the scoring works. The below is my advice based on conducting mock panels, the success rate of candidates I have conducted mock panels with and my successfully achieving VCDX on the 1st attempt.

Common Mistakes

1. Not gathering and identifying requirements/constraints & risks

The design scenario is very high level, and does not provide you with all the information required to be able to properly start a design. Not identifying and clarifying requirements/constraints and risks will in most cases prevent a candidate from successfully being able to start the design process.

Note: The word “Start” is underlined! You can’t start a design without knowing what your designing for… so don’t make this mistake.

2. Not documenting the requirements/constraints & risks

Assuming you have not made Mistake #1, and you have gathered and clarified the requirements/constraints & risks, the next mistake is not to write them down. I have seen many candidates do an excellent job of gathering the information, to then fall in a heap because they waste time asking the same questions over again because the have forgotten the details.

30 mins is not a long time, you cannot afford to waste time repeating questions.

3. Going down a rabbit hole

I have observed many candidates who are clearly very knowledgeable, who have spent 10-15 mins talking about one topic, such as HA and going into admission control options and pros/cons, isolation response etc. They demonstrated lots of expertise, but this did not help getting as much progress as possible into a design within the time constraint.

The design may be excellent in one key area (eg: HA) but severely lacked in all other areas, which would certainly led to a low score in the design scenario.

4. Not adjusting to changes

The information given to you in the design scenario may not always be correct and may even change half way through the design. Just like in a customer meeting, the customer doesn’t always know the answers to your questions, and may give you an incorrect answer, or simply not know the answer, then later on, realise they gave you incorrect information and correct themselves.

I deliberately throw curve-balls into mock design scenarios and I have observed several times a candidate be say 25 mins into the design scenario and this happens, and they failed to adjust for whatever reason/s.

5. Being Mute!

I have seen candidates who stand starring at the whiteboard, or drawing away madly, while completely mute. Then after 5-10 mins of drawing/thinking candidates then talk about what they came up with.

Do you stand in customer meetings mute? No! (Well, you shouldn’t!)


Tips for the Design Scenario

1. Clarify the Requirements/Constraints

Start by clarifying the information that has been provided to you. The information provided may be contradictory, so get this sorted before going any further.

2. Write the requirements/constraints & risks on the Whiteboard

Once you have clarified a piece of information that has been provided to you, write it on the whiteboard under section heading, such as:

a) Requirements

b) Constraints

c) Risks

d) Assumptions

Now, you can quickly review these items, without having to remember everything and if a curve-ball is thrown at you, you can cross out the incorrect information and write down the correct info and this may assist you modifying your design to cater for the changed requirement/constraint etc.

As you work through the scenario, you may be able to clarify an assumption, so you can remove it as an assumption/risk, this shows your working towards a quality outcome.

3. Write down your decisions!

Ensure you address each of the key areas of a vSphere solution by writing on the whiteboard headings like the following:

a) Storage

b) Networking

c) Compute

d) Availability

e) Datacenter

Ensure you write down at least 3 items per section, so you are covering off the entire environment.

As you make a design choice, write it down, eg: under storage, you may be recommending or constrained to use iSCSI, so write it down. iSCSI / Block storage.

So, aim to have 5 section headings like the above examples, and at  least 3 items per heading by the end of 30 mins. If you do the math, that’s only 6 mins per section, or 2 mins per item so make them count.

eg: Availability does not just mean N+1 vSphere cluster, what about say, environmental items such as UPS? A successful VCDX level design is not just about vSphere.

4. Verbalize your thought process.

I cannot give you advise, if I don’t know what your thinking! Same with the panellists, they can’t score you if you don’t verbalize your thought process.

No matter what, keep thinking out loud, if your working through options in your mind, that’s what the panel want’s to hear, so let them hear it!

If you are mute for a large portion of the 30 mins, the lower the chances you have of increasing your score.

5. Show how you adjust to changes in requirements/constraints/assumptions!

As a VCDX candidate, your most likely an architect day to day, so you would have dealt with this many times in real life, so deal with it in the design scenario!

If your 25 mins into the design scenario, and the panel suddenly tells you the CIO went out drinking on the weekend with his new buddy at storage vendor X and decided to scrap the old vendors storage and go for another vendor, deal with it!

Talk about the implications of moving from vendor X to vendor Y, for example FC to NFS and how this would change the design and would it still meet the requirements or would it be a risk?

6. Don’t be afraid to draw diagrams – but don’t spend all day making it pretty!

Use the whiteboard to draw your solution as it develops, but don’t waste time drawing fancy diagrams. A square box with ESXi written in it, is a Host, it doesn’t need to be pretty.

eg: If your drawing a 16 node cluster, draw three squares, Labelled ESXi01, ESXi…. and ESXi16, don’t draw 16 boxes, this adds no value, wastes time, and makes the diagram harder to draw.



I hope the above tips help you prepare for the VCDX design scenario and best of luck with your VCDX journey. For those who are interested, you can read about My VCDX Journey.

In Part 3, I will go through Preparing for the Troubleshooting Scenario, and how to maximize your 15 mins.

VCDX Defence Essentials – Part 1 – Preparing for the Design Defence

Even before I achieved my VCDX in May 2012, I had been helping VCDX candidates by doing design reviews and more importantly conducting mock panels.

So over the last couple of years I would estimate I would have been involved with at least 15 candidates, which range from a mock panel and advice over a WebEx, to mentoring candidates through their entire journey.

I would estimate I have conducted easily 30+ mock panels, from which I have decided to put together the most common mistakes candidates make, along with my tips for the defence.

Note: I am not a official panellist and I do not know how the scoring works. The below is my advice based on conducting mock panels, the success rate of candidates I have conducted mock panels with and my successfully achieving VCDX on the 1st attempt.

Common Mistakes

1. Using a Fictitious Design

In all cases where I have done a mock panel for a candidate using a Fictitious design, even in cases where I did not know it was fictitious, it becomes very obvious very quickly.

The reason it is obvious for a mock panellist is due to the lack of depth the candidate can go into about the solution, for example, requirements.

In my experience, candidates using fictitious designs generally take multiple attempts before successfully defending.

This may sound harsh, but if you need to use a fictitious design, you probably don’t have enough architecture experience for VCDX, otherwise you would be able to choose from numerous designs to submit, rather than creating a fictitious design.

Some candidates use fictitious designs for privacy or NDA reasons, in this case, I would strongly recommend you should be able to remove customer specific details and defend a real design.

Note: For those VCDXs who have passed with a fictitious design, I am not in any way taking away from there achievement, if anything, they had more of a challenge that people like me who used a real design.

2. Giving an answer of “I was not responsible for that portion of the design”.

If you give this answer, you are demonstrating that you do not have an expert level understanding of the solution as a whole, which translates to Risk/s as the part of the design you were responsible for may not be compatible with the component/s you were not involved in.

A VCDX candidate may not always be the lead architect on a project, but a VCDX level candidate will always ensure he/she has a thorough understanding of the total solution and will ask the right questions of other architects involved with the project to gain at least a solid level of understanding of all parts of the solution.

With this understanding of other components of the total solution, a candidate should be able to discuss in detail how each component influenced other areas of the design, and what impact (positive or negative) this had on the solution.

3. Not knowing your design! 

This is the one which surprises me the most, if your considering submitting for VCDX, or already have submitted and been accepted, you should already know your design back to front, including the areas which you may not have been responsible for.

You should not be dependant on the power point presentation used in your defence as this is really for the benefit of the panellists, not for you to read word for word.

Think about the VCDX panel this way, You (should) know more about your design that the panel does for the simple reason, its your design.

So you have an advantage over the panellists – ensure you maximize this advantage by knowing your design back to front.

If you cannot comfortably talk about your design for 75mins without referring to reference material, you probably should review your design until you can.

4. Not having clear and concise answers of varying depths for the panellists questions

I hear you all saying, how the hell do I know what the panel will ask me? As a VCDX candidate preparing to defend, your basically saying, I am an Expert in virtualization and I want to come and have my expertise validated.

As an Expert (not a Professional or Specialist, but an EXPERT) you should be able to go through your design, and with your reviewers hat on, write down literally 50+ questions that you would ask if you were reviewing this document for somebody else, or indeed, acting as a real or mock panellist.

In my experience, I predicted approx 80% of the questions the panel ended up asking me, which made my defence a much less stressful experience than it may have been otherwise.

Once you have written down these questions (seriously, 50+), you should ask yourself those questions and ensure you have answers to them. The answers you should have, or prepare should be

a) 1st Level – 30 seconds or less which cover the key points at a high level

b) 2nd Level – A further 30 – 60 seconds which expands on (and does not repeat) the 1st level statements

c) 3rd Level – A further 30-60 seconds which is very detailed and shows your deep understanding of the topic.

I would suggest if you don’t have solid 1st and 2nd Level answers to the questions, your probably not ready for VCDX. The 3rd level questions, in some areas you should be strong and be able to go to this depth, in other areas, you may not, but you should prepare regardless and focus on your weak areas.

5. Giving BS answers

I can’t put this any nicer, if you think you can get away with giving a BS answer to the VCDX panel, or even a good mock panellist, your sorely mistaken.

It never ceases to amaze me, people seem to refuse to admit when they don’t know something – nobody knows everything, don’t be afraid to say, I Don’t know.

Don’t waste the precious time that you have to demonstrate your expertise by giving BS answers that will do nothing to help your chances of passing.

In a mock panel situation, take note of any questions your asked, which you dont know, or dont have strong answers too, and review your design, do some research and ensure you understand the topic in detail and can speak about it.

This may result in you finding a weakness in your design which even if your design has been accepted already, you have the chance to highlight these weaknesses in your defence and discuss the implications and what you would/could to differently – this is a great way to demonstrate expertise.

6. Not knowing about Alternatives to your design

If you work for Vendor X, and Vendor X has a pre packaged converged solution, with a cookie cutter reference architecture which you customize for each client, you could have been successfully deploying solutions for years and be an expert in that solution, but this alone doesn’t make you a VCDX, in fact it could mean quite the opposite.

If your solution is a vBlock, with Cisco UCS, EMC storage (FC/FCoE) and vSphere, how would your solution be impacted if the customer at the last min said, we want to use Netapp storage and NFS or what about if the customer dropped EMC/Netapp and went for Nutanix. How would the solution change, what are the pros and cons and how would this impact your vSphere design choices?

If you can’t talk to this, in detail, for example the Pros and Cons of for example

a) Block vs File based storage

b) Blade vs Rack mount

c) Enterprise Plus verses Standard Edition for your environment

d) Isolation response for Block verses IP storage

Then your not a Design Expert, your at best a Vendor X solution specialist.

If you do use a FlexPod, vBlock or Nutanix type solution with a reference architecture (RA) or best practices, you should know the reasons behind every decision in the reference architecture as if you were the person who wrote the document, not just customized it.

Tips for the defence

1. Answer questions before they are asked

Expanding on Common Mistake #4, as previously mentioned, you should be able to work out the vast majority of questions the panel will ask you, by reviewing your design and having others also review your work.

With this information in mind, as you present your architecture, use statements such as

“I used the following configuration for reasons X,Y,Z as doing so mitigated risks 1,2,3 and met the requirements R01,R02″.

This technique allows you to demonstrate expertise by showing you understand why you made a decision, the risks you mitigated and the requirements you met, without being asked a single question.

So what are the advantages of doing this?

a) You demonstrate expertise while saving time therefore maximizing your chance of a passing mark

b) You can prepare these statements, and potential avoid being interrupted and loosing your train of thought.

2. When asked a question, don’t be too long winded.

As mentioned in Common Mistake #4, preparing short concise answers is critical. Don’t give a 5 min answer to a question as this is likely to be wasting time. Give your level 1 answer which should cover the key concepts and decision points in around 30 seconds, and if the panel drills down further, give your Level 2 answer which expands on the Level 1 answer, and so on.

This means you can maximize your time to maximize your score in other areas. If the panel is not satisfied with your answer, they will ask it again, time permitting.

Which leads us nicely onto the next item:

3. If a question is asked twice, go straight to Level 2 answer

If the panel has asked you a question, and you gave the Level 1 answer, and later on your asked the same question again, its possible you gave an unclear or incorrect answer the first time, so now is your chance to correct a mistake or improve your score.

Think about then answer you gave previously, if you made a mistake for any reason, call it out by saying something like, “Earlier I mistakenly said X, however the fact/s are….”

This will show the panel you know you made a mistake, and you do in fact know the correct answer or the topic in question.

4. Near the end of your Defence give more detailed answers

In the first half of the 75mins, giving your Level 1 and maybe Level 2 answers allows you to save time and maximize your score across all areas.

As you get pass the half way mark and nearing the 20 min remaining mark, at this stage, you should have gone over most areas of your design and now is the chance to maximize your score.

When asked questions at this stage, I would suggest the Level 2/3 answers are what you should be giving. Where you may have given a good Level 1 answer, now is the chance to move from a good answer, to a great answer and maximize your score.


I hope the above tips help you prepare for the VCDX defence and best of luck with your VCDX journey. For those who are interested, you can read about My VCDX Journey.

In Part 2, I will go through Preparing for the Design Scenario, and how to maximize your 30 mins.


What does Exchange running in a VMDK on NFS datastore look like to the Guest OS?

In response to the recent community post “Support for Exchange Databases running within VMDKs on NFS datastores” , the co-authors and I have received lots of feedback, of which the vast majority has been constructive and positive.

Of the feedback received which does not fall into the categories of constructive and positive, it appears to me as if this is as a result of the issue is not being properly understood for whatever reason/s.

So in an attempt to help clear up the issue, I will show exactly what the community post is talking about, with regards to running Exchange in a VMDK on an NFS datastore.

1. Exchange nor the Guest OS is not exposed in any way to the NFS protocol

Lets make this very clear, Windows or Exchange has NOTHING to do with NFS.

The configuration being proposed to be supported is as follows

1. A vSphere Virtual Machine with a Virtual SCSI Controller

In the below screen shot from my test lab, the highlighted SCSI Controller 0 is one of 4 virtual SCSI controllers assigned to this Virtual machine. While there are other types of virtual controllers which should also be supported, Paravirtual is in my opinion the most suitable for an application such as Exchange due to its high performance and low latency.


2. A Virtual SCSI disk is presented to the vSphere Virtual Machine via a Virtual SCSI Controller

The below shows a Virtual disk (or VMDK) presented to the Virtual machine. This is a SCSI device (ie: Block Storage – which is what Exchange requires)

Note: The below shows the Virtual Disk as “Thin Provisioned” but this could also be “Thick Provisioned” although this has minimal to no performance benefit with modern storage solutions.


So now that we have covered what the underlying Virtual machine looks like, lets see what this presents to a Windows 2008 guest OS.

In Computer Management, under Device Manager we can see the expanded “Storage Controllers” section showing 4 “VMware PVSCSI Controllers”.


Next, still In Computer Management, under Device Manager we can see the expanded “Disk Drives” section showing a number of “VMware Virtual disk SCSI Disk Devices” which each represent a VMDK.








Next we open “My Computer” to see how the VMDKs appear.

As you can see below, the VMDKs appear as normal drive letters to Windows.



Lets dive down further, In “Server manager” we can see each of the VMDKs showing as an NTFS file system, again a Block storage device.


Looking into one of the Drives, in this case, Drive F:\, we can see the Jetstress *.EDB file is sitting inside the NTFS file system which as shown in the “Properties” window is detected as a “Local disk”.


So, we have a Virtual SCSI Controller, Virtual SCSI Disk, appearing to Windows as a local SCSI device formatted with NTFS.

So what’s the issue? Well as the community post explains, and this post shows, there isn’t one! This configuration should be supported!

The Guest OS and Exchange has access to block storage which meets all the requirements outlined my Microsoft, but for some reason, the fact the VMDK sits on a NFS datastore (shown below) people (including Microsoft it seems) mistakenly assume that Exchange is being serviced by NFS which it is NOT!



I hope this helps clear up what the community is asking for, and if anyone has any questions on the above please let me know and I will clarify.

Related Articles

1. ”Support for Exchange Databases running within VMDKs on NFS datastores

2. Microsoft Exchange Improvements Suggestions Forum – Exchange on NFS/SMB

Virtualizing Exchange on vSphere with NFS backed storage?

For many years, customers have been realising the benefits of file based storage from one or more of the many storage vendors offering NFS.

NFS makes a ton of sense for virtualization, and virtualizing Business Critical applications such as Exchange, along with the rest of a company’s servers, can be a great way to reduce complexity and save on CAPEX/OPEX.

However, some vendors, have licensing or support statements which make this more difficult than it needs to be.

One such vendor is Microsoft.

Microsoft currently don’t support Exchange running inside a VMDK on an NFS datastore, even though the VMDK is a virtual SCSI device and acts/performs the same as if it was on a block based LUN, such as FC/FCoE or iSCSI.

I decided to reach out to a bunch of great guys in the virtualization community to try and get some awareness of this issue, and get Microsoft to update the outdated and technically invalid support statement.

As a result, the following TechNet forum article has been posted

Support for Exchange Databases running within VMDKs on NFS datastores

There is also a suggestion in the Microsoft Product improvement forum on the same topic, which as a result of the communities efforts in the past few weeks, have seen it sky rocket to the #1 improvement suggestion to microsoft.

The post and voting can be found here.

Support storing Exchange datat on VMDKs on File shares (NFS/SMB)

So please check out these two articles, and vote and leave your comments in support of this issue. Supporting Exchange in VMDKs on NFS is a No lose situation for customers, and that is what it is all about!

Unlimited VMs per datastore? Its not a myth with Nutanix!

For many years, I have been asked on countless occasions questions relating to how many VMs can (or should) be placed in one datastore.

In fact, just this morning I was asked this same question, and I decided to whip up a quick post.

I have previously posted an Example Architectural Decision relating to Datastore sizing for Block based storage. What this example was aimed to show was a how things like RPO/RTO and performance should be taken into consideration when choosing a datastore size.

The above example is not a hard and fast rule, but an example of one deployment which I was involved in.

There is a great article written on this topic by VCDX, Jason Boche (@jasonboche), titled  ”VAAI and the Unlimited VMs per Datastore Urban Myth” which covers in great detail this topic as it relates to block based storage, being iSCSI, FC & FCoE.

But what about NFS, and what about with Hyper-converged solutions like Nutanix?

NFS has gained significant popularity in recent years, and in my opinion, people who know what they are talking about, no longer refer to NFS as “Tier 3 Storage” which was once common.

With traditional storage solutions, generally only a smaller number of controllers can actively serve IO to the one NFS mount, so the limiting factor preventing running more virtual machines per NFS mount, in my experience was performance but things like RPO/RTO were and are important considerations.

NFS does not suffer from SCSI reservations which resulted in increased latency ,which is what VAAI, specifically the Atomic Test & Set or ATS primitive helped too all but eliminate for block based datastores.

LUNs are limited by there queue depth, which in most cases is 32 (sometimes 64). This is also a limiting factor, as all the VMs in a datastore (LUN) share the same queue which can lead to contention. SIOC helps manage the contention by ensuring fairness based on share values, but it does not solve the issue.

NFS on the other hand has a much larger queue depth, in fact its basically unlimited as shown below.


So as NFS does not suffer from SCSI reservations, or queue depth issues, what is limiting us having hundreds or more VMs per datastore?

It comes down to how many active storage controllers are able to service the NFS mount, and the performance of the storage controller/s. In addition to this your business requirements around RPO/RTO. In other words, if a NFS mount is lost, how quickly can you recover.

For most traditional shared storage products,

1. Have only 1 or 2 active controllers – thus potentially limiting performance which would lead to lower VMs per NFS datastore.

2. Do snapshots at the NFS mount layer, so if you need to recover an entire NFS mount, the larger it is, the longer it may take.

For Nutanix, by default, NFS is used to present the Nutanix Distributed File System (NDFS) to vSphere, however the key difference between Nutanix and traditional shared storage is every controller in the Nutanix cluster, can and does Actively serve IO to any datastore in the cluster concurrently.

So the limit from a performance perspective is gone thanks to Nutanix scale out, shared nothing architecture, with one virtual storage controller (CVM) per Nutanix node. The number of nodes that’s can be scaled too, is also unlimited. An example of Nutanix ability to scale can be found here – Scaling to 1 million IOPS and beyond, Linearly!

Next what about the RPO/RTO issue? Well, Nutanix does not rely on LUNs or NFS mounts for our data protection (or snapshots), this is all done at a VM layer so your RPO/RTO is now per VM, which gives you much more flexibility.

With Nutanix, you can literally run hundreds or even thousands of VMs per NFS datastore, without performance or RPO/RTO problems thanks to scale out, shared nothing architecture and the Nutanix Distributed File System.

There are some reasons why you may choose to have multiple NFS datastores even in a Nutanix environment, these include, if you want to enable Compression and/or De-duplication which are enabled/disabled on a per container (or datastore) level. As some workloads don’t compress or dedupe well, these types of workloads should be excluded to reduce the overhead on the cluster.

It is important to note, Nutanix uses a concept called a “Storage Pool” which contains all the storage for the Nutanix cluster. On top of a “Storage Pool” you create “Containers” (or datastores). This means regardless of if you have 1 or 100 datastores, they all still sit on top of the one “Storage Pool” which means you still have access to the same amount of storage capacity, with no silos for maximum capacity utilization (and performance!).

Lastly, Nutanix does not suffer from the same availability concerns as traditional shared storage where a single LUN could potentially be lost. This is due to the distributed architecture of the Nutanix solution. For more information on how Nutanix is more highly available than traditional shared storage, check out “Scale out, Shared Nothing Architecture Resiliency by Nutanix

Check out a screen shot of one cluster with ~800 VMs on a single datastore. Note: The sub millisecond latency and 14K IOPS w/ ~900MBps throughput. Not bad!


My VCAP5-CID (Cloud Infrastructure Design) Exam Experience

Yesterday (17th December 2013) I sat and passed my VMware Advanced Certified Professional 5 – Cloud Infrastructure Design exam, a.k.a VCAP5-CID.

Having sat 4 other VCAP exams, including 3 design exams (DCD4,DCD5 & DTD5) I was confident on what to expect in regards to the exam format, the visio style design tool and the fact that time management has always been key.

So the exam is (as per the blueprint which can be found here)

115 Questions including a mix of multiple-choice, drag-and-drop items and specialized design items

195 Minutes
So lets break this down a bit, 195 mins divide 115 questions is 1.6 mins (or 100 seconds) per question, that’s not a lot when you have 6 x visio style designs to create which can take 5-10 mins each.

So this brings me straight to the first Tip.

Tip # 1 – Time Management

As of yesterday you still cannot go back and review previous questions/answers, so you must move through the exam to be able get to & answer the valuable visio style design and also the drag/drop questions.

Allow for 5-10 mins per Visio style question (These count big on the score, DO NOT RUSH THEM!!)
Allow for 2-5 mins per Drag and Drop style question (maybe 10 in the exam)
Multiple Choice questions you should spent between 20-45 seconds on maximum – If you don’t know the answer, have an educated guess and move on, its not the end of the world if you get some multiple choice questions wrong.

I must say I always like getting visio questions early on, as these are well known to make up a significant part of the score (~50%) and I don’t like being in a position where I have to rush something I know is important.

In this case, my visio style questions where spread evenly throughout the exam, and the last of the 6 was in the last 10 questions, so make sure you manage your time so you can get to, and hopefully answer correctly ALL the visio style questions.

Tip # 2 – Know the Blueprint (properly!)

I found quite a few things I glossed over in the blueprint were covered fairly well in the exam so be prepared to be tested on a wide range of vCloud related topics.

So while you may have good experience in designing vCloud Environments, if you don’t for example work for a service provider, you may have not had much (or any) experience with Chargeback, but this is a part of a vCloud solution and is rightly covered on the exam.

These types of things may catch you off guard, at the depth of some of the questions, but hey, this is a VCAP level exam, not VCP level, so its no meant to be easy.

Tip # 3 – Create a Study Group

I’ll be honest, I felt I had a pretty good preparation for the exam, albeit with some significant distractions in my personal life, and this was because I worked in a study group with two great guys (@Grantorchard & @wheatcloud), who have years of industry experience which made for excellent debates throughout the study process.

Working in a study group is what I credit at least some of my being able to successfully achieve VCDX on the first attempt. In this case, it helped me identify my own weaknesses (yes even VCDXs have weaknesses!) so I could brush up on those areas.

So get a group of people together and work towards VCAP-CID over weeks or months depending on your groups level  of experience.

Tip # 4 – Whiteboard vCloud Solutions

I would recommend for anyone taking the VCAP-CID (or in fact the VCAP-DCD or VCAP-DTD) spend some time on a whiteboard, drawing things like

1. vApp / OrgVDC and External Networking
2. Highly available Chargeback solutions
3. vSphere to Provider VDC to OrgVDC solutions

Get the study group take turns to pose scenarios for one group member to whiteboard a possible solution and discuss what is drawn and the pros/cons and if the solution meets the requirements or not. This will help you practice turning scenarios into diagrams, which you need to be able to do quickly in the exam or you risk running out of time.

General Comments

Overall I would say the VCAP-CID was the least refined VMware exam I have sat, and in fairness this is probably due to the exam being quite new, and im sure a much lower number of participants than other VCAP exams like DCD and DCA.

I spoke with the team who develop the exam and they were very pleased to get feedback on the exam, and much to there credit, acknowledged that most of my feedback was at least in part justified. I hope my feedback will help make the VCAP-CID a better exam, like the rest of the VCAPs.

I found the visio style design tool in at least one case, could not do what I was trying to due which may be a bug with the tool or similar, but this I believe prevented me from completing the question & potentially scoring higher.

I found quite a number of questions (both visio style , drag/drop and multiple choice) appeared (and I say appeared as you don’t have time to re-read every question 5 times to clarify the question) not to have sufficient information to choose between say Option A and Option B – which led to my having to make an assumption, or simply guess.

I think as more and more people sit the exam, as long as feedback is captured by as many participants as possible, the exam could quickly be brought up to the high standard of the other VCAP exams.

While this exam was not the best exam experience I’ve had, I would still recommend anyone who is involved with architecture of vCloud solutions to challenge yourself, prepare for and sit this exam.

vCloud will be around for many years to come, and over time vCAC will creep into the exam, or maybe have its own exam, but there is plenty of value testing your skills and certifying your advanced level knowledge of a major VMware product.

If you are up for the challenge, Best of luck with your VCAP-CID preparations and exam!