Ignore the nonsense on twitter, What does “NoSAN” mean?

Posted on December 8, 2014 by Josh Odgers

Every now and again I see nonsense on Twitter which I feel needs to be responded to. The reason I am responding today is to correct misinformation about what Nutanix NoSAN is.

Earlier today a competitor of Nutanix tweeted the following:

I responded to the above with the following tweet:

To which the person responded with this:

I responded with the below and the conversation ended with the following tweet:

So before I correct the misinformation, let me briefly explain what “SAN” is:

“SAN” or “Storage Area Network” describes the connectivity between a compute node and a storage device (such as a central storage array or disk system). You can, for example, buy SAN (or FC) switches from companies like Brocade.

However, over the years the I.T industry has, for whatever reason, made “SAN” mean “Central Disk System / Storage array”, so for the purpose of this post, “SAN” is a Traditional Centralized Storage array (SAN/NAS).

So let’s correct the misinformation:

Claim 1: With Nutanix there is a SAN that is auto managed.

Fact: There is no centralized storage with Nutanix

Nutanix software running NDFS (Nutanix Distributed File System) logically presents DAS storage as shared storage across 3 or more nodes via NFS or SMB 3.0 to ESXi, Hyper-V or KVM. Note: While Nutanix supports iSCSI, it’s not recommended as it creates unnecessary complexity and has no technical advantages.

All Nutanix nodes have local DAS storage which is presented logically as shared storage and there are no “central” Nutanix nodes.

Note: Nutanix nodes can connect to traditional central SAN/NAS storage (see : Can I use my existing SAN/NAS storage with Nutanix) but this is not Nutanix native architecture.

SANs also have key characteristics such as Zoning, Masking, LUNs and RAID. They also typically use Fibre Channel (FC) connectivity over dedicated fabrics, although this is not always the case.

With Nutanix, there is no:

1. Central storage (SAN or NAS based)
2. LUNs
3. LUN masking
4. Zoning
5. Storage Controller “Pairs”
6. Dedicated Storage Fabric
7. Silos of storage capacity
8. RAID

Therefore the statement about Nutanix being a SAN that is “auto managed” is simply incorrect.

If a SAN “auto manages” LUNs, Zoning, Masking etc., it’s just a smarter SAN. The problems with SAN (and NAS) cannot be solved by simply “masking” the complexity. (Pun intended)

Claim 2: NDFS is a distributed storage array.

Fact: NDFS is a file system, not a storage array.

Nutanix Distributed File System (NDFS) makes up part of the Nutanix solution. It is not a storage array and it is not centralized storage either.

Nutanix is a scale-out, shared-nothing platform where data is written locally on the node where the VM is running and in a distributed (not centralized) manner across nodes.

So what does NoSAN mean to me?

1. No centralized storage array
2. No LUNs, Zoning, Masking, RAID
3. No dedicated storage fabric (e.g.: Fibre Channel Switches)
4. Reduced complexity
5. No Silos of capacity
6. No Storage Controller “Pairs”

I could go on but I think you get the point.

In conclusion, don’t believe what you hear on social media (especially from competitors of a product) and do your own research and validate your findings from multiple sources.

Example Architectural Decision – Horizon View Desktop Power Policy for Linked Clones (1 of 2)

Posted on November 13, 2014 by Josh Odgers

Problem Statement

In a VMware Horizon View environment using persistent Linked Clones, Disposable disks are being used to redirect transient paging and temporary files to a separate VMDK.

What is the most suitable Desktop Pool setting to ensure storage overheads are reduced?

Assumptions

1. VMware View 4.5 or later
2. Recompose / Refresh cycles are infrequent
3. Desktop Usage concurrency within the pool is less than 100%
4. Memory Reservations are not being used.

Requirements

1. The environment must deliver consistent performance
2. Minimize the cost/utilization of shared storage

Motivation

1. Reduce complexity where possible.
2. Maximize the efficiency of the infrastructure

Architectural Decision

Set the Power Policy for all Linked Clone desktop pools to “Power Off”

Justification

1. Using disposable disks can save storage space by slowing the growth of linked clones and reducing the space used by powered off virtual machines.
2. Using the “Power Off” policy for the pool means at user logoff (or shutdown) the disposable disk will be refreshed, therefore reducing the capacity usage at the storage layer.
3. “Powered Off” VMs do not have a Virtual Machine SWAP file which will also reduce storage consumption.

Implications

1. Setting the policy to “Power Off” will result in more frequent power operations which may impact the performance of the storage and vCenter.
2. When a user attempts to log in to a desktop which has been powered off, there will be a delay while the VM powers on and boots before the user is logged in.
3. The peak concurrency rate of users will need to be understood to allow accurate storage planning for the VSWAP file.
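To illustrate implication 3, here is a rough sizing sketch. It relies on the general vSphere behaviour that a powered-on VM's swap file is sized at its configured memory minus any memory reservation, and on the decision above that desktops not in use are powered off and so have no swap file. The function name and the example figures (1000 desktops, 70% peak concurrency, 4 GB per desktop) are hypothetical and only there to show the arithmetic.

```python
def vswap_capacity_gb(desktops: int,
                      peak_concurrency_pct: int,
                      vm_memory_gb: float,
                      reservation_gb: float = 0.0) -> float:
    """Estimate the capacity consumed by VM swap files at peak concurrency.

    Assumes every powered-on desktop has a swap file equal to its
    configured memory minus its memory reservation, and that
    powered-off desktops consume no swap space.
    """
    if not 0 <= peak_concurrency_pct <= 100:
        raise ValueError("peak_concurrency_pct must be between 0 and 100")
    if reservation_gb > vm_memory_gb:
        raise ValueError("reservation cannot exceed configured memory")

    # Round up: a partially counted desktop still needs a whole swap file.
    powered_on = (desktops * peak_concurrency_pct + 99) // 100
    per_vm_swap_gb = vm_memory_gb - reservation_gb
    return powered_on * per_vm_swap_gb


# Hypothetical pool: 1000 desktops, 70% peak concurrency, 4 GB RAM each,
# no memory reservations (as per the assumptions of this decision).
peak_gb = vswap_capacity_gb(1000, 70, 4.0)
always_on_gb = vswap_capacity_gb(1000, 100, 4.0)
print(f"Power Off policy at peak: {peak_gb:.0f} GB, always on: {always_on_gb:.0f} GB")
```

The gap between the two figures is the swap capacity the “Power Off” policy avoids provisioning, which is why the peak concurrency number needs to be trustworthy.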

Alternatives

1. Increase the frequency of Recompose / Refresh / Rebalance operations
2. Set the Policy to “Take no power action” and schedule an Administrator task to periodically change the Power Policy to “Powered Off” during a maintenance window.
3. Set the Policy to “Ensure desktops are always powered on” and schedule an Administrator task to periodically change the Power Policy to “Powered Off” during a maintenance window.
4. Set the Policy to “Suspend” and schedule an Administrator task to periodically change the Power Policy to “Powered Off” during a maintenance window, however this will consume extra storage for the Suspend File.
5. Use Memory Reservations to reduce storage requirements for vSwap and leave Power Policy to “Always On”.

Related Articles:

The example architectural decision was contributed to by Travis Wood (@vTravWood) and was inspired by the following article:

1. Understanding View Disposable Disks by @vTravWood (Double VCDX #97 Desktop/Datacenter Virtualization)

1. Transparent Page Sharing (TPS) Configuration for VDI (1 of 2)

2. Transparent Page Sharing (TPS) Configuration for VDI (2 of 2)

Integrity of I/O for VMs on NFS Datastores – Part 1 – Emulation of the SCSI Protocol

Posted on November 3, 2014 by Josh Odgers

This is the first of a series of posts covering how the integrity of I/O is ensured for Virtual Machines writing to VMDK/s (Virtual SCSI Hard Drives) that reside on NFS storage presented via VMware’s ESXi hypervisor as a “Datastore”.

Note: To be crystal clear, this post is not talking about presenting NFS direct to Windows or any other guest operating system.

This process is patented (US7865663) by VMware and its inventors and on the patent the process is called “SCSI Protocol Emulation”.

This series will first cover the topics in a vendor agnostic manner, meaning I am talking in general about VMware + any NFS storage on the VMware HCL with NFS support.

Following the vendor agnostic posts, I will follow with a series of posts focusing specifically on Nutanix, as the motivation for the series was to cover off this topic for existing or potential Nutanix customers, some of whom are less familiar with NFS and have asked for clarification, especially around virtualizing Business Critical Applications (vBCA) such as Microsoft SQL and Exchange.

The diagram below shows how storage can be presented to an ESXi host and what this series will focus on.

A VM accesses its .vmx and .vmdk file/s via a datastore the same way, regardless of the underlying storage protocol (DAS SCSI, iSCSI, NFS, FCP).

In the case of NFS datastores, SCSI protocol emulation is used to allow the Guest Operating System (OS) and application/s to read and write via SCSI even when the underlying storage (which is abstracted by the hypervisor) is served via NFS which does not natively support the same commands.

Image Source: https://pubs.vmware.com/vsphere-50/index.jsp?topic=%2Fcom.vmware.vsphere.introduction.doc_50%2FGUID-2E7DB290-2A07-4F54-9199-B68FCB210BBA.html

In the following section, and throughout this series, many images shown are from the patent (US7865663) and are the property of the patent owners, not the author of this article.

The areas which I will be focusing on are the ones where there has been the most concern in the industry, especially for business critical applications such as Microsoft SQL and Microsoft Exchange. The concern is how the VM operating system and application/s (and their data integrity) are impacted when they issue SCSI commands while the storage is abstracted by the hypervisor and served via NFS, which does not have equivalent I/O commands to SCSI.

Some example areas of concern around the industry for VMs running on datastores backed by NFS are:

1. SCSI Aborts / Resets
2. Forced Unit Access (FUA) & Write Through
3. Write Ordering
4. Torn I/O (Writes + Reads)

In this first part, we will look at the SCSI Protocol Emulation process and discuss SCSI Aborts and Resets and how the SCSI protocol emulation process deals with these.

Below is a diagram showing the flow of an I/O request for a VM writing SCSI commands to a VMDK (formatted as NTFS) through the SCSI emulation process and through to the NFS storage.

The first few steps are, in my opinion, fairly self-explanatory. Where it gets interesting for me, and one of the points of contention among I.T professionals (being SCSI aborts), is described in the box labelled “550“.

If the SCSI command is an abort (which has no equivalent in the NFS protocol), the SCSI emulation process removes the corresponding request from the virtual SCSI request list created in the previous step (box labelled “540“).

The same is true if the SCSI command is a reset (which also has no equivalent in the NFS protocol): the SCSI emulation process removes the corresponding request from the virtual SCSI request list. This process is shown below in the box labelled “560”.

Next, let’s look at what happens if the SCSI “abort” or “reset” command is issued after the SCSI emulation process has already passed the command on to the storage, so that a reply now arrives for a command which the Guest OS / Application has aborted.

It’s quite simple: the SCSI emulation process receives a reply from the NFS server, looks up the corresponding tag in the Virtual SCSI request list and, because this tag no longer exists, drops the reply, thereby emulating a SCSI abort command.

The process is shown below from box labelled “710” to “720” and finishing at “730“.

In the patent, the above process is summed up nicely in the following paragraph.

Accordingly, a faithful emulation of SCSI aborts and resets, where the guest OS has total control over which commands are aborted and retried can be achieved by keeping a virtual SCSI request list of outstanding requests that have been sent to the NFS server. When the response to a request comes back, an attempt is made to find a matching request in the virtual SCSI request list. If successful, the matching request is removed from the list and the result of the response is returned to the virtual machine. If a matching request is not found in the virtual SCSI request list, the results are thrown away, dropped, ignored or the like.
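To make the paragraph above more concrete, here is a minimal Python sketch of the request list idea. It is only an illustration of the behaviour described in the patent text, not VMware's implementation: the class name, method names and the use of an integer tag are all my own assumptions, and a reset is modelled here as forgetting every outstanding request, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class VirtualScsiRequestList:
    """Hypothetical model of the list of requests sent to the NFS server."""
    outstanding: Dict[int, str] = field(default_factory=dict)

    def send(self, tag: int, command: str) -> None:
        # Record the request when it is forwarded to the NFS server.
        self.outstanding[tag] = command

    def abort(self, tag: int) -> None:
        # NFS has no abort, so the emulation just forgets the request.
        self.outstanding.pop(tag, None)

    def reset(self) -> None:
        # Assumption: a reset forgets all outstanding requests.
        self.outstanding.clear()

    def on_reply(self, tag: int, result: str) -> Optional[str]:
        # A reply is returned to the VM only if its request is still
        # listed; otherwise it is dropped, which emulates the abort.
        if tag in self.outstanding:
            del self.outstanding[tag]
            return result
        return None


requests = VirtualScsiRequestList()
requests.send(1, "WRITE")
requests.send(2, "READ")
requests.abort(1)                        # guest aborts the write
print(requests.on_reply(1, "write ok"))  # None: late reply is dropped
print(requests.on_reply(2, "data"))      # data: delivered to the VM
```

The point to notice is that nothing is ever sent to the NFS server to cancel the request. The guest keeps full control over what is aborted and retried simply because the emulation layer decides which replies it is allowed to see.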

So there we have it: that is how VMware’s patented SCSI Protocol emulation allows SCSI commands not supported natively by NFS to be honoured, therefore allowing applications dependent on block-based storage to be run successfully within a VM whose VMDK is backed by NFS storage.

Let’s recap what we have learned so far.

1. The SCSI Commands, abort & reset have no equivalent in the NFS protocol.
2. The VMware SCSI Emulation process handles SCSI commands not supported natively by NFS thanks to the Virtual SCSI Request List.
3. Guest Operating Systems and Applications running in Virtual Machines on ESXi issue native SCSI commands to the NTFS volume, which is presented to the VM via a VMDK and housed on an NFS datastore.
4. The underlying NFS protocol is not exposed to the Guest OS, Application/s or Virtual Machine.
5. The SCSI Commands, abort & reset are emulated by the hypervisor through removing these requests from the Virtual SCSI request list.

In part two, I will discuss Forced Unit Access (FUA) & Write Through.

Integrity of Write I/O for VMs on NFS Datastores Series

Part 1 – Emulation of the SCSI Protocol
Part 2 – Forced Unit Access (FUA) & Write Through
Part 3 – Write Ordering
Part 4 – Torn Writes
Part 5 – Data Corruption

Nutanix Specific Articles

Part 6 – Emulation of the SCSI Protocol (Coming soon)
Part 7 – Forced Unit Access (FUA) & Write Through (Coming soon)
Part 8 – Write Ordering (Coming soon)
Part 9 – Torn I/O Protection (Coming soon)
Part 10 – Data Corruption (Coming soon)

Related Articles

1. What does Exchange running in a VMDK on NFS datastore look like to the Guest OS?
2. Support for Exchange Databases running within VMDKs on NFS datastores (TechNet)
3. Microsoft Exchange Improvements Suggestions Forum – Exchange on NFS/SMB
4. Virtualizing Exchange on vSphere with NFS backed storage?

CloudXC

By Josh Odgers – VMware Certified Design Expert (VCDX) #90

Tag Archives: NAS
