Storage I/O Control (SIOC) configuration for Nutanix

Storage I/O Control (SIOC) is a feature introduced in VMware vSphere 4.1 that is designed to allow prioritization of storage resources during periods of contention across a vSphere cluster. This situation is often described as the "Noisy Neighbor" issue, where one or more VMs can have a negative impact on other VMs sharing the same underlying infrastructure.

For traditional centralized shared storage, enabling SIOC is a "no brainer", as even the default settings will ensure more consistent performance during periods of storage contention, with virtually no downside. SIOC does this by managing, and potentially throttling, the device queue depth based on the "Shares" assigned to each virtual machine, to ensure consistent performance across ESXi hosts.

The diagrams below show the impact on three (3) identical VMs with the same disk "Shares" values, with and without SIOC, in a traditional centralized storage environment (i.e. SAN/NAS).

Without Storage I/O Control

[Figure: NutanixWithoutSIOC]

With Storage I/O Control

[Figure: NutanixWithSIOC]

As shown above, where VMs have equal share values but reside on different ESXi hosts, the result without SIOC can be undesirable, with one VM having double the available storage queue compared to VMs residing on a different host. In comparison, SIOC ensures VMs with the same share value get equal access to the underlying storage queue.
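The arithmetic behind those diagrams can be illustrated with a short Python sketch. The numbers are purely hypothetical (a default device queue depth of 32 per host and the three equal-share VMs from the diagrams), and this is an approximation of the behaviour, not VMware's actual algorithm.

```python
# Illustrative arithmetic only: an approximation of how device queue access is
# shared among VMs with equal share values, with and without SIOC.
# Hypothetical layout matching the diagrams: VM1 and VM2 on HostA, VM3 on HostB.

DEVICE_QUEUE_DEPTH = 32  # assumed default per-host device queue depth
shares = {"VM1": 1000, "VM2": 1000, "VM3": 1000}
hosts = {"VM1": "HostA", "VM2": "HostA", "VM3": "HostB"}

# Without SIOC, each host divides only its own device queue among its local VMs.
for vm, host in hosts.items():
    local = [v for v, h in hosts.items() if h == host]
    portion = shares[vm] / sum(shares[v] for v in local)
    print(f"Without SIOC: {vm} gets ~{DEVICE_QUEUE_DEPTH * portion:.0f} queue slots")

# With SIOC, the cluster-wide queue for the datastore is divided by share value
# across all hosts, so equal shares mean equal access regardless of placement.
total_shares = sum(shares.values())
cluster_queue = DEVICE_QUEUE_DEPTH * len(set(hosts.values()))
for vm in shares:
    print(f"With SIOC:    {vm} gets ~{cluster_queue * shares[vm] / total_shares:.0f} queue slots")
```

Running this shows the lone VM on HostB getting roughly double the queue slots of the two VMs on HostA without SIOC, and all three VMs getting an equal portion once SIOC enforces the share values.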

While SIOC is an excellent feature, it was designed to address a problem which is no longer a significant factor with the Nutanix scale-out, shared-nothing architecture.

The issue of the "noisy neighbour", or storage contention across the Storage Area Network (SAN), is all but eliminated because every Datastore (or "Container" in Nutanix terminology) is serviced by every Nutanix Controller VM in the cluster, and under normal circumstances upwards of 95% of read I/O is serviced by the local Controller VM. Nutanix refers to this capability as "Data Locality".

Data Locality ensures data being written and read by a VM remains on the Nutanix node where the VM is running, reducing the latency of accessing data across a Storage Area Network and ensuring that a VM reading data on one node has minimal or no impact on a VM on another node in the cluster.

Write I/O is also distributed throughout the Nutanix cluster, which means no single node is monopolized by (write) replication traffic.

Storage I/O Control was designed around the concept of a LUN, or an NFS mount (from vSphere 5.0 onwards), served by a central storage controller, as was the typical deployment model for VMware vSphere environments in the past.

As such, by limiting the LUN queue depth, SIOC allows all VMs on the LUN either an equal share of the available queue or, by specifying "Share" values on a per-VM basis, prioritization of VMs based on importance.

By default, all virtual hard disks have a share value of "Normal" (1000 shares). Therefore, if an individual VM needs to be given higher storage priority, its share value can be increased.

Note: In general, modifying VM Virtual disk share values should not be required.
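Should a particular VM genuinely warrant a higher priority, the change can also be scripted. The following is a minimal pyVmomi sketch; the function name and the assumption of an already looked-up vm object are illustrative only, so verify the property names against your pyVmomi version before use.

```python
# Sketch only: set a custom disk share value on every virtual disk of a VM.
# Assumes `vm` is an existing vim.VirtualMachine object obtained via pyVmomi.
from pyVmomi import vim

def set_disk_shares(vm, share_value=2000):
    """Raise the storage share value of all virtual disks on `vm`."""
    spec = vim.vm.ConfigSpec()
    for dev in vm.config.hardware.device:
        if not isinstance(dev, vim.vm.device.VirtualDisk):
            continue
        change = vim.vm.device.VirtualDeviceSpec()
        change.operation = vim.vm.device.VirtualDeviceSpec.Operation.edit
        # Modify the disk's existing storage I/O allocation in place.
        dev.storageIOAllocation.shares = vim.SharesInfo(
            level=vim.SharesInfo.Level.custom, shares=share_value)
        change.device = dev
        spec.deviceChange.append(change)
    return vm.ReconfigVM_Task(spec=spec)
```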

As Nutanix has one Storage Controller VM (CVM) per node, all of which actively service I/O to the Datastore, SIOC is not required and provides no benefit.

For more information about SIOC in traditional NAS environments, see "Performance Implications of Storage I/O Control-Enabled NFS Datastores in VMware vSphere 5.0".

As such, for Nutanix environments it is recommended that SIOC be disabled and that DRS "anti-affinity" or "VM to host" rules be used to separate high I/O VMs.
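As an illustration of how that recommendation could be automated, below is a minimal pyVmomi sketch (hypothetical vCenter address and credentials) that disables SIOC on every datastore visible to vCenter. The IORM configuration spec and the ConfigureDatastoreIORM_Task method come from the vSphere API's StorageResourceManager; treat the exact pyVmomi type names as an assumption and verify them against your environment before relying on the script.

```python
# Sketch only: disable Storage I/O Control on all datastores in vCenter.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()          # lab use only; validate certificates in production
si = SmartConnect(host="vcenter.example.com",   # hypothetical vCenter and credentials
                  user="administrator@vsphere.local",
                  pwd="VMware1!",
                  sslContext=ctx)
try:
    content = si.RetrieveContent()
    srm = content.storageResourceManager
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Datastore], True)
    for ds in view.view:
        spec = vim.StorageResourceManager.IORMConfigSpec()
        spec.enabled = False                    # turn SIOC off for this datastore
        task = srm.ConfigureDatastoreIORM_Task(datastore=ds, spec=spec)
        print(f"Disabling SIOC on {ds.name} ({task.info.key})")
    view.Destroy()
finally:
    Disconnect(si)
```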

Example Architectural Decision – Datastore (LUN) Sizing with Block Based Storage

Problem Statement

In a vSphere environment, what is the most suitable Datastore (LUN) size to support both production and development workloads while ensuring minimum storage overhead and optimal performance?

Requirements

1. RTO 4hrs
2. RPO 12hrs
3. Support Production and Test & Development Workloads
4. Ensure optimal storage capacity utilization
5. Ensure storage performance is both consistent & maximized
6. Ensure the solution is fully supported
7. Minimize BAU effort (Monitoring)

Assumptions

1. Business critical applications are excluded
2. Block based storage
3. VAAI is supported and enabled
4. VADP backups are being utilized
5. vSphere 5.0 or later
6. Storage DRS will not be used
7. SRM is in use
8. LUNs & VMs will be thin provisioned
9. Average size VM will be 100GB and be 50% utilized
10. Virtual machine snapshots will be used, but not retained for > 24 hours
11. Change rate of average VM is <= 15% per 24 hour period
12. Average VM has 4GB Ram
13. No Memory reservations are being used
14. Storage I/O Control (SIOC) is not being used
15. Under normal circumstances storage will not be over committed at the storage array level.
16. The average maximum IOPS per VM is 125 (16KB I/O size), i.e. <= 2MBps per VM
17. The underlying storage has sufficient performance to cater for the average maximum IOPS per VM
18. A separate swap file datastore will be configured per cluster

Constraints

1. Must use existing storage solution (Block Based Storage)

Motivation

1. Increase flexibility
2. Ensure physical disk space is not unnecessarily wasted
3. Create a Scalable solution
4. Ensure high performance
5. Ensure high utilization of storage resources by reducing “islands” of unused capacity
6. Provide flexibility in the unit size of partial SRM failovers

Architectural Decision

The standard datastore size will be 3TB and contain up to 25 standard virtual machines.

This is based on the following

25 VMs per datastore x 100GB (assumes no over-commitment) = 2500GB

25 VMs w/ 4GB RAM = 100GB minus 0GB reservation = 100GB vswap space to be stored on the swap file datastore

25 VMs w/ snapshots of up to 15% = 375GB

Total = 2500GB + 375GB = 2875GB

Average capacity used per VM = 115GB
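The same sizing, expressed as a short Python sketch using only the figures above, makes it easy to re-check the numbers (including the per-VM queue depth referenced in the justification below) if any of the assumptions change.

```python
# Worked version of the sizing above: 25 VMs per datastore, 100GB average VMDK,
# no memory reservations (vswap on a separate datastore), up to 15% snapshot
# overhead, a 3TB standard datastore and the default LUN queue depth of 32.
vms_per_datastore = 25
avg_vmdk_gb       = 100
snapshot_overhead = 0.15
datastore_size_gb = 3 * 1024      # 3TB standard datastore
lun_queue_depth   = 32            # standard per-LUN queue depth

vmdk_capacity  = vms_per_datastore * avg_vmdk_gb          # 2500 GB
snapshot_space = vmdk_capacity * snapshot_overhead        # 375 GB
total_required = vmdk_capacity + snapshot_space           # 2875 GB
headroom       = datastore_size_gb - total_required       # ~197 GB remaining
per_vm_average = total_required / vms_per_datastore       # 115 GB per VM
queue_per_vm   = lun_queue_depth / vms_per_datastore      # 1.28 concurrent I/Os per VM

print(f"Required: {total_required:.0f}GB, headroom: {headroom:.0f}GB, "
      f"average per VM: {per_vm_average:.0f}GB, minimum queue slots per VM: {queue_per_vm:.2f}")
```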

Justification

1. In the worst case scenario, where every VM has used 100% of its VMDK capacity, has 4GB RAM with no memory reservation, and has a snapshot of up to 15% of its size, the 3TB datastore will still have approximately 197GB remaining, so it will not run out of space.
2. The queue depth is managed on a per datastore (LUN) basis, so having 25 VMs per LUN allows a minimum of 1.28 concurrent I/O operations per VM based on the standard queue depth of 32, although it is unlikely all VMs will issue I/O concurrently, so the average will be much higher.
3. Thin Provisioning minimizes the impact of situations where customers demand a lot of disk space up front when they only end up using a small portion of the available disk space
4. Using Thin provisioning for VMs increases flexibility as all unused capacity of virtual machines remains available on the Datastore (LUN).
5. VAAI automatically raises an alarm in vSphere if a thin provisioned datastore's usage reaches >= 75% of its capacity
6. SCSI reservations causing performance issues (increased latency) as thin provisioned virtual machines (VMDKs) grow is unlikely to be a problem for 25 low I/O VMs, and with VAAI it is no longer an issue at all, as the Atomic Test & Set (ATS) primitive removes the need for SCSI reservations.
7. As the VMs are low I/O it is unlikely that there will be any significant contention for the queue depth with only 25 VMs per datastore
8. The VAAI UNMAP primitive provides automated space reclamation to reduce wasted space from files or VMs being deleted
9. Virtual machines will be thin provisioned for flexibility; however, they can also be thick provisioned, as the sizing of the datastore (LUN) caters for the worst case scenario of 100% utilization while maintaining free space.
10. Having <=25 VMs per datastore (LUN) allows for more granular SRM fail-over (datastore groups)

Alternatives

1.  Use larger Datastores (LUNs) with more VMs per datastore
2. Use smaller Datastores (LUNs) with fewer VMs per datastore

Implications

1. When performing an SRM failover, the most granular failover unit is a single datastore, which may contain up to 25 virtual machines.

2. The solution (day 1) does not provide CapEx savings on disk capacity, but will allow (if desired) over-commitment in the future

Thanks to James Wirth (VCDX#83) @JimmyWally81 for his contributions to this example decision.

Related Articles

1. Datastore (LUN) and Virtual Disk Provisioning (Thin on Thick)

2. Datastore (LUN) and Virtual Disk Provisioning (Thin on Thin)

3. Virtual Machine vSwap Location


Example Architectural Decision – Storage I/O Control for Clusters Protected by SRM (Example 2 – Use SIOC)

Problem Statement

In an environment with one or more clusters containing virtual machines protected by SRM, what is the most appropriate configuration of Storage I/O Control?

Requirements

1. SRM solution must not be impacted

Assumptions

1. vSphere Version 4.1 or later

2. FC (Block) Based Storage OR NFS (File) based Storage

3. Number of datastores is fairly static

Constraints

1. Storage I/O Control can prevent the unmounting of a datastore during a recovery, which can lead to errors being reported by SRM

Motivation

1. Where possible ensure consistent storage performance for all virtual machines

Architectural Decision

Enable and configure Storage I/O Control for all datastores.

Set the congestion threshold to 20ms.

Leave the share values at their defaults.

Add a step to each SRM Recovery Plan as Step 1, selecting the step placement "Before selected step".

Configure the step type as "Command on SRM Server" to execute the scheduled task which disables SIOC prior to executing an SRM recovery.
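A sketch of the "enable and configure" portion of this decision is below, following the same pyVmomi pattern as the disable-SIOC example earlier in this document (connection handling omitted; si is assumed to be an existing ServiceInstance). The 20ms value comes from the decision above, and the scheduled task invoked by the recovery plan step can use the same loop with enabled set to False; as before, treat the exact pyVmomi type names as an assumption to verify.

```python
# Sketch only: enable SIOC with a 20ms congestion threshold on every datastore.
from pyVmomi import vim

def enable_sioc_all_datastores(si, threshold_ms=20):
    """Enable SIOC with a manual congestion threshold on all datastores."""
    content = si.RetrieveContent()
    srm = content.storageResourceManager
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Datastore], True)
    try:
        for ds in view.view:
            spec = vim.StorageResourceManager.IORMConfigSpec()
            spec.enabled = True
            spec.congestionThreshold = threshold_ms   # milliseconds, per the decision above
            # Note: on vSphere 5.1 and later the congestion threshold mode may also
            # need to be set to "manual" for a fixed millisecond value to apply.
            srm.ConfigureDatastoreIORM_Task(datastore=ds, spec=spec)
    finally:
        view.Destroy()
```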

Justification

1. The benefits of Storage I/O control can still be achieved without impact to the SRM solution

2. SIOC will not impact SRM failover as it will be disabled automatically as part of the SRM recovery plan

3. In the event the Protected site is lost, SIOC will not prevent failover

Implications

1. Increased complexity for the SRM solution

2. An additional step to execute a "Command on SRM Server" is required

3. A Scheduled Task will need to be set up and configured with the setting "Allow task to be run on demand"

4. A script to disable SIOC will need to be prepared and configured to cover all datastores

Alternatives

1. Enable Storage I/O control and leave default settings

2. Enable storage I/O control and set share values on virtual machines

3. Enable Storage I/O control and set a lower “congestion threshold”

4. Enable Storage I/O control and set a higher “congestion threshold”

5. Disable Storage I/O control

Related Articles

1. Example Architectural Decision – Storage I/O Control for Clusters Protected by SRM (Example 1 – Don't Use SIOC)