Example Architectural Decision – Network I/O Control Shares/Limits for ESXi Host using IP Storage

Problem Statement

With 10GB connections becoming the norm, ESXi hosts will generally have less physical connections than in the past where 1Gb was generally used, but more bandwidth per connection (and in total) than a host with 1GB NICs.

In this case, the hosts have only to 2 x 10GB NICs and the design needs to cater for all traffic (including IP storage) for the ESXi hosts.

The design needs to ensure all types of traffic have sufficient burst and sustained bandwidth for all traffic types without significantly negatively impacting other types of traffic.

How can this be achieved?

Assumptions

1. No additional Network cards (1gb or 10gb) can be supported
2. vSphere 5.1
3. Multi-NIC vMotion is desired

Constraints

1. Two (2) x 10GB NICs

Motivation

1. Ensure IP Storage (NFS) performance is optimal
2.Ensure vMotion activities (including a host entering maintenance mode) can be performed in a timely manner without impact to IP Storage or Fault Tolerance
3. Fault tolerance is a latency-sensitive traffic flow, so it is recommended to always set the corresponding resource-pool shares to a reasonably high relative value in the case of custom shares.
4. Proactively address potential contention due to limited physical network interfaces

Architectural Decision

Use one dvSwitch to support all VMKernel and virtual machine network traffic.

Enable Network I/O control, and configure NFS and/or iSCSI traffic with a share value of 100 and ESXi Management , vMotion & FT which will have share value of 25. Virtual Machine traffic will have a share value of 50.

Configure the two (2) VMKernel’s for IP Storage on dvSwitch and set to be Active on one 10GB interface and Standby on the second.

Configure two VMKernel interfaces for vMotion on the dvSwitch and set the first as Active on one interface and standby on the second.

A single VMKernel will be configured for Fault tolerance and will be configured as Active on one interface and standby on the second.

For ESXi Management, the VMKernel will be configured as Active on the interface where FT is standby and standby on the second interface.

All dvPortGroups for Virtual machine traffic will be active on both interfaces.

Justification

1. The share values were chosen to ensure IP storage traffic is not impacted as this can cause flow on effects for the environments performance. vMotion & FT are considered important, but during periods of contention, should not monopolize or impact IP storage traffic.
2. IP Storage is more critical to ongoing cluster and VM performance than ESXi Management, vMotion or FT
3. IP storage requires higher priority than vMotion which is more of a burst activity and is not as critical to VM performance
4. With a share value of 25,  Fault Tolerance still has ample bandwidth to support the maximum supported FT machines per host of 4 even during periods of contention
5. With a share value of 25, vMotion still has ample bandwidth to support multiple concurrent vMotion’s during contention however performance should not be impacted on a day to day basis. With up to 8 vMotion’s supported as it is configured on a 10GB interface. (Limit of 4 on a 1GB interface) Where no contention exists, vMotion traffic can burst and use a large percentage of both 10GB interfaces to complete vMotion activity as fast as possible
6. With a share value of 25,  ESXi Management still has ample bandwidth to continue normal operations even during periods of contention
7. When using bandwidth allocation, use “shares” instead of “limits,” as the former has greater flexibility for unused capacity redistribution.
8. With a share value of 50,  Virtual machine traffic still has ample bandwidth and should result in minimal or no impact to VM performance across 10Gb NICs
9. Setting Limits may prevent operations from completing in a timely manner where there is no contention

Implications

1. In the unlikely event of significant and ongoing contention, performance for vMotion may affect the ability to perform the evacuation of a host in a timely manner. This may extend scheduled maintenance windows.
2. VMs protected by FT may be impacted

Alternatives

1. Use a share value  of 50 for IP storage traffic to more evenly share bandwidth during periods of contention. However this may impact VM performance eg: Increased CPU WAIT if the IP storage is not keeping up with the storage demand

Related Posts
1. Example VMware vNetworking Design for IP Storage (4 x 10GB NICs)
2. Example VMware vNetworking Design for IP Storage (2 x 100GB NICs)
3. Frank Denneman (VCDX) – Designing your vMotion Network – Multi-NIC vMotion & NIOC

Example Architectural Decision – Virtual Switch Load Balancing Policy

Problem Statement

What is the most suitable network adapter load balancing policy to be configured on the vSwitch & dvSwitch/es where 10Gb adapters are being used for dvSwitches and 1Gb for vSwitch which is only used for ESXi management traffic?

Assumptions

1. vSphere 4.1 or later

Motivation

1. Ensure optimal performance and redundancy for the network
2. Simplify the solution without compromising performance for functionality

Architectural Decision

Use “Route based on physical NIC load” for Distributed Virtual switches and “Route based on originating port ID” for vSwitches.

Justification

1. Route based on physical NIC load achieves both availability and performance
2. Requires only a basic switch configuration (802.1q and the required VLANs tagged)
3. Where a single pNIC’s utilization exceeds 75% the “route based on physical NIC load” will dynamically balance workloads to ensure the best possible performance

Implications

1. If NFS IP storage is used with a single VMKernel it will not use both connections concurrently. If using multiple 10GB connections for NFS traffic is required then two or more VLANs should be created with one VMK per VLAN. If only one VMK is used, the only option if you want traffic to go down multiple uplinks would be to use “Route based on IP hash” and have Etherchannel configured on the physical switch.

Alternatives

1. Route based on the originating port ID

Pros: Chooses an uplink based on the virtual port where the traffic entered the virtual switch. The virtual machine outbound traffic is mapped to a specific physical NIC based on the ID of the virtual port to which this virtual machine is connected. This method is simple and fast, and does not require the VMkernel to examine the frame for necessary information.

Cons: When the load is distributed in the NIC team using the port-based method, no virtual machine single-NIC will ever get more bandwidth than can be provided by a single physical adapter.

2. Route based on IP hash.

Pros: Chooses an uplink based on a hash of the source and destination IP addresses of each packet. For non-IP packets, whatever is at those offsets is used to compute the hash. In this method, a NIC for each outbound packet is chosen based on its source and destination IP address. This method has a better distribution of traffic across physical NICs.

When the load is distributed in the NIC team using the IP-based method, a virtual machine single-NIC might use the bandwidth of multiple physical adapters.

Cons: This method has higher CPU overhead and is not compatible with all switches (it requires IEEE 802.3ad link aggregation support).

3. Route based on source MAC hash

Pros: Chooses an uplink based on a hash of the source Ethernet. This method is compatible with all physical switches. The virtual machine outbound traffic is mapped to a specific physical NIC based on the virtual NIC’s MAC address.

Cons: This method has low overhead, and might not spread traffic evenly across the physical NICs.

When the load is distributed in the NIC team using the MAC-based method, no virtual machine single-NIC will ever get more bandwidth than can be provided by a single physical adapter.

4. Use explicit fail-over order

Pros: Always uses the highest order uplink from the list of Active adapters which passes failover detection criteria.

Cons: This setting is equivalent to a fail over policy and is not strictly a load balancing policy.

5. Route based on Physical NIC load

Pros: Most efficient load balancing mechanism because it is base on the actual physical NIC workload.

Cons: Not available on standard vSwitches

For further information on the topic checkout the below two articles by a couple of very knowledgable VCDX’s

Michael Webster – Etherchanneling or Load based teaming?
Frank Denneman – IP Hash verses LBT