Example Architectural Decision – Network Failover Detection Policy

Problem Statement

What is the most suitable network failover detection policy to be used on the vSwitch or dvSwitch NIC team/s in an environment which uses IP storage and has only 2 physical NICs per vSwitch or dvSwitch?

Assumptions

1. vSphere 5.0 or greater
2. Storage is presented to the ESXi hosts is NFS via Multi Switch Link Aggregation
3. A maximum of 2 physical NICs exist per dvSwitch
4. Physical Switches support “Link state tracking”

Motivation

1. Ensure a reliable network failover detection solution
2. Ensure Multi switch link aggregation can be used for IP storage

Architectural Decision

Enable “Link state tracking” on the physical switches and Use “Link Status”

Justification

1. To work properly, Beacon Probing requires at least 3 NICs for “triangulation”  otherwise a failed link cannot be determined.
2.“Link state tracking” can be enabled on the physical switch to report upstream network failures where an “edge” & “core” network topology is used, therefore preventing the link status from being OK when traffic cannot reach the destination due to an upstream failure
3. Beacon Probing and the “route based on IP hash” network load balancing option is not compatible which prevents a single VMKernel being able to use multiple interfaces for IP storage traffic

Implications

1. Link state tracking needs to be supported and enabled on the physical switches

Alternatives

1. Use “Beacon Probing”