Example Architectural Decision – BC/DR Solution for vCloud Director

Problem Statement

What is the most suitable BC/DR solution for a vCloud director environment?

Requirements

1. Ensure the vCloud solution can tolerate a site failure in an automated manner
2. Ensure the vCloud solution meets/exceeds the RTO of 4hrs
3. Comply with all requirements of the Business Continuity Plan (BCP)
4. Solution must be a supported vSphere / vCloud Configuration
5. Ensure all features / functionality of the vCloud solution are available following a DR event

Assumptions

1. Datacenters are in an Active/Active configuration
2. Stretched Layer 2 network across both datacenters
3. Storage based replication between sites
4. vSphere 5.0 Enterprise Plus or later
5. VMware Site Recovery Manager 5.0 or later
6, vCloud Director 1.5 or later
7. There is no requirement for workloads proposed to be hosted in vCloud to be at one datacenter or another

Constraints

1. The hardware for the solution has already been chosen and purchased. 6 x 4 Way, 32 core Hosts w/ 512GB RAM and 4 x 10GB
2. The storage solution is already in place and does not support a Metro Storage Cluster (vMSC) configuration

Motivation

1. Meet/Exceed availability requirements
2. Minimize complexity

Architectural Decision

Use the vCloud DR solution as described in the “vCloud Director Infrastructure Resiliency Case Study” (By Duncan Epping @duncanyb and Chris Colotti @Ccolotti )

In Summary, Host the vSphere/vCloud Management virtual machines on an SRM protected cluster.

Use a dedicated cluster for vCloud compute resources.

Configure the vSphere cluster which is dedicated to providing compute resources to the vCloud environment (Provider virtual data center – PvDC) to have four (4) compute nodes  located at Datacenter A for production use and two (2) compute nodes located at Datacenter B (in ”Maintenance mode”) dedicated to DR.

Storage will not be stretched across sites; LUNs will be presented locally from “Datacenter A” shared storage to the “Datacenter A” based hosts. The “Datacenter A” storage will be replicated synchronously to “Datacenter B” and presented from “Datacenter B” shared storage to the two (2) “Datacenter B” based hosts. (No stretched Storage between sites)

In the event of a failure, SRM will recover the vSphere/vCloud Management virtual machines bringing back online the Cloud, then a script as the last part of the SRM recovery plan, Mounts the replicated storage to the ESXi hosts in “Datacenter B” and takes the two (2) hosts at “Datacenter B” out of maintenance mode. HA will then detect the virtual machines and power on them on.

Justification

1. Stretched Clusters are more suited to Disaster Avoidance than Disaster Recovery
2. Avoids complex and manual  intervention in the case of a disaster in the case of a stretched cluster solution
3. A Stretched cluster provides minimal control in the event of a Disaster where as in this case, HA simply restarts VMs once the storage is presented (automatically) and the hosts are taken out of Maintenance mode (also automated)
4. Having  two (2) ESXi hosts for the vCloud resource cluster setup in “Datacenter B” in “Maintenance Mode” and the storage mirrored as discussed  allows the virtual workloads to be recovered in an automated fashion as part of the VMware Site Recovery Manager solution.
5. Removes the management overhead of managing a strecthed cluster using features such as DRS affinity rules to keep VMs on the hosts on the same site as the storage
6. vSphere 5.1 backed resource clusters support >8 host clusters for “Fast provisioning”
7. Remove the dependency on the Metropolitan Area Data and Storage networks during BAU and the potential impact of the latency between sites on production workloads
8. Eliminates the chance of a “Split Brain” or a “Datacenter Partition” scenario where VM/s can be running at both sites without connectivity to each other
9. There is no specific requirement for non-disruptive mobility between sites
10. Latency between sites cannot be guaranteed to be <10ms end to end

Alternatives

1. Stretched Cluster between “Datacenter A” and “Datacenter B”
2. Two independent vCloud deployments with no automated DR
3. Have more/less hosts at the DR site in the same configuration

Implications

1. Two (2) ESXi hosts in the vCloud Cluster located in “Datacenter B” will remain unused as “Hot Standby” unless there is a declared site failure at “Datacenter A”
2. Requires two (2) vCenter servers , one (1) per Datacenter
3. There will be no non-disruptive mobility between sites (ie: vMotion)
4. SRM protection groups / plans need to be created/managed Note: This will be done as part of the Production cluster
5. In the event of a DR event, only half the compute resources will be available compared to production.
6. Depending on the latency between sites, storage performance may be reduced by the synchronous replication as the write will not be acknowledged to the VM at “Datacenter A” until committed to the storage at “Datacenter B”

CloudXClogo

 

 

Example Architectural Decision – Number of paths per LUN for VMFS datastores

Problem Statement

In a vSphere environment hosting a large number of VMs,  Virtual machines I/O requirements range from small <100 IOPS to large business critical applications with tens of thousands of IOPS, the ESXi hosts have been configured with 4 x 8Gb FC HBAs.

What is the most suitable number of paths per LUN when using 4 x 8GB FC connections per Host, and how will they be presented in a highly available manner with two (2) SAN Fabrics connected to an Active/Active Enterprise Disk array?

Requirements

1. All LUNs are available on all FC Interfaces
2. The storage be highly available
3. The environment should be able to continue running production workloads in the unlikely event of a dual port HBA, or single Fabric failure.
4. The environment maintain a consistent level of performance

Assumptions

1. The Storage area network has two (2) fabrics each of which is highly available
2. The disk system is presented to both SAN fabrics
3. The number of VMs per host is >100
4. vSphere 4.0 or later
5. Storage array is Active/Active
6. ESXi hosts are large and are designed to drive significant I/O
7. VAAI is supported and enabled

Constraints

1. Maximum paths supported per ESXi host is 1024
2. Maximum number of datastores per ESXi host is 256

Motivation

1. Ensure optimal performance redundancy
2. Maximum the total capacity able to be presented to a cluster

Architectural Decision

Use a standard of 8 paths per LUN

Each LUN will be presented to each HBA via both Controller A and Controller B resulting in two paths per LUN per HBA.

With a total of 4 FC connections across two (2) physical dual port HBAs in a HA configuration with one (1) connection per HBA per Fabric, this equates to a total of 8 paths per LUN to the ESXi host (4 paths per Fabric)

Justification

1. This equates to 4 paths (1 per HBA interface per LUN) per Fabric
2. The use of VMware NMP with “Round Robin” will be used and having all LUNs presented via both fabrics and all HBAs will provide the maximum reducing in latency and the most consistent performance overall
3. 8 paths per LUN ensures up to 128 LUNs can be presented within the 1024 paths per ESXi host limit which will support sufficient capacity for the cluster
4. The solution is highly available as it uses two fabrics and both controllers are Active
5. In the event of a Fabric failure, the remaining Fabric serving 2 x 8Gb connections will provide connectivity to both Controller A and B, with a total of 4 paths
6. Ensures the cluster can have enough LUNs to balance workloads across which will assist keeping latency at a minimum

Alternatives

1. Have less paths per LUN which enabled the use of more LUNs
2. Have more paths per LUN and have less LUNs

Implications

1. LUN sizes will need to be sizes to ensure a maximum of 128 LUNs are sufficient from a capacity perspective to cater for the desired number of virtual machines

vmware_logo_ads

Example Architectural Decision – Network I/O Control Shares/Limits for ESXi Host using IP Storage

Problem Statement

With 10GB connections becoming the norm, ESXi hosts will generally have less physical connections than in the past where 1Gb was generally used, but more bandwidth per connection (and in total) than a host with 1GB NICs.

In this case, the hosts have only to 2 x 10GB NICs and the design needs to cater for all traffic (including IP storage) for the ESXi hosts.

The design needs to ensure all types of traffic have sufficient burst and sustained bandwidth for all traffic types without significantly negatively impacting other types of traffic.

How can this be achieved?

Assumptions

1. No additional Network cards (1gb or 10gb) can be supported
2. vSphere 5.1
3. Multi-NIC vMotion is desired

Constraints

1. Two (2) x 10GB NICs

Motivation

1. Ensure IP Storage (NFS) performance is optimal
2.Ensure vMotion activities (including a host entering maintenance mode) can be performed in a timely manner without impact to IP Storage or Fault Tolerance
3. Fault tolerance is a latency-sensitive traffic flow, so it is recommended to always set the corresponding resource-pool shares to a reasonably high relative value in the case of custom shares.
4. Proactively address potential contention due to limited physical network interfaces

Architectural Decision

Use one dvSwitch to support all VMKernel and virtual machine network traffic.

Enable Network I/O control, and configure NFS and/or iSCSI traffic with a share value of 100 and ESXi Management , vMotion & FT which will have share value of 25. Virtual Machine traffic will have a share value of 50.

Configure the two (2) VMKernel’s for IP Storage on dvSwitch and set to be Active on one 10GB interface and Standby on the second.

Configure two VMKernel interfaces for vMotion on the dvSwitch and set the first as Active on one interface and standby on the second.

A single VMKernel will be configured for Fault tolerance and will be configured as Active on one interface and standby on the second.

For ESXi Management, the VMKernel will be configured as Active on the interface where FT is standby and standby on the second interface.

All dvPortGroups for Virtual machine traffic will be active on both interfaces.

Justification

1. The share values were chosen to ensure IP storage traffic is not impacted as this can cause flow on effects for the environments performance. vMotion & FT are considered important, but during periods of contention, should not monopolize or impact IP storage traffic.
2. IP Storage is more critical to ongoing cluster and VM performance than ESXi Management, vMotion or FT
3. IP storage requires higher priority than vMotion which is more of a burst activity and is not as critical to VM performance
4. With a share value of 25,  Fault Tolerance still has ample bandwidth to support the maximum supported FT machines per host of 4 even during periods of contention
5. With a share value of 25, vMotion still has ample bandwidth to support multiple concurrent vMotion’s during contention however performance should not be impacted on a day to day basis. With up to 8 vMotion’s supported as it is configured on a 10GB interface. (Limit of 4 on a 1GB interface) Where no contention exists, vMotion traffic can burst and use a large percentage of both 10GB interfaces to complete vMotion activity as fast as possible
6. With a share value of 25,  ESXi Management still has ample bandwidth to continue normal operations even during periods of contention
7. When using bandwidth allocation, use “shares” instead of “limits,” as the former has greater flexibility for unused capacity redistribution.
8. With a share value of 50,  Virtual machine traffic still has ample bandwidth and should result in minimal or no impact to VM performance across 10Gb NICs
9. Setting Limits may prevent operations from completing in a timely manner where there is no contention

Implications

1. In the unlikely event of significant and ongoing contention, performance for vMotion may affect the ability to perform the evacuation of a host in a timely manner. This may extend scheduled maintenance windows.
2. VMs protected by FT may be impacted

Alternatives

1. Use a share value  of 50 for IP storage traffic to more evenly share bandwidth during periods of contention. However this may impact VM performance eg: Increased CPU WAIT if the IP storage is not keeping up with the storage demand

Related Posts
1. Example VMware vNetworking Design for IP Storage (4 x 10GB NICs)
2. Example VMware vNetworking Design for IP Storage (2 x 100GB NICs)
3. Frank Denneman (VCDX) – Designing your vMotion Network – Multi-NIC vMotion & NIOC