Example Architectural Decision – Single Sign On Configuration for Single Site w/ Multiple vCenter Servers

Problem Statement

What is the most suitable deployment mode for vCenter Single-Sign On (SSO) in an environment where there is a single physical datacenter with multiple vCenter servers?

Requirements

1. The solution must be a fully supported configuration
2. Meet/Exceed RTO of 4 hours
3. Support Single Pane of glass management
4. Ability to scale for future vCenters and/or datacenters

Assumptions

1. All vCenter instances can access the same Authentication source (Active Directory or OpenLDAP)

2. The average number of authentications per second for each SSO instance is <30 (Configuration Maximum)

Constraints

1. vCenter servers reside in different network security zones within the datacenter

Motivation

1. Future proof the environment

Architectural Decision

1. Use “Multi-site” SSO deployment mode

2. Use one SSO instance per vCenter

3. Each SSO instance will reside with the vCenter on a Windows 2008 x64 R2 virtual machine in a vSphere cluster with HA enabled

4. Each SSO instance will use the bundled SQL database

5. (Optional) For greater availability, vCenter Heartbeat can be used to protect each SSO instance along with vCenter and the bundled SSO database

6. The Virtual Machine hosting vCenter/SSO will be 2vCPU and 10GB RAM to support vCenter/SSO/Inventory Service and an additional 2GB RAM to support the bundled SSO Database

7. Using the bundled SSO database ensures only a single vCenter Heartbeat deployment is required to protect each vCenter/SSO instance and reduce Windows licensing

Justification

1. To simplify the maintenance/upgrade process for vCenter/SSO as different versions of vCenter cannot co-exist with the same SSO instance

2. If “High Availability” mode is used it would prevent single pane of glass management

3. “High Availability” mode currently requires an SSL load balancer to be configured as well as manual intervention which can be complicated and problematic to implement and support

4. “Basic” mode prevents the use of Linked Mode which will prevent the management of the environment being single pane of glass

5. Where vCenter servers reside in different network security zones, Using Multi-site mode allows each SSO instance to use authentication sources that are as logically close as possible while supporting single pane of glass management. This should provide faster access to authentication services as each SSO instance is configured with Active Directory servers located in the same or logically closest network security zone/s.

6. If one instance SSO goes offline for any reason, it will only impact a single vCenter server. It will not prevent authentication to the other vCenter servers.

7. Reduce the licensing costs for Microsoft Windows 2008 by combining SSO and vCenter roles onto a single OS

Alternatives

1. Use “Basic” Mode, resulting in a standalone version of SSO for each vCenter server with no single pane of glass management

2. Use “High Availability” mode per vCenter

3. Use a shared “High Availability” mode for all vCenters in the datacenter

4. In any SSO configuration, Host the SSO database (per vCenter) on a Oracle OR SQL Server

5. Run SSO on a dedicated Windows 2008 instance with or without the SSO database locally

6. Run a single SSO instance in “Multi-Site” mode , use vCenter Heartbeat to protect SSO (including the database) and share the SSO instance with all vCenters

Implications

1. Where SSO is not protected by vCenter Heartbeat (optional), SSO for each vCenter is a Single point of failure where authentication to the affected vCenter will fail

2. “Multi-Site” mode requires the install-able version of SSO, which is Windows Only which prevents the use of the vCenter Server Appliance (VCSA) as it only supports basic mode.

Related Articles

1. vSphere 5.1 Single Sign On (SSO) deployment mode across Active/Active Datacenters

2. vSphere 5.1 Single Sign On (SSO) Architectural Decision Flowchart

3. Disabling Single Sign On – Dont Do It! – Michael Webster (VCDX#66) @vcdxnz001

CloudXClogo

 

 

Example Architectural Decision – VMware HA – Percentage of Cluster resources reserved for HA

Problem Statement

The decision has been made to use “Percentage of cluster resources reserved for HA” admission control setting, and use Strict admission control to ensure the N+1 minimum redundancy level is maintained. However, as most virtual machines do not use  “Reservations” for CPU and/or Memory, the default reservation is only 32Mhz and 0MB+overhead for a virtual machine. In the event of a failure, this level of resources is unlikely to provide sufficient compute to operate production workloads. How can the environment be configured to ensure a minimum level of performance is guaranteed in the event of one or more host failures?

Requirements

1. All Clusters have a minimum requirement of N+1 redundancy
2. In the event of a host failure, a minimum level of performance must be guaranteed

Assumptions

1. vSphere 5.0 or later (Note: This is Significant as default reservation dropped from 256Mhz to 32Mhz, RAM remained at 0MB + overhead)

2. Percentage of Cluster resources reserved for HA is used and set to a value as per Example Architectural Decision – High Availability Admission Control

3. Strict admission control is enabled

4. Target over commitment Ratios are <=4:1 vCPU / Physical Cores | <=1.5 : 1 vRAM / Physical RAM

5. Physical CPU Core speed is >=2.0Ghz

6. Virtual machines sizes in the cluster will vary

7. A limited number of mission critical virtual machines may be set with reservations

8. Average VM size uses >2GB RAM

9. Clusters compute resources will be utilized at >=50%

Constraints

1. Ensuring all compute requirements are provided to Virtual machines during BAU

Motivation

1. Meet/Exceed availability requirements
2. Minimize complexity
3. Ensure the target availability and performance is maintained without significantly compromising  over commitment ratios

Architectural Decision

Ensure all clusters remain configured with the HA admission control setting use
“Enable – Do not power on virtual machines that violate availability constraints”

and

Use “Percentage of Cluster resources reserved for HA” for the admission control policy with the percentage value based on the following Architectural Decision – High Availability Admission Control

Configure the following HA Advanced Settings

1. “das.vmMemoryMinMB” with a value of “1024″
2. “das.vmCpuMinMHz” with a value of “512”

Justification

1. Enabling admission control is critical to ensure the required level of availability.
2. The “Percentage of cluster resources reserved for HA” setting allows a suitable percentage value of cluster resources to reserved depending on the size of each cluster to maintain N+1
3.The potentially inefficient slot size calculation used with “Host Failures cluster tolerates” does not suit clusters where virtual machines sizes vary and/or where some mission Critical VMs require reservations

  • 4.
  • Using advanced settings “das.vmCpuMinMHz” & “das.vmMemoryMinMB” allows a minimum level of performance (per VM) to be guaranteed in the event of one or more host failures
  • 5.
  • Advanced settings have been configured to ensure the target over commit ratios are still achieved while ensuring a minimum level of resources in a the event of a host failure
  • 6.
  • Maintains an acceptable minimum level of performance in the event of a host failure without requiring the administrative overhead of setting and maintaining “reservations” at the Virtual machine level
  • 7.
  • Where no reservations are used, and advanced settings not configured, the default reservation would be 32Mhz and 0MB+ memory overhead is used. This would likely result in degraded performance in the event a host failure occurs.

Alternatives

1. Use “Specify a fail over host” and have one or more hosts specified
2. “Host Failures cluster tolerates” and set it to appropriate value depending on hosts per cluster without using advanced settings
3.Use higher Percentage values
4. Use Higher / Lower values for “das.vmMemoryMinMB” and “das.vmCpuMinMHz”
5. Set Virtual machine level reservations on all VMs

Implications

1. The “das.vmCpuMinMHz” advanced setting applies on a per VM basis, not a per vCPU basis, so VMs with multiple vCPUs will still only be guarenteed 512Mhz in a HA event

2. This will reduce the number of virtual machines that can be powered on within the cluster (in order to enforce the HA requirements)

CloudXClogo