Example Architectural Decision Competition by VMware Press & Josh Odgers


Welcome to the Example Architectural Decision Competition!

VMware Press, in conjunction with JoshOdgers.com (CloudXC), wishes to announce this competition to find the most innovative and creative virtualization-related architectural decisions for real-world problems.

All submissions will be posted in this special section of JoshOdgers.com (CloudXC) with the goal of encouraging everyone to share their experiences for the benefit of the virtualization community.

All suitable example architectural decisions submitted to this competition will remain featured on this blog following the competition with credit being given to the author.

The competition will initially run for the next six (6) weeks and depending on the popularity of the competition it may be extended.

A winner will be announced fortnightly and will receive a printed copy of the VMware Press title of their choice.

The runner up will receive a voucher for a VMware Press eBook.

You can see the range of books VMware Press offer here.

If any other vendors wish to contribute prizes to this competition please add a comment to this page or contact me via Twitter (@josh_odgers).

All example architectural decision submissions must use the following format. Any submission without details for each of the following categories will be ineligible.

Problem Statement

Describe the problem or goal that the design decision relates to.

Assumptions

1. Assumption 1
2. Assumption 2
3. Assumption 3

Constraints

1. Constraint 1
2. Constraint 2
3. Constraint 3

Motivation

1. Motivation 1
2. Motivation 2

Architectural Decision

Details of Architectural Decision

Alternatives

1. Alternative 1
2. Alternative 2
3. Alternative 3

Justification

1. Justification 1
2. Justification 2
3. Justification 3
4. Justification 4
5. Justification 5

Implications

1. Implication 1
2. Implication 2

Example Architectural Decisions can be submitted via the following form.

Note: Limit of 3 submissions per person, per fortnight.

Winners will be announced on this blog and via Twitter on the following dates

October 17th
October 31st
November 14th

Good Luck!

COMPETITION ENDED.

Example Architectural Decision Competition – Submissions

All suitable example architectural decision submissions will be posted here. Please vote for your favourite decision by leaving a comment on this page with the example decision number.

SUBMISSIONS FOR ROUND 1 (Closed!)

1. TSM backup configuration for PureFlex environment?

2. Use of RDMs in Standard IaaS Clusters

3. Scalable network architecture for VXLAN

4. vCloud Allocation Pool Usable Memory

5. New vSphere 5.x environment

6. Improve Performance for BCAs on Cisco UCS

7. (More Coming Soon)

WINNER ROUND ONE: Use of RDMs in Standard IaaS Clusters by Chris Jones @cpjones44

This design decision works around some fairly strict constraints, such as no >2TB LUNs, no IP based storage & the inability to customize the monitoring solution.

While the decision is ultimately fairly straightforward, the submission documents the issue well, justifies the decision and discusses its implications in depth.

This is an example of a fairly obvious decision (considering the constraints), but it shows that even where a decision may be obvious, or the only option, understanding the implications is important. Documenting even obvious decisions is also important so that, in the event of movement within the team, the solution can be understood by people not involved in the original design process.

RUNNER UP ROUND ONE: TSM backup configuration for PureFlex environment? By Ash Simpson @Yipikaye1

Not unlike Chris Jones’ decision, Ash’s submission works within the constraints of an existing environment, where hardware and software have already been purchased. This is a common issue, where hardware / software is purchased before a detailed design phase. This is a huge problem in the industry and I encourage you all to ensure this trend does not continue. Without a detailed design phase, it is not possible to confirm what hardware/software is required. As such, hardware/software should only be purchased after the design is completed.

Again, this decision is fairly obvious given the constraints, but the submission explains the benefits of this method of configuration and discusses the implications, which is important.

The constraints did not list anything preventing purchasing of a different backup solution, although this is somewhat implied by the assumptions.

Congratulations to Chris Jones @cpjones44 & Ash Simpson @Yipikaye1!

Thank you to everyone who submitted design decisions. I encourage you all to submit new decisions for Round 2, and I look forward to new competition participants.

SUBMISSIONS FOR ROUND 2 (Closing 31st October 2013)

1. (More Coming Soon)

2. (More Coming Soon)

3. (More Coming Soon)

 

Example Architectural Decision – ESXi Host Hardware Sizing (Example 1)

Problem Statement

What are the most suitable hardware specifications for this environment's ESXi hosts?

Requirements

1. Support Virtual Machines of up to 16 vCPUs and 256GB RAM
2. Achieve up to 400% CPU overcommitment
3. Achieve up to 150% RAM overcommitment
4. Ensure cluster performance is both consistent & maximized
5. Support IP based storage (NFS & iSCSI)
6. The average VM size is 1vCPU / 4GB RAM
7. The cluster must support approx 1000 average size Virtual machines on day 1
8. The solution should be scalable beyond 1000 VMs (Future-Proofing)
9. N+2 redundancy

Assumptions

1. vSphere 5.0 or later
2. vSphere Enterprise Plus licensing (to support Network I/O Control)
3. VMs range from Business Critical Applications (BCAs) to non-critical servers
4. Software licensing for applications being hosted in the environment is based on per vCPU OR per host where DRS “Must” rules can be used to isolate VMs to licensed ESXi hosts

Constraints

1. None

Motivation

1. Create a Scalable solution
2. Ensure high performance
3. Minimize HA overhead
4. Maximize flexibility

Architectural Decision

Use Two Socket Servers w/ >= 8 cores per socket with HT support (16 physical cores / 32 logical cores), 256GB RAM, 2 x 10GB NICs

Justification

1. Two socket 8 core (or greater) CPUs with Hyper-Threading will provide flexibility for CPU scheduling of large numbers of VMs with diverse vCPU counts, minimizing CPU Ready (contention)

2. Using Two Socket servers of the proposed specification will support the required 1000 average sized VMs with 18 hosts with 11% reserved for HA to meet the required N+2 redundancy.

3. A cluster size of 18 hosts will deliver excellent cluster (DRS) efficiency / flexibility with minimal overhead for HA (Only 11%) thus ensuring cluster performance is both consistent & maximized.

4. The cluster can be expanded with up to 14 more hosts (to the 32 host cluster limit) in the event the average VM size is greater than anticipated or the customer experiences growth

5. Having 2 x 10GB connections should comfortably support the IP Storage / vMotion / FT and network data with minimal possibility of contention. In the event of contention Network I/O Control will be configured to minimize any impact (see Example VMware vNetworking Design w/ 2 x 10GB NICs)

6. RAM is one of the most common bottlenecks in a virtual environment. With 16 physical cores and 256GB RAM, each host has 16GB of RAM per physical core. For the average sized VM (1vCPU / 4GB RAM), four VMs per core consume exactly that 16GB, so the CPU overcommitment target (up to 400%) is met with no RAM overcommitment, minimizing the chance of RAM becoming the bottleneck

7. In the event of a host failure, the number of Virtual machines impacted will be up to 64 (based on the assumed average size VM) which is minimal when compared to a Four Socket ESXi host which would see 128 VMs impacted by a single host outage

8. If using Four socket ESXi hosts, the cluster size would be approx 10 hosts and 20% of cluster resources would have to be reserved for HA to meet the N+2 redundancy requirement. This cluster size is less efficient from a DRS perspective and the HA overhead would equate to higher CapEx and, as a result, a lower ROI

9. The solution supports Virtual machines of up to 16 vCPUs and 256GB RAM although this size VM would be discouraged in favour of a scale out approach (where possible)

10. The cluster aligns with a virtualization friendly “Scale out” methodology

11. Using smaller hosts (either single socket, or fewer cores per socket) would not meet the requirement to support Virtual machines of up to 16 vCPUs and 256GB RAM, would likely require multiple clusters, and would require additional 10GB and 1GB cabling as compared to the Two Socket configuration

12. The two socket configuration allows the cluster to be scaled (expanded) at a very granular level (if required), which reduces CapEx and minimizes the waste/unused cluster capacity that adding larger hosts would introduce

13. Enabling features such as Distributed Power Management (DPM) is more attractive and lower risk for larger clusters and may result in lower environmental costs (ie: Power / Cooling)
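
The host counts and HA percentages quoted in justifications 2, 4, 7 and 8 can be reproduced with some simple arithmetic. The following is a minimal sketch only: the function name `size_cluster` and its parameters are illustrative, it assumes exactly 8 cores per socket (the decision allows more), and it sizes on CPU alone using the 400% overcommitment target and the 1 vCPU average VM from the requirements.

```python
import math

MAX_HOSTS_PER_CLUSTER = 32  # cluster limit quoted in justification 4


def size_cluster(total_vms, cores_per_host, vcpu_per_vm=1,
                 cpu_overcommit=4.0, ha_spare_hosts=2):
    """Size a cluster on CPU only (illustrative helper, not a VMware tool)."""
    # vCPUs a host can carry at the target overcommitment ratio
    vms_per_host = int(cores_per_host * cpu_overcommit / vcpu_per_vm)
    # Hosts needed to run the workload, before adding HA spare capacity
    active_hosts = math.ceil(total_vms / vms_per_host)
    total_hosts = active_hosts + ha_spare_hosts
    return {
        "vms_per_host": vms_per_host,
        "total_hosts": total_hosts,
        # Fraction of cluster resources reserved for N+2
        "ha_overhead": ha_spare_hosts / total_hosts,
        "expansion_headroom": MAX_HOSTS_PER_CLUSTER - total_hosts,
    }


if __name__ == "__main__":
    for label, cores in (("Two socket", 16), ("Four socket", 32)):
        result = size_cluster(total_vms=1000, cores_per_host=cores)
        print(f"{label}: {result['total_hosts']} hosts, "
              f"{result['vms_per_host']} VMs per host, "
              f"{result['ha_overhead']:.1%} reserved for HA, "
              f"{result['expansion_headroom']} hosts of headroom")
```

Under these assumptions the two socket option works out to 18 hosts with roughly 11% reserved for HA, and the four socket option to 10 hosts with 20% reserved, matching the figures above. A real sizing exercise would also need to account for RAM, hypervisor overhead and peak (not average) demand.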

Alternatives

1. Use Four Socket Servers w/ >= 8 cores per socket, 512GB RAM, 4 x 10GB NICs
2. Use Single Socket Servers w/ >= 8 cores, 128GB RAM, 2 x 10GB NICs
3. Use Two Socket Servers w/ >= 8 cores, 512GB RAM, 2 x 10GB NICs
4. Use Two Socket Servers w/ >= 8 cores, 384GB RAM, 2 x 10GB NICs
5. Have two clusters of 9 hosts with the recommended hardware specifications
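
One way to compare the hardware alternatives is to check which resource limits the number of average sized VMs (1vCPU / 4GB RAM) per host. The sketch below is an illustration rather than part of the original decision: `HostSpec` and `capacity` are hypothetical names, each option is assumed to have exactly 8 cores per socket, CPU is counted at the 400% overcommitment target, and RAM is counted with no overcommitment.

```python
from dataclasses import dataclass

AVG_VM_VCPU = 1       # average VM size from the requirements
AVG_VM_RAM_GB = 4
CPU_OVERCOMMIT = 4.0  # 400% CPU overcommitment target


@dataclass
class HostSpec:
    name: str
    cores: int   # physical cores, assuming exactly 8 per socket
    ram_gb: int


def capacity(spec):
    """Return (VMs by CPU, VMs by RAM, limiting resource) for one host."""
    by_cpu = int(spec.cores * CPU_OVERCOMMIT / AVG_VM_VCPU)
    by_ram = spec.ram_gb // AVG_VM_RAM_GB  # no RAM overcommitment
    if by_cpu < by_ram:
        limit = "CPU"
    elif by_ram < by_cpu:
        limit = "RAM"
    else:
        limit = "balanced"
    return by_cpu, by_ram, limit


OPTIONS = [
    HostSpec("Two socket, 256GB (chosen)", 16, 256),
    HostSpec("Four socket, 512GB", 32, 512),
    HostSpec("Single socket, 128GB", 8, 128),
    HostSpec("Two socket, 512GB", 16, 512),
    HostSpec("Two socket, 384GB", 16, 384),
]

if __name__ == "__main__":
    for spec in OPTIONS:
        by_cpu, by_ram, limit = capacity(spec)
        print(f"{spec.name}: {by_cpu} VMs by CPU, {by_ram} VMs by RAM ({limit})")
```

With these assumptions, the 384GB and 512GB two socket options come out CPU-bound at 64 VMs per host, so for the average VM the extra RAM would only be used if CPU overcommitment went beyond the 400% target. The four socket and single socket options are balanced like the chosen configuration, and differ mainly in failure domain size and cluster size as discussed in the justification.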

Implications

1. Additional IP addresses for ESXi Management, vMotion, FT & Out of band management will be required as compared to a solution using larger hosts

2. Additional out of band management cabling will be required as compared to a solution using larger hosts

Related Articles

1. Example Architectural Decision – Network I/O Control for ESXi Host using IP Storage (4 x 10 GB NICs)

2. Example VMware vNetworking Design w/ 2 x 10GB NICs

3. Network I/O Control Shares/Limits for ESXi Host using IP Storage

4. VMware Clusters – Scale up for Scale out?

5. Jumbo Frames for IP Storage (Do not use Jumbo Frames)

6. Jumbo Frames for IP Storage (Use Jumbo Frames)
