The balancing act of choosing a solution based on Requirements & Budget Constraints

I am frequently asked how to architect solutions when you have constraint/s which prevent you meeting requirements. It is not unusual for customers to have an expectation that they need and can get the equivalent of a Porsche 911 turbo for the price of a Toyota Corolla.

It’s also common for less experienced architects to focus straight away on budget constraints before going through a reasonable design phase and addressing requirements.

If you are constrained to the point you cannot afford a solution that comes close to meeting your requirements, the simple fact is, that customer is in pretty serious trouble no manner how you look at it. But this is rarely the case in my experience.

Typically customers simply need to sit down with an experienced architect and go through what business outcomes they want to achieve. Then the architect will ask a range of questions to help clarify the business goals and translate those into clearly defined Requirements.

I also find customers frequently think they need (or have been convinced by a vendor) that they need more than what they do to achieve the outcomes, e.g.: Thinking you need Active/Active datacenters with redundant 40Gb WAN links when all you need is VM High Availability and async rep to DR on 1Gb WAN links.

So my point here is always start with the following:

  • Step 1: What is the business problem/s the customer is trying to solve.

Until the customer, VAR , Solution Architect and vendor/s understand this and it is clearly documented, do not proceed any further. Without this information, and a clear record of what needs to be achieved, the project will likely fail.

At this stage its important to also understand any constraints such as Project Timelines, CAPEX budget & OPEX budget.

  • Step 2: Research potential solutions

Once you have a clear understanding of the problem, desired outcome / requirements, then its time for you to research (or engage a VAR to do this on your behalf) what potential products could provide a solution that addresses the business outcome, requirements etc.

  • Step 3: Provide VAR and/or Vendors detailed business problem/s, requirements and constraints.

This is where the customer needs to take some responsibility. A VAR or Vendor who is kept in the dark is unlikely to be able to deliver anything close to the desired outcome. While customers don’t like to give out information such as budget, without it, it just creates more work for everyone, and ultimately drives up the cost to the VAR/vendor which in turn gets passed onto customers.

With a detailed understanding of the desired business outcomes, requirements and constraints the VAR/vendor should provide a high level indicative solution proposal with details on CAPEX and ideally OPEX to show a TCO.

In many cases the lowest CAPEX solution has the highest OPEX which can mean a significantly higher TCO.

cheaper

  • Step 4: Customer evaluates High level Solution Proposal/s

The point here is to validate if the proposed solution will provide the desired business outcome and if it can do so within the budget. At this stage it is importaint to understand how the solution meets/exceeds each requirement and being able to trace a design back to the business outcomes. This is why it’s critical to document the requirements so the high level design (and future detailed design) can address each criteria.

If the proposed solution/s do meet/exceed the business requirements, then the question is, Do they fall within the allocated budget?

If so, great! Choose your preferred vendor solution and proceed.

If not, then a proposal needs to be put to the business for additional budget. If that is approved, again great and you can proceed.

If additional CAPEX/OPEX cannot be obtained then this is where the balancing act really gets interesting and an experienced architect will be of great value.

  • Step 5: Reviewing Business outcomes/requirements (and prioritise requirements!)

So lets say we have three requirements, R001, R002 and R003. All are importaint, but the simple fact at this point is the budget is insufficient to deliver them all.

This is where a good solution architect sets the customers expectations as a trusted adviser. The expectation needs to be clearly set that the budget is insufficient to deliver all the desired business outcomes and the priorities need to be set.

Sit with the customer and put all requirements into priority order, then its back and forth with the vendors to provide a lower cost (CAPEX/OPEX or both) detailing the requirement which have and have not been met and any/all implications.

In my opinion the key in these situations is not to just “buy the best you can” but to document in detail what can and cannot be achieved with detailed explanations on the implications. For example, an implication might be the RPO goes from 1hr to 8hrs and the RTO for a business critical application extended from 1hr to 4hrs. In day to day operations the customer would likely not know the difference, but if/when an outage occurred, they would know and need to be prepared. The cost of a single outage can in many cases cost much more than the solution itself, which is why its critical to document everything for the customer.

In my experience, where the implications are significant and clearly understood by the customer (at both a technical and business level), it is not uncommon for customers to revisit the budget and come up with additional funds for the project. It is the job of the Solution Architect (who should always be the customer advocate) to ensure the customer understands what the solution can/cannot deliver.

Where the customer understands the implications (a.k.a Risks), it is importaint that they sign off on the risks prior to going with a solution that does not meet all the desired outcomes. The risks should be clearly documented ideally with examples of what could happen in situations such as failure scenarios.

Basic Flowchart of the above described process:

The below is a basic flowchart showing a simplified process of gathering business requirements/outcomes through to an outcome.

Starting in the top left, we have the question “Are the business problem/s you are trying to solve and the requirements documented?

From there its a follow the bouncing ball until we come to the dreaded “Is the solution/s within the budget constraints?”. This area is coloured grey for a reason, as this is where the balancing act occurs and delays can happen as indicated with the following shape DelayIconGreay.

Once the customer fully understand what they are or are not getting AND IT IS FULLY DOCUMENTED, then and only then should a customer (or VAR/Architect recommend too) proceed with a proposal.

Flowchart01

The above is not an exhausting process, but just something to inspire some thought next time you or your customers are constrained by budget.

Summary:

  • Don’t just “Buy what you can afford and hope for the best”
  • Document how requirements are (or are not) being met (That’s “Traceability” for all you VCDX/NPX candidates)
  • Document any risks/implications and mitigation strategies
  • Get sign off on any requirements which is not met
  • Always provide a customer with options to choose from, don’t assume they wont invest more to address risks to the business
  • If you can’t meet all requirements Day 1, ensure the design is scalable. A.k.a Start small and scale if/when required.
  • Avoid low cost solutions that will likely end up having to be thrown out

Related Articles:

Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 1 – Introduction

Before I go into the details of why Acropolis Hypervisor (AHV) is the next generation of hypervisor, I wanted to quickly cover what the Xtreme Computing Platform is made up of and clarify the product names which will be discussed in this series.

In the below picture we can see Prism which is a HTML 5 based user interface sits on top of Acropolis which is a Distributed Storage and Application Mobility across multi-hypervisors and public clouds.

At the bottom we can see the currently support hardware platforms from Supermicro and Dell (OEM) but recently Nutanix has announced an OEM with Lenovo which expands customer choice further.

Please do not confuse Acropolis with Acropolis Hypervisor (AHV) as these are two different components, Acropolis is the platform which can run vSphere, Hyper-V and/or the Acropolis Hypervisor which will be referred to in this series as AHV.
nutanixxcp2

I want to be clear before I get into the list of why AHV is the next generation hypervisor that Nutanix is a hypervisor and cloud agnostic platform designed to give customers flexibility & choice.

The goal of this series is not trying to convince customers who are happy with their current environment/s to change hypervisors.

The goal is simple, to educate current and prospective customers (as well as the broader market) about some of the advantages / values of AHV which is one of the hypervisors (Hyper-V, ESXi and AHV) supported on the Nutanix XCP.

Here are my list of reasons as to why the Nutanix Xtreme Computing Platform based on AHV is the next generation hypervisor/management platform and why you should consider the Nutanix Xtreme Computing Platform (with Acropolis Hypervisor a.k.a AHV) as the standard platform for your datacenter.

Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor

Part 2 – Simplicity
Part 3 – Scalability
Part 4 – Security
Part 5 – Resiliency
Part 6 – Performance
Part 7 – Agility (Time to Value)
Part 8 – Analytics (Performance & Capacity Management)
Part 9 – Functionality (Coming Soon)
Part 10 – Cost

NOTE:  For a high level summary of this series, please see the accompanying post by Steve Kaplan, VP of Client Strategy at Nutanix (@ROIdude)

MS Exchange on Nutanix Acropolis Hypervisor (AHV)

While Virtualization of MS Exchange is now common across multiple hypervisors it continues to be a hotly debated topic. The most common objections being cost (CAPEX), the next being complexity (which translates to CAPEX & OPEX) and the third being that virtualization adds minimal value as MS Exchange provides application level high availability. The other objection I hear is Virtualization isn’t supported, which always makes me laugh.

In my experience, the above objections are typically given in the context of a dedicated MS Exchange environment, which in that specific context some of the points have some truth, but the question becomes, how many customers run only MS Exchange? In my experience, None.

Customers I see typically run tens, hundreds even thousands of workloads in their datacenters so architecting silos for each application is what actually leads to cost & complexity when we think outside the box.

Since most customers have virtualization and want to remove silos in favour of a standarized platform, MS Exchange is just another Business Critical Application which needs to be considered.

Let’s discuss each of the common objections and how I believe Acropolis + Nutanix XCP addresses these challenges:

Microsoft Support for Virtualization

For some reason, there is a huge amount of FUD regarding Microsoft support for Virtualization (other than Hyper-V), but Nutanix + Acropolis is certified under the Microsoft Server Virtualization Validation Program (SVVP) and runs on block storage via iSCSI protocol, so Nutanix + Acropolis is 100% supported for MS Exchange as well as other workloads like Sharepoint & SQL.

Cost (CAPEX)

Unlike other hypervisors and management solutions, Acropolis and Acropolis Hypervisor (AHV) come free with every Nutanix node which eliminates the licensing cost for the virtualization layer.

Acropolis management components also do not require purchase or installation of Tier 1 database platforms, all required management components are built into the distributed platform and scaled automatically as clusters are expanded. As a result, even licenses for Windows operating system are not required.

As a result, Nutanix + Acropolis gives Exchange deployments all the Virtualization features (below) which provide benefits at no cost.

  • High Availability & Live Migration
  • Hardware abstraction
  • Performance monitoring
  • Centralized management

Complexity (CAPEX & OPEX)

Nutanix XCP + Acropolis can be deployed in a fully optimal configuration from out of the box to operational in less than 60 minutes. This includes all required management components which are automatically deployed as part of the Nutanix Controller VM (CVM). For single cluster environments, no design/installation is required for any management components, and for multiple-cluster environments, only a single virtual appliance (PRISM Central) is required for single pane of glass management across all clusters.

Acropolis gives Exchange deployments all the advantages of Virtualization without:

  • Complexity of deploying/maintaining of database server/s to support management components
  • Deployment of dedicated management clusters to house management workloads
  • Having onsite Subject Matter Experts (SMEs) in Virtualization platform/s

Virtualization adds minimal value

While applications such as Exchange have application level high availability, Virtualization can further improve resiliency and flexibility for the application while making better use of infrastructure investments.

The Nutanix XCP including Acropolis + Acropolis Hypervisor (AHV) ensures infrastructure is completely abstracted from the Operating System and Application allowing it to deliver a more highly available and resilient platform.

Microsoft advice is to limit the maximum compute resources per Exchange server to 24 CPU cores and 96GB RAM. However with CPU core counts continuing to increase, this may result in larger numbers of servers being purchased and maintained where an application specific silo is deployed. This would lead to increased datacenter and licensing costs not to mention operational overhead of managing more infrastructure. As a result, being able to run Exchange alongside other workloads in a mixed environment (where contention can easily be avoided) reduces the total cost of infrastructure while providing higher levels of availability to all workloads.

Virtualization allows Exchange servers to be sized for the current workload and resized quickly and easily if/when required which ensures oversizing is avoided.

Some of the benefits include:

  • Minimizing infrastructure in the datacenter
  • Increasing utilization and therefore value for money of infrastructure
  • Removal of application specific silos
  • Ability to upgrade/replace/performance maintenance on hardware with zero impact to application/s
  • Faster deployment of new Exchange servers
  • Increase availability and provide higher fault tolerance
  • Self-healing capabilities at the infrastructure layer to compliment application level high availability
  • Ability to increase Compute/Storage resources beyond that of the current underlying physical server (Nutanix node) e.g.: Add storage capacity/performance

The Nutanix XCP Advantages (for Exchange)

  • More usable capacity

With features such as In-Line compression giving between 1.3:1 and 1.7:1 capacity savings & Erasure Coding providing up to a further 60% usable capacity, Nutanix XCP can provide more usable capacity than RAW while providing protection from SSD/HDD and entire server failures.

In-Line compression also improved performance of the SATA drives, so its a Win/Win. Erasure coding (EC-X) stores data in a more efficient manner which allows more data to be served from the SSD tier, also a Win/Win.

  • More Messages/Day and/or Users per physical CPU core

With all Write I/O serviced by SSD the CPU WAIT time is significantly reduced which frees up the physical CPU to perform other activities rather than waiting for a slow SATA drive to respond. As MS Exchange is CPU intensive (especially from 2013 onwards) this means more Messages per Day and/or Users can be supported per MSR VM compared to physical servers.

  • Better user experience

As Nutanix XCP is a hybrid platform (SSD+SATA), newer/hotter data is serviced by the SSD tier which means faster response times for users AND less CPU WAIT which also helps further increase CPU efficiencies, again leading to more Messages/Day and/or Users per CPU core.

Summary:

With Cost (CAPEX), Complexity (CAPEX & OPEX) and supportability issues well and truly addressed and numerous clear value adds, running a business critical application like MS Exchange on Nutanix + Acropolis Hypervisor (AHV) will make a lot of sense for many customers.