The balancing act of choosing a solution based on Requirements & Budget Constraints

I am frequently asked how to architect solutions when you have constraint/s which prevent you meeting requirements. It is not unusual for customers to have an expectation that they need and can get the equivalent of a Porsche 911 turbo for the price of a Toyota Corolla.

It’s also common for less experienced architects to focus straight away on budget constraints before going through a reasonable design phase and addressing requirements.

If you are constrained to the point you cannot afford a solution that comes close to meeting your requirements, the simple fact is, that customer is in pretty serious trouble no manner how you look at it. But this is rarely the case in my experience.

Typically customers simply need to sit down with an experienced architect and go through what business outcomes they want to achieve. Then the architect will ask a range of questions to help clarify the business goals and translate those into clearly defined Requirements.

I also find customers frequently think they need (or have been convinced by a vendor) that they need more than what they do to achieve the outcomes, e.g.: Thinking you need Active/Active datacenters with redundant 40Gb WAN links when all you need is VM High Availability and async rep to DR on 1Gb WAN links.

So my point here is always start with the following:

  • Step 1: What is the business problem/s the customer is trying to solve.

Until the customer, VAR , Solution Architect and vendor/s understand this and it is clearly documented, do not proceed any further. Without this information, and a clear record of what needs to be achieved, the project will likely fail.

At this stage its important to also understand any constraints such as Project Timelines, CAPEX budget & OPEX budget.

  • Step 2: Research potential solutions

Once you have a clear understanding of the problem, desired outcome / requirements, then its time for you to research (or engage a VAR to do this on your behalf) what potential products could provide a solution that addresses the business outcome, requirements etc.

  • Step 3: Provide VAR and/or Vendors detailed business problem/s, requirements and constraints.

This is where the customer needs to take some responsibility. A VAR or Vendor who is kept in the dark is unlikely to be able to deliver anything close to the desired outcome. While customers don’t like to give out information such as budget, without it, it just creates more work for everyone, and ultimately drives up the cost to the VAR/vendor which in turn gets passed onto customers.

With a detailed understanding of the desired business outcomes, requirements and constraints the VAR/vendor should provide a high level indicative solution proposal with details on CAPEX and ideally OPEX to show a TCO.

In many cases the lowest CAPEX solution has the highest OPEX which can mean a significantly higher TCO.

cheaper

  • Step 4: Customer evaluates High level Solution Proposal/s

The point here is to validate if the proposed solution will provide the desired business outcome and if it can do so within the budget. At this stage it is importaint to understand how the solution meets/exceeds each requirement and being able to trace a design back to the business outcomes. This is why it’s critical to document the requirements so the high level design (and future detailed design) can address each criteria.

If the proposed solution/s do meet/exceed the business requirements, then the question is, Do they fall within the allocated budget?

If so, great! Choose your preferred vendor solution and proceed.

If not, then a proposal needs to be put to the business for additional budget. If that is approved, again great and you can proceed.

If additional CAPEX/OPEX cannot be obtained then this is where the balancing act really gets interesting and an experienced architect will be of great value.

  • Step 5: Reviewing Business outcomes/requirements (and prioritise requirements!)

So lets say we have three requirements, R001, R002 and R003. All are importaint, but the simple fact at this point is the budget is insufficient to deliver them all.

This is where a good solution architect sets the customers expectations as a trusted adviser. The expectation needs to be clearly set that the budget is insufficient to deliver all the desired business outcomes and the priorities need to be set.

Sit with the customer and put all requirements into priority order, then its back and forth with the vendors to provide a lower cost (CAPEX/OPEX or both) detailing the requirement which have and have not been met and any/all implications.

In my opinion the key in these situations is not to just “buy the best you can” but to document in detail what can and cannot be achieved with detailed explanations on the implications. For example, an implication might be the RPO goes from 1hr to 8hrs and the RTO for a business critical application extended from 1hr to 4hrs. In day to day operations the customer would likely not know the difference, but if/when an outage occurred, they would know and need to be prepared. The cost of a single outage can in many cases cost much more than the solution itself, which is why its critical to document everything for the customer.

In my experience, where the implications are significant and clearly understood by the customer (at both a technical and business level), it is not uncommon for customers to revisit the budget and come up with additional funds for the project. It is the job of the Solution Architect (who should always be the customer advocate) to ensure the customer understands what the solution can/cannot deliver.

Where the customer understands the implications (a.k.a Risks), it is importaint that they sign off on the risks prior to going with a solution that does not meet all the desired outcomes. The risks should be clearly documented ideally with examples of what could happen in situations such as failure scenarios.

Basic Flowchart of the above described process:

The below is a basic flowchart showing a simplified process of gathering business requirements/outcomes through to an outcome.

Starting in the top left, we have the question “Are the business problem/s you are trying to solve and the requirements documented?

From there its a follow the bouncing ball until we come to the dreaded “Is the solution/s within the budget constraints?”. This area is coloured grey for a reason, as this is where the balancing act occurs and delays can happen as indicated with the following shape DelayIconGreay.

Once the customer fully understand what they are or are not getting AND IT IS FULLY DOCUMENTED, then and only then should a customer (or VAR/Architect recommend too) proceed with a proposal.

Flowchart01

The above is not an exhausting process, but just something to inspire some thought next time you or your customers are constrained by budget.

Summary:

  • Don’t just “Buy what you can afford and hope for the best”
  • Document how requirements are (or are not) being met (That’s “Traceability” for all you VCDX/NPX candidates)
  • Document any risks/implications and mitigation strategies
  • Get sign off on any requirements which is not met
  • Always provide a customer with options to choose from, don’t assume they wont invest more to address risks to the business
  • If you can’t meet all requirements Day 1, ensure the design is scalable. A.k.a Start small and scale if/when required.
  • Avoid low cost solutions that will likely end up having to be thrown out

Related Articles:

Enterprise Architecture & Avoiding tunnel vision.

Recently I have read a number of articles and had several conversations with architects and engineers across various specialities in the industry and I’m finding there is a growing trend of SMEs (Subject Matter Experts) having tunnel vision when it comes to architecting solutions for their customers.

What I mean by “Tunnel Vision” is that the architect only looks at what is right in front of him/her (e.g.: The current task/project) , and does not consider the implications of how the decisions being made for this task may impact the wider I.T infrastructure and customer from a commercial / operational perspective.

In my previous role I saw this all to often, and it was frustrating to know the solutions being designed and delivered to the customers were in some cases quite well designed when considered in isolation, but when taking into account the “Big Picture” (or what I would describe as the customers overall requirements) the solutions were adding unnecessary complexity, adding risk and increasing costs, when new solutions should be doing the exact opposite.

Lets start with an example;

Customer “ACME” need an enterprise messaging solution and have chosen Microsoft Exchange 2013 and have a requirement that there be no single points of failure in the environment.

Customer engages an Exchange SME who looks at the requirements for Exchange, he then points to a vendor best practice or reference architecture document and says “We’ll deploy Exchange on physical hardware, with JBOD & no shared storage and use Exchange Database Availability Groups for HA.”

The SME then attempts to justify his recommendation with “because its Microsoft’s Best practice” which most people still seem to blindly accept, but this is a story for another post.

In fairness to the SME, in isolation the decision/recommendation meets the customers messaging requirements, so what’s the problem?

If the customers had no existing I.T and the messaging system was going to be the only I.T infrastructure and they had no plans to run any other workloads, I would say the solution proposed could be a excellent solution, but how many customers only run messaging? In my experience, none.

So lets consider the customer has an existing Virtual environment, running Test/Dev, Production and Business Critical applications and adheres to a “Virtual First” policy.

The customer has already invested in virtualization & some form of shared storage (SAN/NAS/Web Scale) and has operational procedures and expertises in supporting and maintaining this environment.

If we were to add a new “silo” of physical servers, there are many disadvantages to the customer including but not limited too;

1. Additional operational documentation for new Physical environment.

2. New Backup & Disaster Recovery strategy / documentation.

3. Additional complexity managing / supporting a new Silo of infrastructure.

4. Reduced flexibility / scalability with physical servers vs virtual machines.

5. Increased downtime and/or impact in the event hardware failures.

6. Increased CAPEX due to having to size for future requirements due to scaling challenges with physical servers.

So what am I getting at?

The cost of deploying the MS Exchange solution on physical hardware could potentially be cheaper (CAPEX) Day 1 than virtualizing the new workload on the existing infrastructure (which likely needs to be scaled e.g.: Disk Shelves / Nodes) BUT would likely result overall higher TCO (Total Cost of Ownership) due to increased complexity & operational costs due to the creation of a new silo of resources.

Both a physical or virtual solution would likely meet/exceed the customers basic requirement to serve MS Exchange, but may have vastly different results in terms of the big picture.

Another example would be a customer has a legacy SAN which needs to be replaced and is causing issues for a large portion of the customers workloads, but the project being proposed is only to address the new Enterprise messaging requirements. In my opinion a good architect should consider the big picture and try to identify where projects can be combined (or a projects scope increased) to ensure a more cost effective yet better overall result for the customer.

If the architect only looked at Exchange and went Physical Servers w/ JBOD, there is zero chance of improvement for the rest of the infrastructure and the physical equipment for Exchange would likely be oversized and underutilized.

It will in many cases be much more economical to combine two or more projects, to enable the purchase of a new technology or infrastructure components and consolidate the workloads onto shared infrastructure rather than building two or more silo’s which add complexity to the environment, and will likely result in underutilized infrastructure and a solution which is inferior to what could have been achieved by combining the projects.

In conclusion, I hope that after reading this article, the next time you or your customers embark on a new project, that you as the Architect, Project Manager, or Engineer consider the big picture and not just the new requirement and ensure your customer/s get the best technical and business outcomes and avoid where possible the use of silos.

VCDX Defence Essentials – Part 2 – Preparing for the Design Scenario

Following on from Part 1 – Preparing for the Design Defence, Part 2 covers my tips for the Design Scenario part of the VCDX defence.

After a short break following your 75min Design defence, your neck deep in the Design Scenario. You are presented with a scenario which you need to demonstrate your abilities to gather requirements and while you will not be able to complete a design in 30mins, you should be able to demonstrate the methodology you use to start the process.

As mentioned in Part 1, I am not a official panellist and I do not know how the scoring works. The below is my advice based on conducting mock panels, the success rate of candidates I have conducted mock panels with and my successfully achieving VCDX on the 1st attempt.

Common Mistakes

1. Not gathering and identifying requirements/constraints & risks

The design scenario is very high level, and does not provide you with all the information required to be able to properly start a design. Not identifying and clarifying requirements/constraints and risks will in most cases prevent a candidate from successfully being able to start the design process.

Note: The word “Start” is underlined! You can’t start a design without knowing what your designing for… so don’t make this mistake.

2. Not documenting the requirements/constraints & risks

Assuming you have not made Mistake #1, and you have gathered and clarified the requirements/constraints & risks, the next mistake is not to write them down. I have seen many candidates do an excellent job of gathering the information, to then fall in a heap because they waste time asking the same questions over again because the have forgotten the details.

30 mins is not a long time, you cannot afford to waste time repeating questions.

3. Going down a rabbit hole

I have observed many candidates who are clearly very knowledgeable, who have spent 10-15 mins talking about one topic, such as HA and going into admission control options and pros/cons, isolation response etc. They demonstrated lots of expertise, but this did not help getting as much progress as possible into a design within the time constraint.

The design may be excellent in one key area (eg: HA) but severely lacked in all other areas, which would certainly led to a low score in the design scenario.

4. Not adjusting to changes

The information given to you in the design scenario may not always be correct and may even change half way through the design. Just like in a customer meeting, the customer doesn’t always know the answers to your questions, and may give you an incorrect answer, or simply not know the answer, then later on, realise they gave you incorrect information and correct themselves.

I deliberately throw curve-balls into mock design scenarios and I have observed several times a candidate be say 25 mins into the design scenario and this happens, and they failed to adjust for whatever reason/s.

5. Being Mute!

I have seen candidates who stand starring at the whiteboard, or drawing away madly, while completely mute. Then after 5-10 mins of drawing/thinking candidates then talk about what they came up with.

Do you stand in customer meetings mute? No! (Well, you shouldn’t!)

 

Tips for the Design Scenario

1. Clarify the Requirements/Constraints

Start by clarifying the information that has been provided to you. The information provided may be contradictory, so get this sorted before going any further.

2. Write the requirements/constraints & risks on the Whiteboard

Once you have clarified a piece of information that has been provided to you, write it on the whiteboard under section heading, such as:

a) Requirements

b) Constraints

c) Risks

d) Assumptions

Now, you can quickly review these items, without having to remember everything and if a curve-ball is thrown at you, you can cross out the incorrect information and write down the correct info and this may assist you modifying your design to cater for the changed requirement/constraint etc.

As you work through the scenario, you may be able to clarify an assumption, so you can remove it as an assumption/risk, this shows your working towards a quality outcome.

3. Write down your decisions!

Ensure you address each of the key areas of a vSphere solution by writing on the whiteboard headings like the following:

a) Storage

b) Networking

c) Compute

d) Availability

e) Datacenter

Ensure you write down at least 3 items per section, so you are covering off the entire environment.

As you make a design choice, write it down, eg: under storage, you may be recommending or constrained to use iSCSI, so write it down. iSCSI / Block storage.

So, aim to have 5 section headings like the above examples, and at  least 3 items per heading by the end of 30 mins. If you do the math, that’s only 6 mins per section, or 2 mins per item so make them count.

eg: Availability does not just mean N+1 vSphere cluster, what about say, environmental items such as UPS? A successful VCDX level design is not just about vSphere.

4. Verbalize your thought process.

I cannot give you advise, if I don’t know what your thinking! Same with the panellists, they can’t score you if you don’t verbalize your thought process.

No matter what, keep thinking out loud, if your working through options in your mind, that’s what the panel want’s to hear, so let them hear it!

If you are mute for a large portion of the 30 mins, the lower the chances you have of increasing your score.

5. Show how you adjust to changes in requirements/constraints/assumptions!

As a VCDX candidate, your most likely an architect day to day, so you would have dealt with this many times in real life, so deal with it in the design scenario!

If your 25 mins into the design scenario, and the panel suddenly tells you the CIO went out drinking on the weekend with his new buddy at storage vendor X and decided to scrap the old vendors storage and go for another vendor, deal with it!

Talk about the implications of moving from vendor X to vendor Y, for example FC to NFS and how this would change the design and would it still meet the requirements or would it be a risk?

6. Don’t be afraid to draw diagrams – but don’t spend all day making it pretty!

Use the whiteboard to draw your solution as it develops, but don’t waste time drawing fancy diagrams. A square box with ESXi written in it, is a Host, it doesn’t need to be pretty.

eg: If your drawing a 16 node cluster, draw three squares, Labelled ESXi01, ESXi…. and ESXi16, don’t draw 16 boxes, this adds no value, wastes time, and makes the diagram harder to draw.

 

Summary

I hope the above tips help you prepare for the VCDX design scenario and best of luck with your VCDX journey. For those who are interested, you can read about My VCDX Journey.

In Part 3, I will go through Preparing for the Troubleshooting Scenario, and how to maximize your 15 mins.