Rule of Thumb: Sizing for Storage Performance in the new world.

In the new world, where storage performance is decoupled from capacity thanks to new read/write caching and Hyper-Converged solutions, I always get asked:

How do I size the caching or Hyper-Converged solution to ensure I get the storage performance I need?

Obviously I work for Nutanix, so this question comes from prospective or existing Nutanix customers, but it's also relevant to other products in the market, such as PernixData or any hybrid (SSD+SAS/SATA) solution.

So for indicative sizing (i.e. pre-sales), where definitive information is not available and/or you cannot conduct a detailed assessment, I use the following simple rule of thumb.

Take your last two monthly full backups, calculate the delta between them, and multiply that by 3.

So if my full backup from August was 10TB and my full backup from September is 11TB, my delta is 1TB. I then multiply that by 3 to get 3TB, which is our assumption of the “Active Working Set”, or in basic terms, the data which needs performance. (Cold or inactive data can sit on any tier without causing performance issues.)

Now I size my SSD tier for 3TB of usable capacity.
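To make the arithmetic concrete, here is a minimal sketch of the rule in Python. The function name and defaults are purely illustrative (not part of any product), and it assumes the delta between two monthly full backups approximates the monthly rate of change.

```python
# Minimal sketch of the rule of thumb above (illustrative only).
# Assumes the delta between two monthly full backups approximates the
# monthly rate of change, and uses the 3x multiplier discussed below.

def estimate_ssd_tier_tb(previous_full_tb, latest_full_tb, multiplier=3):
    """Estimate usable SSD tier capacity (TB) from two monthly full backups."""
    delta_tb = latest_full_tb - previous_full_tb
    if delta_tb <= 0:
        # A flat or shrinking backup suggests deletions; see caveat 1 below.
        raise ValueError("Backup delta is not positive; average several months instead")
    return delta_tb * multiplier

# Worked example from the post: August 10TB, September 11TB.
print(estimate_ssd_tier_tb(10, 11))  # 1TB delta x 3 = 3 (TB of usable SSD)
```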

The next question is:

Why multiply the backup data delta by 3?

This is based on an assumption (since we don't have any hard data to go on) that the Read/Write ratio is 70% Read, 30% Write.

Now, those of you familiar with this thing called maths would argue that 70/30 is 2.3333, which is true. So rounding up to 3 essentially builds in a buffer.

I have found this rule of thumb works very well, and customers I have worked with have effectively had All Flash Array performance because the “Active Working Set” all resides within the SSD tier.

Caveats to this rule of thumb.

1. If a customer deletes a significant amount of data during the month, the delta may be smaller and result in an undersized SSD tier.

Mitigation: Review several months of full backup logs and average the delta (see the sketch after these caveats).

2. If the environment's Read/Write ratio is much more read-heavy than 70/30, then the backup delta multiplied by 3 may again result in an undersized SSD tier.

Mitigation: Perform some investigation into your most critical workloads and validate or correct the assumption of multiplying by 3 (again, see the sketch after these caveats).

3. This rule of thumb is for Server workloads, not VDI.

The VDI Read/Write ratio is generally almost the opposite of server workloads, at around 30/70 Read/Write. However, the SSD tier for VDI should be sized taking into account the benefits of VAAI/VCAI cloning and features like deduplication (for the Memory and SSD tiers) which some products, like Nutanix, offer.
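For caveats 1 and 2, here is a small extension of the same sketch: it averages the delta across several months of full backups and derives a multiplier from a validated Read/Write ratio. Deriving the multiplier as the read:write ratio rounded up (never below the default of 3) is just one way to interpret the 70/30 reasoning above, so treat it as a starting point rather than a formula. The backup sizes and the 80/20 ratio in the example are hypothetical.

```python
import math

# Extension of the sketch above for caveats 1 and 2 (illustrative only).

def average_monthly_delta_tb(monthly_full_backups_tb):
    """Average the month-to-month delta across a list of monthly full backup sizes (TB)."""
    deltas = [later - earlier for earlier, later
              in zip(monthly_full_backups_tb, monthly_full_backups_tb[1:])]
    return sum(deltas) / len(deltas)

def multiplier_from_ratio(read_pct, write_pct, floor=3):
    """Derive a multiplier from a validated read/write ratio, rounded up,
    and never below the default of 3 used in this post."""
    return max(floor, math.ceil(read_pct / write_pct))

# Hypothetical example: four months of full backups and a more read-heavy 80/20 workload.
fulls_tb = [9.2, 10.0, 10.6, 11.0]
delta_tb = average_monthly_delta_tb(fulls_tb)    # ~0.6 TB/month on average
print(delta_tb * multiplier_from_ratio(80, 20))  # ~0.6 x 4 = ~2.4 TB of usable SSD
```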

Summary / Disclaimer

This rule of thumb works for me 90% of the time when designing Nutanix solutions, but your results may vary depending on the platform you use.

I welcome any feedback or suggestions for alternative sizing strategies, and I will update the post where appropriate.

6 thoughts on “Rule of Thumb: Sizing for Storage Performance in the new world.”

  1. Really interesting article, thanks for sharing. But would comparing full backups not only give you new data that has been written? Would it not be more accurate to look at incremental backups as a guide to the rate of change? You could sum the incremental backups for a month and then multiply this by 3.

    • Thanks for the comment.

      I think if there is a lot of data deletion going on, daily incrementals as you suggested would likely be better.

      However I'd suggest this is a corner case, so checking the last couple of months' full backups and taking an average has, in my experience, worked very well.

      The other idea behind the rule of thumb is simplicity. In most cases around 2x is fine, but I always stay on the conservative side at the pre-sales stage.

  • Hi Manfred, there is an assumption of a 70/30 read/write ratio, which is taken into account in the 3x multiplier applied to the backup delta, so read I/O is factored into this sizing. Obviously it's not an exact science, and there will be workloads this rule is not applicable to, but in my experience it's a good rule of thumb, or put another way, a starting point for indicative sizing.