Example Architectural Decision – Guest OS Page File Storage in vSphere

Problem Statement

In a vSphere environment using deduplication and an array snapshot based backup solution, Guest OS page files are currently stored on the OS drive (VMDK). This reduces the effectiveness of deduplication and places an overhead on the storage controllers, which have to scan data that cannot be deduplicated.

As the Guest OS page files are included in the snapshot process (along with the Guest OS), additional capacity is also required on both primary and secondary disk storage for disk to disk backups.

How can this overhead be minimized or eliminated?

Requirements

1. Make the most efficient use of the available storage capacity
2. Maintain a consistent level of virtual machine / storage performance
3. Minimize the storage required for primary and secondary snapshot based backups
4. Maintain the array level snapshot based backup solution as it is required to meet RPO/RTOs
5. Maintain the use of deduplication, as this has proven to decrease storage requirements and improve performance

Assumptions

1. vSphere 5.0 or later
2. VMFS 5 Datastores which are Thin Provisioned
3. Deduplication is in use for Volumes where Guest OS virtual disks are stored
4. VAAI is supported by the array and enabled across the vSphere environment
5. All datastores are presented to all hosts within the cluster
6. Snapshot based backup solution is being used
7. Virtual Machines are right sized
8. Disk to disk backup data is replicated offsite

Constraints

1. None

Motivation

1. Optimize the storage performance
2. Ensure Tier 1 storage is not wasted with transient files
3. Minimize storage required for snapshot based backups

Architectural Decision

Separate OS page files onto a dedicated VMDK, which will be located on a datastore (or datastore cluster) that is:
1. Not Protected by the array level snapshot backup solution
2. Not running deduplication
3. Not running data compression
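
The three criteria above can be expressed as a simple placement check. The following is a minimal sketch only: the Datastore record, its field names and the example datastore names are hypothetical and are not part of any vSphere API, so the attributes would need to be populated from your own inventory or array tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    """Hypothetical inventory record; field names are illustrative only."""
    name: str
    snapshot_protected: bool  # covered by the array level snapshot backup
    deduplication: bool       # deduplication enabled on the backing volume
    compression: bool         # compression enabled on the backing volume


def pagefile_placement_issues(ds: Datastore) -> list[str]:
    """Return the reasons a datastore is unsuitable for page file VMDKs."""
    issues = []
    if ds.snapshot_protected:
        issues.append("protected by array level snapshots")
    if ds.deduplication:
        issues.append("deduplication enabled")
    if ds.compression:
        issues.append("compression enabled")
    return issues


# Example datastore names are made up for illustration.
datastores = [
    Datastore("tier1-os-01", True, True, False),
    Datastore("sata-pagefile-01", False, False, False),
]
for ds in datastores:
    issues = pagefile_placement_issues(ds)
    status = "OK" if not issues else "unsuitable: " + ", ".join(issues)
    print(f"{ds.name}: {status}")
```

An empty list means the datastore meets all three criteria of the decision.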

Justification

1. Allows page files to be stored on different underlying storage including (optionally) high capacity, lower cost, SATA disk
2. Relocating Guest OS page files to another datastore (or datastore cluster) not protected by snapshots dramatically reduces the amount of data being protected by the snapshot based backup solution
3. Reduces the amount of data being replicated to secondary disk backup location/s thus minimizing the bandwidth requirements between datacenters
4. (Optionally) Ensures Tier 1 storage is only used for high performance guests
5. As the Virtual Machines are right sized, the performance impact/frequency of paging should be minimal
6. Reduces the CPU cycles required for deduplication as data which cannot be deduplicated will not be scanned
7. Reduces the CPU cycles on the storage controllers by not attempting to compress page file data

Alternatives

1. Leave page files within the virtual machine's primary VMDK and accept the overhead on the backup solution
2. Turn off paging within the Guest OS (No Page File)

Implications

1. The additional steps of creating a dedicated VMDK for the VM and configuring the Guest OS to use the alternate location
2. Templates need to be updated to the above configuration
3. For environments using Site Recovery Manager, some manual steps are required when setting up protected virtual machines for the first time. This increases the work required during setup; however, as this is a one time overhead, it is believed the benefit of reduced backup storage and replication traffic (for SRM) outweighs it


Example Architectural Decision – Storage Protocol Choice for a VMware View Environment using Linked Clones

Problem Statement

What is the most suitable storage protocol for a Virtual Desktop (VMware View) environment using Linked Clones?

Assumptions

1. The Storage Array supports NFS native snapshot offload
2. VMware View 5.1 or later

Motivation

1. Minimize recompose (maintenance) window
2. Minimize impact on the storage array and HA/DRS cluster during recompose activities
3. Reduce storage costs where possible
4. Simplify the storage design eg: Number/size of Datastores / Storage Connectivity
5. Reduce the total solution cost eg: Number of Hosts required

Architectural Decision

Use Network File System (NFS)

Justification

1. Using native NFS snapshots (VAAI) offloads the creation of VMs to the array, therefore reducing the compute overhead on the ESXi hosts
2. Native NFS snapshots require much less disk space than traditional linked clones
3. Recomposition times are reduced due to the offloading of the cloning to the array
4. More virtual machines can be supported per NFS datastore compared to VMFS datastores (200+ for NFS compared to max recommended of 140, but it is generally recommended to design for much lower numbers eg: 64 per VMFS)
5. Recompositions/Refresh activities can be performed during business hours, or at Logoff (for Refresh) with minimal impact to the HA/DRS cluster, thus giving more flexibility to maintain the environment
6. Avoids potential VMFS locking issues – although this issue is not as important for environments using vSphere 4.1 onward with VAAI compatible arrays
7. When sizing your storage array, less capacity is required. Note: Performance sizing is also critical
8. The cost of a FC Storage Area Network can be avoided
9. Fewer ESXi hosts may be required as the compute overhead of driving cloning has been removed
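
The datastore density figures quoted in the justification above (200+ VMs per NFS datastore, a maximum recommended 140 per VMFS datastore, and a more conservative 64 per VMFS datastore) translate into datastore counts as sketched below. The desktop count of 1000 is an illustrative assumption, and 200 is used as a lower bound for NFS.

```python
import math


def datastores_required(desktops: int, vms_per_datastore: int) -> int:
    """Number of datastores needed to hold `desktops` at a given density."""
    return math.ceil(desktops / vms_per_datastore)


desktops = 1000  # illustrative environment size (assumption)
densities = [
    ("NFS (200 per datastore)", 200),
    ("VMFS (max recommended 140)", 140),
    ("VMFS (conservative 64)", 64),
]
for label, density in densities:
    print(f"{label}: {datastores_required(desktops, density)} datastores")
```

Fewer, larger datastores are one way the NFS option simplifies the storage design, although performance sizing still needs to be validated separately.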

Implications

1. In the current release, 5.1, View Storage Accelerator (formerly Content Based Read Cache or CBRC) is not supported when using Native NFS snapshots (VAAI)
2. Also in the current release, 5.1, “Use native NFS snapshots (VAAI)” is in “Tech Preview” – This is rumored to change in View 5.2

Alternatives

1. Use VMFS (block) based datastores and have more VMFS datastores – Note: Recompose activity will be driven by the host which adds an overhead to the cluster.