Choosing the Right Hardware for a VMware Cloud Foundation (VCF) 9.1 Deployment

Why Hardware Choice Matters More Than Ever in VCF 9.1

As VMware Cloud Foundation (VCF) 9.1 continues to evolve, prioritising AI-ready infrastructure and higher-density virtualisation, hardware selection and design decisions are becoming substantially more important. This is especially true because many of the most valuable capabilities introduced in VCF 9.1 are heavily dependent on modern hardware architectures, including:

vSAN Express Storage Architecture (ESA)
Memory Tiering
Auto-RAID
GPU-ready infrastructure
Disaggregated storage models

Memory Tiering Capacity and Hardware Requirements

As we outlined in our recent product update, Improve Efficiency with Memory Tiering, Memory Tiering provides an additional tier of memory at a substantially lower cost than DRAM. It suits a wide variety of use cases and greatly expands the potential consolidation ratio within a host.

Memory Tiering supports up to 4TB of usable capacity, available in either mirrored (high availability) or non-mirrored (non-HA) deployment configurations.

For production environments, a mirrored deployment is recommended. In this setup, two 3.84TB NVMe drives are typically used. Because the pair is mirrored, the usable capacity is that of a single drive, which is close to the maximum usable capacity of 4TB.

As a result, the hardware must include at least two drive bays to support Memory Tiering.

vSAN ESA Capacity Efficiency in VCF 9.1

vSAN is included with VCF licensing at 1TB per CPU core. Combined with the innovations in the Express Storage Architecture (ESA), this has made vSAN an attractive option for a wide range of use cases since ESA's introduction in vSphere 8.

Let’s do some basic sizing using two modest dual-socket ESXi host specifications with cost-effective 3.84TB NVMe devices, assuming we use Memory Tiering and provide enough physical drives to maximise the licensing value from vSAN at 1TB per core. The second table repeats the exercise with 7.68TB NVMe devices for vSAN.

Details	2 x 24c Processor Server	2 x 32c Processor Server
Total CPU Cores	48	64
vSAN Licensed Capacity @ 1TB/Core	48TB	64TB
DRAM	1TB	1TB
Memory Tiering NVMe Devices	2 x 3.84TB	2 x 3.84TB
Memory Tiering Usable Capacity	4TB	4TB
vSAN ESA NVMe Count	12 x 3.84TB	16 x 3.84TB
vSAN RAW Capacity	46.08TB	61.44TB
Total NVMe Drive Bays Required per Host	14	18

Details	2 x 24c Processor Server	2 x 32c Processor Server
Total CPU Cores	48	64
vSAN Licensed Capacity @ 1TB/Core	48TB	64TB
DRAM	1TB	1TB
Memory Tiering NVMe Devices	2 x 3.84TB	2 x 3.84TB
Memory Tiering Usable Capacity	4TB	4TB
vSAN ESA NVMe Count	6 x 7.68TB	8 x 7.68TB
vSAN RAW Capacity	46.08TB	61.44TB
Total NVMe Drive Bays Required per Host	8	10

As we can see, even with larger 7.68TB NVMe devices, the drive bay count required to maximise the Return on Investment (ROI) for vSAN and Memory Tiering is between 8 and 10 per host.
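The arithmetic behind the two tables can be sketched in a few lines of Python. This is a minimal illustration rather than an official sizing tool: the function name `size_host` and its parameters are hypothetical, and the rule of fitting as many whole drives as possible without exceeding the licensed capacity is our reading of the tables above.

```python
import math

def size_host(cores, drive_tb, tiering_drives=2, licensed_tb_per_core=1.0):
    """Return vSAN drive count, raw capacity and total drive bays for one host."""
    licensed_tb = cores * licensed_tb_per_core
    # Assumption: fit as many whole drives as possible without exceeding
    # the vSAN capacity licensed at 1TB per core.
    vsan_drives = math.floor(licensed_tb / drive_tb)
    return {
        "licensed_tb": licensed_tb,
        "vsan_drives": vsan_drives,
        "raw_tb": round(vsan_drives * drive_tb, 2),
        # Memory Tiering uses its own mirrored pair of NVMe devices.
        "bays": vsan_drives + tiering_drives,
    }

# Reproduce both tables: 48 and 64 cores, 3.84TB and 7.68TB devices.
for cores in (48, 64):
    for drive_tb in (3.84, 7.68):
        print(cores, drive_tb, size_host(cores, drive_tb))
```

Changing `drive_tb` or `cores` makes it easy to test other host specifications against the drive bays a given chassis actually offers.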

At this stage many of you may be thinking that blade servers are not going to provide the required number of drive bays to cater for a modern VCF 9.1 deployment.

And I’d agree.

Others will be asking “Why didn’t you simply use larger capacity NVMe devices?”

We can absolutely use larger capacity NVMe drives, although they are likely to come at a higher cost per GB. Putting that aside, for the same raw capacity we would need 3 x 15.36TB NVMe drives for the 2 x 24c server and 4 x 15.36TB NVMe drives for the 2 x 32c server.

Given our total drive bay requirement is now between 5 and 6, blades are now fine for Memory Tiering and vSAN ESA, right?

Not quite. While the capacity requirements are met, performance will be limited by the lower number of NVMe devices when compared with a rack mount server using 16 x 3.84TB NVMe drives, as in the earlier example.

vSAN Drive Failures and Business Continuity Considerations

What happens when an NVMe drive fails? vSAN performs an efficient rebuild operation and restores compliance with the Failures to Tolerate (FTT) level in the configured storage policy.

But with larger capacity NVMe devices, the impact of a single failure is much greater. In the above examples, a failed 3.84TB device represents 6.25% of the vSAN capacity in a rack mount server with 16 x 3.84TB drives, whereas a failed 15.36TB drive represents 33% of the storage in the 2 x 24c server example. Not only have we lost more capacity in the host, we also need to rebuild up to 4x the data.
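The percentages above follow from simple ratios, sketched below. The helper name `failure_impact` is hypothetical, the large device is assumed to be a 15.36TB drive (four times 3.84TB), and the rebuild multiple is an upper bound that assumes the failed drive was full.

```python
def failure_impact(drive_tb, drives_per_host, baseline_drive_tb=3.84):
    """Return (share of host vSAN capacity lost in %, rebuild data multiple)."""
    # With equal-sized drives, one failure removes 1/N of the host's capacity.
    host_share_pct = 100 / drives_per_host
    # Worst case: the failed drive was full, so the data to rebuild scales
    # with drive size relative to the 3.84TB baseline.
    rebuild_multiple = drive_tb / baseline_drive_tb
    return host_share_pct, rebuild_multiple

small = failure_impact(3.84, drives_per_host=16)   # rack mount, 16 x 3.84TB
large = failure_impact(15.36, drives_per_host=3)   # 2 x 24c host, 3 x 15.36TB

print(f"3.84TB drive:  {small[0]:.2f}% of host capacity, {small[1]:.0f}x rebuild")
print(f"15.36TB drive: {large[0]:.2f}% of host capacity, {large[1]:.0f}x rebuild")
```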

This rebuild will consume considerably more CPU cycles, network bandwidth and NVMe resources, and will take longer than rebuilding a smaller 3.84TB device. We also need to consider the effect of these backend tasks on the cluster, which will affect virtual machines even in a well-designed environment.

These are just some of the reasons why a larger number of smaller NVMe devices is more attractive, especially in production environments.

Rising Network Demands in VCF Environments

As we turn our attention to the ever-increasing networking demands of VCF environments, let’s consider a few factors:

Memory Tiering is enabling higher density of workloads
Higher density means larger total memory usage
Higher total memory usage means higher demand on cluster load balancing via DRS and on host evacuations during maintenance operations
Higher density of workloads means:
    More virtual machine network traffic, e.g.:
        Client – Server
        Server – Server
        Database – App – Web/Client
    More virtual machine storage operations, including:
        vSAN and external storage such as iSCSI/NFS
    More virtual machine backup operations
vSAN ESA delivers excellent performance, which requires East-West network communications for all write I/O and any non-local read I/O

Key Challenges in Blade Server Environments

Why are these factors such an issue in blade server environments? A blade chassis has a finite number of physical uplinks shared by all of its blades, which typically results in varying levels of network oversubscription. Rack mount servers avoid this chassis-level bottleneck because each server connects directly to the top-of-rack switches. Combined with the limited number of drive bays for the NVMe devices needed by Memory Tiering and vSAN, this makes blade servers and chassis less attractive.
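To make oversubscription concrete, the ratio is simply the aggregate bandwidth the blades can generate divided by the uplink bandwidth leaving the chassis. The sketch below uses purely hypothetical figures (16 blades, 2 x 25GbE per blade, 4 x 100GbE uplinks) that do not describe any specific vendor's chassis.

```python
def oversubscription_ratio(blades, ports_per_blade, port_gbps, uplinks, uplink_gbps):
    """Ratio of total blade-facing bandwidth to total chassis uplink bandwidth."""
    downstream_gbps = blades * ports_per_blade * port_gbps
    upstream_gbps = uplinks * uplink_gbps
    return downstream_gbps / upstream_gbps

# Hypothetical chassis: all values are illustrative assumptions.
ratio = oversubscription_ratio(
    blades=16, ports_per_blade=2, port_gbps=25,
    uplinks=4, uplink_gbps=100,
)
print(f"Oversubscription: {ratio:.1f}:1")
```

A ratio above 1:1 means that vSAN, vMotion, backup and virtual machine traffic from all blades compete for the same uplinks whenever that traffic has to leave the chassis.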

If we shift our focus to use cases requiring GPUs, blades also have more limitations than rack mount servers. The major advantages of blade servers are higher rack density, some power and cooling benefits, and fewer network ports. However, those density benefits would likely only apply to basic virtualisation and to workloads without high networking or storage requirements, which is an increasingly small percentage of workloads.

We recently conducted a detailed TCO/ROI assessment for a major managed service provider deploying VCF 9.0 with vSAN and Pure Storage. The assessment compared a wide range of factors, including blades versus rack mount servers. Despite factoring in as many benefits as possible for the blade chassis deployment, the result was strongly in favour of rack mount servers. The added storage capability from vSAN ESA gave them another tier of storage to complement the Pure Storage, which provided a very attractive ROI for the new server hardware. The client decided to adopt a rack mount form factor moving forward and to retire the blade chassis over time as they reached End of Life.

The following table summarises our recommended form factor for different deployment objectives.

Deployment Objective	Recommended Form Factor
Traditional Virtualisation	Blade or Rack Mount
vSAN Express Storage Architecture (ESA)	Rack Mount
Memory Tiering Only	Blade or Rack Mount
Memory Tiering & vSAN ESA	Rack Mount
AI-ready Infrastructure	Rack Mount
GPU workloads	Rack Mount
Networking intensive workloads	Rack Mount
Scalability (Capacity, Performance, Throughput)	Rack Mount
Business Critical Applications	Rack Mount
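For projects with several objectives at once, the table can be applied as a simple lookup: if any objective calls for rack mount, the combined recommendation is rack mount. The sketch below encodes the table as-is; the function name `recommend` and the combination rule are illustrative assumptions, not a formal decision tool.

```python
RACK = "Rack Mount"
EITHER = "Blade or Rack Mount"

# Recommendations copied from the table above.
RECOMMENDED = {
    "Traditional Virtualisation": EITHER,
    "vSAN Express Storage Architecture (ESA)": RACK,
    "Memory Tiering Only": EITHER,
    "Memory Tiering & vSAN ESA": RACK,
    "AI-ready Infrastructure": RACK,
    "GPU workloads": RACK,
    "Networking intensive workloads": RACK,
    "Scalability (Capacity, Performance, Throughput)": RACK,
    "Business Critical Applications": RACK,
}

def recommend(objectives):
    """Return the form factor that satisfies every listed objective."""
    unknown = [o for o in objectives if o not in RECOMMENDED]
    if unknown:
        raise KeyError(f"Objectives not in the table: {unknown}")
    # Assumption: a single rack-mount-only objective decides the outcome.
    if any(RECOMMENDED[o] == RACK for o in objectives):
        return RACK
    return EITHER

print(recommend(["Traditional Virtualisation", "Memory Tiering Only"]))
print(recommend(["Traditional Virtualisation", "GPU workloads"]))
```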

Rethinking Infrastructure Choices for Modern VCF 9.1 Environments

With VCF 9.1 deployments, the once popular blade servers are becoming far less attractive for a wide range of modern workloads and features. Architects must always critically evaluate the unique technical, operational, and business requirements of each project rather than assuming a previous design/preference remains the best choice.

Technologies, workloads, scalability requirements, operational models, and platform capabilities evolve rapidly, particularly in modern VCF 9.1 environments where capabilities such as vSAN ESA, Memory Tiering, AI infrastructure, and high-bandwidth networking fundamentally change traditional infrastructure design assumptions.

Successful business outcomes are built on architecture that is based on informed assessment and aligned to current requirements, not “what has worked in the past”. If you’re interested in discussing how you can optimise your VCF 9.1 environment for modern workloads, contact us at Zaleo Consulting.

