Why Hardware Choice Matters More Than Ever in VCF 9.1
As VMware Cloud Foundation (VCF) 9.1 continues to evolve, prioritising AI-ready infrastructure and higher-density virtualisation, hardware selection and design decisions are becoming substantially more important. This is especially true because many of the most valuable capabilities introduced in VCF 9.1 are heavily dependent on modern hardware architectures, including:
- vSAN Express Storage Architecture (ESA)
- Memory Tiering
- Auto-RAID
- GPU-ready infrastructure
- Disaggregated storage models
Memory Tiering Capacity and Hardware Requirements
As we outlined in a recent product update, Improve Efficiency with Memory Tiering, Memory Tiering provides a valuable additional tier of memory at a substantially lower cost than DRAM. It is suitable for a wide variety of use cases and greatly expands the potential consolidation ratio within a host.
Memory Tiering supports up to 4TB of usable capacity, available in either mirrored (high availability) or non-mirrored (non-HA) deployment configurations.
For production environments, a mirrored deployment is recommended. In this setup, two 3.84TB NVMe drives are typically used, providing close to the maximum usable capacity of 4TB.
As a result, each host must include at least two drive bays dedicated to Memory Tiering.
vSAN ESA Capacity Efficiency in VCF 9.1
vSAN is included with VCF licensing at 1TB per CPU core and, thanks to the innovations in the Express Storage Architecture (ESA), it has been an attractive option for a wide range of use cases since ESA was introduced in vSphere 8.
Let’s do some basic sizing using two modestly specified dual-socket ESXi hosts with cost-effective 3.84TB NVMe devices, assuming we use Memory Tiering and provide enough physical drives to maximise the licensing value from vSAN at 1TB/core.
| Details | 2 x 24c Processor Server | 2 x 32c Processor Server |
|---|---|---|
| Total CPU Cores | 48 | 64 |
| vSAN Licensed Capacity @ 1TB/Core | 48TB | 64TB |
| DRAM | 1TB | 1TB |
| Memory Tiering NVMe Devices | 2 x 3.84TB | 2 x 3.84TB |
| Memory Tiering Usable Capacity | 4TB | 4TB |
| vSAN ESA NVMe Count | 12 x 3.84TB | 16 x 3.84TB |
| vSAN RAW Capacity | 46.08TB | 61.44TB |
| Total NVMe Drive Bays Required per Host | 14 | 18 |
Here we see that a hardware configuration which maximises the value of VCF licensing by using vSAN and Memory Tiering requires between 14 and 18 drive bays per host.
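The arithmetic behind the table can be reproduced with a short script. This is a minimal sketch, not an official sizing tool: the function and field names are hypothetical, capacities are treated as decimal terabytes (1TB = 1000GB) as in the table, and the vSAN drive count is rounded down so raw capacity stays within the 1TB/core figure.

```python
from dataclasses import dataclass

GB_PER_TB = 1000  # assumption: decimal terabytes, matching the table above


@dataclass
class HostSizing:
    vsan_drives: int
    vsan_raw_gb: int
    total_bays: int


def size_host(cores: int, vsan_drive_gb: int, tiering_drives: int = 2) -> HostSizing:
    """Size one host so vSAN raw capacity stays within 1TB per CPU core."""
    licensed_gb = cores * GB_PER_TB
    # Round down: one more drive would exceed the included vSAN capacity.
    vsan_drives = licensed_gb // vsan_drive_gb
    return HostSizing(
        vsan_drives=vsan_drives,
        vsan_raw_gb=vsan_drives * vsan_drive_gb,
        # Mirrored Memory Tiering adds two more NVMe devices by default.
        total_bays=vsan_drives + tiering_drives,
    )


for cores in (48, 64):
    s = size_host(cores, vsan_drive_gb=3840)
    print(f"{cores} cores: {s.vsan_drives} x 3.84TB, "
          f"{s.vsan_raw_gb / GB_PER_TB:.2f}TB raw, {s.total_bays} bays")
# 48 cores: 12 x 3.84TB, 46.08TB raw, 14 bays
# 64 cores: 16 x 3.84TB, 61.44TB raw, 18 bays
```

Passing a different `vsan_drive_gb` value shows how the bay count changes with drive size while the raw capacity target stays the same.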
Now let’s do the same comparison with larger 7.68TB NVMe drives for vSAN, leaving the 3.84TB drives for Memory Tiering due to the 4TB usable capacity limitation.
| Details | 2 x 24c Processor Server | 2 x 32c Processor Server |
|---|---|---|
| Total CPU Cores | 48 | 64 |
| vSAN Licensed Capacity @ 1TB/Core | 48TB | 64TB |
| DRAM | 1TB | 1TB |
| Memory Tiering NVMe Devices | 2 x 3.84TB | 2 x 3.84TB |
| Memory Tiering Usable Capacity | 4TB | 4TB |
| vSAN ESA NVMe Count | 6 x 7.68TB | 8 x 7.68TB |
| vSAN RAW Capacity | 46.08TB | 61.44TB |
| Total NVMe Drive Bays Required per Host | 8 | 10 |
As we can see, even with larger 7.68TB NVMe devices, the drive bay count required to maximise the Return on Investment (ROI) for vSAN and Memory Tiering is between 8 and 10.
At this stage many of you may be thinking that blade servers are not going to provide the required number of drive bays to cater for a modern VCF 9.1 deployment.
And I’d agree.
Others will be asking “Why didn’t you simply use larger capacity NVMe devices?”
We can absolutely use larger capacity NVMe drives, although these are likely to come at a higher cost/GB. Putting that aside, for the same RAW capacity we would need 3 x 15.36TB NVMe drives for the 2 x 24c Server and 4 x 15.36TB NVMe drives for the 2 x 32c Server.
Given our total drive bay requirement is now between 5 and 6, blades are now fine for Memory Tiering and vSAN ESA, right?
Not quite. While the capacity requirements are being met, performance will be limited by the lower number of NVMe devices when compared to a rack mount server using 16 x 3.84TB NVMe drives in the earlier example.
vSAN Drive Failures and Business Continuity Considerations
What happens when an NVMe drive fails? vSAN performs an efficient rebuild operation and restores the Failures to Tolerate (FTT) level set in the configured storage policy.
But with larger capacity NVMe devices, the impact of a single failure is much greater. In the above examples, a failed 3.84TB device represents 6.25% of the storage in a rack mount server with 16 x 3.84TB drives, whereas a failed 15.36TB drive represents 33% of the storage in the 2 x 24c server example. Not only have we lost more capacity in the host, we also need to rebuild up to 4x the data.
This rebuild operation will use considerably more CPU cycles, network bandwidth and NVMe resources, and will obviously take longer than rebuilding a smaller 3.84TB device. We also need to consider the impact of these backend tasks on the cluster, which will affect virtual machines even in a well-designed environment.
These are just some of the reasons why a larger number of NVMe devices are more attractive especially in production environments.
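The failure percentages above follow from simple ratios. The sketch below is illustrative only: the function name is hypothetical, it assumes 15.36TB drives (4 x 3.84TB) for the large-drive case, and it treats a completely full drive as the worst case for the amount of data to rebuild.

```python
def failure_impact(drive_count: int, drive_gb: int) -> tuple[float, int]:
    """Return (share of host raw capacity lost, worst-case GB to rebuild)."""
    share_lost = 1 / drive_count  # all vSAN drives in the host are the same size
    return share_lost, drive_gb   # worst case: the failed drive was full


small_share, small_gb = failure_impact(drive_count=16, drive_gb=3840)
large_share, large_gb = failure_impact(drive_count=3, drive_gb=15360)

print(f"16 x 3.84TB host: {small_share:.2%} of capacity lost")
print(f"3 x 15.36TB host: {large_share:.2%} of capacity lost")
print(f"Rebuild data ratio: {large_gb / small_gb:.0f}x")
# 16 x 3.84TB host: 6.25% of capacity lost
# 3 x 15.36TB host: 33.33% of capacity lost
# Rebuild data ratio: 4x
```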
Rising Network Demands in VCF Environments
As we turn our attention to the ever-increasing networking demands of VCF environments, let’s consider a few factors:
- Memory Tiering is enabling higher density of workloads
- Higher density means larger total memory usage
- Higher total memory usage means higher demand on cluster load balancing via DRS and on host evacuations during maintenance operations
- Higher density of workloads means:
  - More Virtual Machine network traffic e.g.:
    - Client – Server
    - Server – Server
    - Database – App – Web/Client
  - More Virtual Machine storage operations including:
    - vSAN & External Storage such as iSCSI/NFS
  - More Virtual Machine backup operations
- vSAN ESA delivers excellent performance, which requires East-West network communications for all Write I/O and any non-local Read I/O
Key Challenges in Blade Server Environments
Why are these factors such an issue in blade server environments? Blade chassis have a finite number of physical connections, which typically results in varying levels of oversubscription. This doesn’t occur with rack mount servers. Networking oversubscription, together with the limited drive bays available for the NVMe devices that Memory Tiering and vSAN need, is a large part of what makes blade servers and chassis less attractive.
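Oversubscription is usually expressed as the ratio of aggregate server-facing bandwidth to chassis uplink bandwidth. The sketch below shows the calculation only; the blade count, NIC speeds and uplink count are hypothetical values chosen for illustration and do not describe any particular chassis.

```python
def oversubscription_ratio(servers: int, nics_per_server: int, nic_gbps: int,
                           uplinks: int, uplink_gbps: int) -> float:
    """Ratio of total server-facing bandwidth to total chassis uplink bandwidth."""
    downstream_gbps = servers * nics_per_server * nic_gbps
    upstream_gbps = uplinks * uplink_gbps
    return downstream_gbps / upstream_gbps


# Hypothetical chassis: 16 blades with 2 x 25GbE each, sharing 4 x 100GbE uplinks.
ratio = oversubscription_ratio(servers=16, nics_per_server=2, nic_gbps=25,
                               uplinks=4, uplink_gbps=100)
print(f"Oversubscription: {ratio:.1f}:1")
# Oversubscription: 2.0:1
```

A ratio above 1:1 means the blades can collectively generate more traffic than the uplinks can carry, which matters more as workload density and East-West storage traffic grow.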
If we shift our focus to use cases requiring GPUs, blades also have more limitations than rack mount servers. One of the major advantages of blade servers is that they can provide higher rack density and some power/cooling benefits, along with using fewer network ports. The density benefits of blades would likely only apply to basic virtualisation and to workloads which don’t have high networking/storage requirements, which is an increasingly small percentage of workloads.
We recently conducted a detailed TCO/ROI assessment for a major managed service provider deploying VCF 9.0 with vSAN and Pure Storage. The assessment compared a wide range of factors including blades vs rack mount servers. Despite factoring in as many benefits as possible for the blade chassis deployment, the result was strongly in favour of rack mount servers. The added storage capability from vSAN ESA gave them another tier of storage to complement the Pure Storage, which provided a very attractive ROI for the new server hardware. The client decided to go with a rack mount form factor moving forward and to retire the blade chassis over time as they went End of Life.
The following table summarises our recommended form factor for different deployment objectives.
| Deployment Objective | Recommended Form Factor |
|---|---|
| Traditional Virtualisation | Blade or Rack Mount |
| vSAN Express Storage Architecture (ESA) | Rack Mount |
| Memory Tiering Only | Blade or Rack Mount |
| Memory Tiering & vSAN ESA | Rack Mount |
| AI-ready Infrastructure | Rack Mount |
| GPU workloads | Rack Mount |
| Networking intensive workloads | Rack Mount |
| Scalability (Capacity, Performance, Throughput) | Rack Mount |
| Business Critical Applications | Rack Mount |
Rethinking Infrastructure Choices for Modern VCF 9.1 Environments
With VCF 9.1 deployments, the once-popular blade servers are becoming far less attractive for a wide range of modern workloads and features. Architects must always critically evaluate the unique technical, operational, and business requirements of each project rather than assuming a previous design or preference remains the best choice.
Technologies, workloads, scalability requirements, operational models, and platform capabilities evolve rapidly, particularly in modern VCF 9.1 environments where capabilities such as vSAN ESA, Memory Tiering, AI infrastructure, and high-bandwidth networking fundamentally change traditional infrastructure design assumptions.
Successful business outcomes are built on architecture based on informed assessment and alignment to current requirements, not “what has worked in the past”. If you’d like to discuss how you can optimise your VCF 9.1 environment for modern workloads, contact us at Zaleo Consulting.






