Lab test: vSphere Fault Tolerance performance impact

Triggered by some feedback on the VMware Reddit channel, I was wondering what is holding us back from adopting the vSphere Fault Tolerance (FT) feature. Commenters on Reddit stated that although the increased availability is desirable, the performance impact is holding them back from actually using it in production environments.

According to the vSphere 6 documentation center, use cases for FT include:

  • Applications that need to be available at all times, especially those that have long-lasting client connections that users want to maintain during hardware failure.
  • Custom applications that have no other way of doing clustering or other forms of application resiliency.
  • Cases where high availability might be provided through custom clustering solutions, which are too complicated to configure and maintain.

However, the stated use cases focus only on availability and do not mention the performance impact of enabling FT. Is there a sweet spot for applications that need high resiliency but do not require immense performance and could cope with the latency impact of FT? It really depends on the application workload. A SQL server typically generates more FT traffic than, for instance, a web server that primarily transmits data. So enabling FT will affect some workloads more than others.

Requirements

Since the introduction of Multi-Processor Fault Tolerance (SMP-FT) in vSphere 6, the requirements for FT are a bit more flexible. The compute maximums for an FT-enabled VM are 4 vCPUs and 64 GB of memory. Eager zeroed thick disks are no longer a requirement, so thin, lazy zeroed thick and eager zeroed thick provisioned disks are all supported in SMP-FT!
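
To make these limits concrete, here is a minimal Python sketch that checks a VM description against the SMP-FT maximums mentioned above. The `VmSpec` class, the `smp_ft_violations` function and the disk type labels are hypothetical names made up for illustration; this does not call any VMware API and only encodes the numbers stated in this post.

```python
from dataclasses import dataclass

# Limits as stated in this post for vSphere 6 SMP-FT.
MAX_FT_VCPUS = 4
MAX_FT_MEMORY_GB = 64
# Hypothetical labels for the three supported provisioning types.
SUPPORTED_DISK_TYPES = {"thin", "lazy_zeroed_thick", "eager_zeroed_thick"}


@dataclass
class VmSpec:
    """Hypothetical, simplified description of a VM."""
    name: str
    vcpus: int
    memory_gb: int
    disk_types: tuple = ("thin",)


def smp_ft_violations(vm):
    """Return a list of reasons why the VM exceeds the SMP-FT limits."""
    problems = []
    if vm.vcpus > MAX_FT_VCPUS:
        problems.append(f"{vm.vcpus} vCPUs exceeds the maximum of {MAX_FT_VCPUS}")
    if vm.memory_gb > MAX_FT_MEMORY_GB:
        problems.append(f"{vm.memory_gb} GB memory exceeds the maximum of {MAX_FT_MEMORY_GB} GB")
    for disk_type in vm.disk_types:
        if disk_type not in SUPPORTED_DISK_TYPES:
            problems.append(f"unknown disk provisioning type: {disk_type}")
    return problems


if __name__ == "__main__":
    for vm in (VmSpec("sql01", 4, 64), VmSpec("big01", 8, 128)):
        issues = smp_ft_violations(vm)
        print(vm.name, "OK" if not issues else issues)
```

An empty list means the VM fits within the stated maximums; it says nothing about the host, cluster or network requirements.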

vSphere 6: Multi-Processor Fault Tolerance (SMP-FT)

At VMworld 2008, VMware announced Fault Tolerance (FT) as a new feature in ESX 4 that provides continuous availability for selected virtual machines (VMs). FT keeps a virtual machine continuously available with literally zero downtime and zero data loss, even through server failures, while staying completely transparent to the guest software stack.

While it was a great new feature, FT-enabled VMs were not a very common sight in datacenter environments.

Legacy FT

FT was not a common sight in datacenters, mostly because of the restriction to 1 vCPU per FT virtual machine, which severely limited its usability. Most business-critical VMs, the ones that could benefit from FT the most, needed multiple vCPUs to meet their performance requirements. A further challenge was the limited set of options for backing up FT-enabled VMs, as creating VMware snapshots was not possible.

Other cluster and host requirements for legacy FT, or UP-FT, were:

  • An HA-enabled cluster is required.
  • Shared storage is required.
  • VMDKs must be eager zeroed thick provisioned.
  • Host CPUs must be VMware FT capable and belong to the same processor model family.
  • All ESX hosts in the VMware HA cluster must have identical ESX versions and patch levels.

SMP-FT
