vSphere 8 vMotion Improvements

vSphere 8 was announced at VMware Explore 2022 in San Francisco. Part of this new major release is a set of vMotion updates. vMotion is continuously developed to support new workloads and is one of the key enablers for VMware’s multi-cloud approach. Whenever a workload is live-migrated between vSphere and/or VMware Cloud infrastructures, vMotion logic is involved.

vMotion over time

vMotion In-App Notifications

With the release of vSphere 7, the vMotion logic was updated to greatly reduce the performance impact on applications during live migrations. Significant work was done to minimize the stun time, also known as the switch-over time, during which the last memory pages and checkpoint information are transferred from source to destination. (more…)
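To make the stun-time discussion concrete, here is a minimal, purely illustrative sketch of an iterative pre-copy live migration with a final switch-over phase. All names, numbers, and data structures are hypothetical; this is a conceptual model of the technique, not the actual vMotion implementation.

```python
import random

def new_dirty_pages(total_pages: int, dirty_rate: int) -> set[int]:
    """Pages the guest dirties while a pre-copy round runs (randomized stand-in)."""
    return {random.randrange(total_pages) for _ in range(dirty_rate)}

def live_migrate(total_pages: int, copy_rate: int, dirty_rate: int, max_rounds: int = 20) -> int:
    """Return how many pages are left for the final switch-over (stun) phase."""
    remaining = set(range(total_pages))           # start with all memory still to copy
    for _ in range(max_rounds):
        if len(remaining) <= copy_rate:           # small enough to finish while stunned
            break
        batch = set(list(remaining)[:copy_rate])  # copy a batch while the VM keeps running
        remaining -= batch
        remaining |= new_dirty_pages(total_pages, dirty_rate)  # guest re-dirties pages
    # Switch-over: the VM is briefly stunned while these last pages and the
    # checkpoint state are transferred to the destination host.
    return len(remaining)

print(live_migrate(total_pages=100_000, copy_rate=10_000, dirty_rate=1_000))
```

The smaller that final set, the shorter the stun time, which is exactly what the vSphere 7 work aimed to reduce.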

Read More

Assignable Hardware in vSphere 7

–originally authored and posted by me at https://blogs.vmware.com/vsphere/2020/03/vsphere-7-assignable-hardware.html–

A wide variety of modern workloads greatly benefit from using hardware accelerators to offload certain capabilities, save CPU cycles, and gain performance in general. Think about the telco industry, for example: Network Function Virtualization (NFV) platforms utilizing NICs and FPGAs. Or customers that use GPUs for graphics acceleration in their Virtual Desktop Infrastructure (VDI) deployments. The AI/ML space is another example, where applications are enabled to use GPUs to offload computations. To utilize a hardware accelerator with vSphere, typically a PCIe device, the device needs to be exposed to the guest OS running inside the virtual machine.

In vSphere versions prior to vSphere 7, a virtual machine specifies a PCIe passthrough device by its hardware address. This is an identifier that points to a specific physical device at a specific bus location on that ESXi host, which restricts the virtual machine to that particular host. The virtual machine cannot easily be migrated to another ESXi host with an identical PCIe device. This can impact the availability of the application using the PCIe device in the event of a host outage. Features like vSphere DRS and HA are not able to place that virtual machine on another, or a surviving, host in the cluster. It takes manual provisioning and configuration to move that virtual machine to another host.

We do not want to compromise on application availability and ease of deployment. Assignable Hardware is a new feature in vSphere 7 that overcomes these challenges.

Introducing Assignable Hardware

Assignable Hardware in vSphere 7 provides a flexible mechanism to assign hardware accelerators to workloads. This mechanism identifies the hardware accelerator by attributes of the device rather than by its hardware address. This allows for a level of abstraction of the PCIe device. Assignable Hardware implements compatibility checks to verify that ESXi hosts have assignable devices available to meet the needs of the virtual machine.
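As a rough illustration of the difference between pinning a device by its bus location and matching it by attributes, consider the sketch below. The device model, inventory, and matching rules are hypothetical and only mimic the idea behind attribute-based assignment; they do not reflect the actual vSphere implementation or APIs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PCIeDevice:
    host: str
    address: str   # e.g. "0000:3b:00.0" -- a fixed bus location on one ESXi host
    vendor: str
    model: str

# Hypothetical cluster inventory with an identical accelerator in two hosts.
inventory = [
    PCIeDevice("esxi-01", "0000:3b:00.0", "NVIDIA", "T4"),
    PCIeDevice("esxi-02", "0000:af:00.0", "NVIDIA", "T4"),
]

# Pre-vSphere 7 style: the VM is tied to one exact address on one host.
def match_by_address(host: str, address: str) -> list[PCIeDevice]:
    return [d for d in inventory if d.host == host and d.address == address]

# Assignable Hardware style: match on device attributes; any compatible host will do.
def match_by_attributes(vendor: str, model: str) -> list[PCIeDevice]:
    return [d for d in inventory if d.vendor == vendor and d.model == model]

print(match_by_address("esxi-01", "0000:3b:00.0"))  # exactly one host qualifies
print(match_by_attributes("NVIDIA", "T4"))          # both hosts qualify -> DRS/HA can place the VM
```

The attribute-based match is what gives placement logic room to work with, as described next.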

It integrates with the Distributed Resource Scheduler (DRS) for initial placement of workloads that are configured with a hardware accelerator. This also means that Assignable Hardware brings back the vSphere HA capability to recover hardware-accelerator-enabled workloads, provided that assignable devices are available in the cluster. This greatly improves workload availability.


The Assignable Hardware feature has two consumers: the new Dynamic DirectPath I/O and NVIDIA vGPU. (more…)

Read More

Improved DRS in vSphere 7

–originally authored and posted by me at https://blogs.vmware.com/vsphere/2020/03/how-is-virtual-memory-translated-to-physical-memory.html–

The first release of Distributed Resource Scheduling (DRS) dates back to 2006. Since then, data centers and workloads have changed significantly. The new vSphere 7 release ships with DRS enhancements to better support modern workloads, using improved DRS logic and a new accompanying UI in the vSphere Client.

The enhanced DRS logic is workload-centric rather than cluster-centric, as it was in earlier releases. The DRS logic has been completely rewritten to schedule resources at a more fine-grained level, with the workload as its main focus. This blog post goes into detail on the new DRS algorithm and explains how to interpret the metrics shown in the new UI.

The Old DRS

vSphere DRS used to focus on the cluster state, checking whether the cluster needed rebalancing because one ESXi host might be over-consumed while another had resources to spare. DRS ran every 5 minutes, and if its logic determined that it could improve cluster balance, it would recommend, and depending on the configured settings execute, a vMotion. In this way, DRS achieved cluster balance using a cluster-wide standard deviation model.
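Conceptually, that old model can be pictured as a simple spread check on host load, as in the sketch below. The metric, values, and threshold are made up for illustration and are not the actual DRS algorithm.

```python
from statistics import pstdev

# Hypothetical per-host CPU utilization (fraction of capacity consumed).
host_load = {"esxi-01": 0.85, "esxi-02": 0.40, "esxi-03": 0.45}

def cluster_needs_rebalancing(loads: dict[str, float], threshold: float = 0.1) -> bool:
    """Old-style view: the cluster is imbalanced when the spread in host load is too large."""
    return pstdev(loads.values()) > threshold

print(cluster_needs_rebalancing(host_load))  # True -> recommend a vMotion to even out host load
```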

The New DRS

The new DRS logic takes a very different approach. It computes a VM DRS score on each host and moves the VM to the host that provides the highest VM DRS score.

The biggest change from the old DRS version is that it no longer balances host load directly. Instead, it improves balancing by focusing on the metric you care most about: virtual machine happiness. It is important to note that the improved DRS now runs every minute, providing a more granular way to calculate workload placement and balancing. This results in overall better workload performance.
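The placement decision itself can be pictured as picking the host on which the VM gets the highest VM DRS score, as sketched below. The scores are invented for illustration; the real score calculation, touched on in the VM DRS Score section, involves many more metrics than a single number.

```python
# Hypothetical VM DRS scores (0-100) that one VM would receive on each host in the cluster.
vm_drs_score_per_host = {"esxi-01": 62, "esxi-02": 91, "esxi-03": 78}

def pick_host(scores: dict[str, int]) -> str:
    """Place (or keep) the VM on the host where it is predicted to be 'happiest'."""
    return max(scores, key=scores.get)

print(pick_host(vm_drs_score_per_host))  # esxi-02
```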

VM DRS Score

(more…)

Read More

vMotion Enhancements in vSphere 7

–originally authored and posted by me at https://blogs.vmware.com/vsphere/2020/03/how-is-virtual-memory-translated-to-physical-memory.html–

The vSphere vMotion feature enables customers to live-migrate workloads from source to destination ESXi hosts. Over time, we have developed vMotion to support new technologies, and the vSphere 7 release is no exception: the vMotion feature is greatly improved. The vMotion enhancements in vSphere 7 include a reduced performance impact during the live migration and a reduced stun time. This blog post goes into detail on how the vMotion improvements help customers feel comfortable using vMotion for large workloads.

To understand what we improved for vMotion in vSphere 7, it is imperative to understand the vMotion internals. Read the vMotion Process Under the Hood to learn more about the vMotion process itself. (more…)

Read More

How is Virtual Memory Translated to Physical Memory?

–originally authored and posted by me at https://blogs.vmware.com/vsphere/2020/03/how-is-virtual-memory-translated-to-physical-memory.html–

Overview

Memory is one of the most important host resources. For workloads to access global system memory, we need to make sure virtual memory addresses are mapped to physical addresses. Several components work together to perform these translations as efficiently as possible. This blog post covers the basics of how virtual memory addresses are translated.

Memory Translations

The physical address space is your system RAM, the memory modules inside your ESXi hosts, also referred to as the global system memory. When talking about virtual memory, we are talking about the memory that is controlled by an operating system, or a hypervisor like vSphere ESXi. Whenever workloads access data in memory, the system needs to look up the physical memory address that matches the virtual address. This is what we refer to as memory translations or mappings.

To map virtual memory addresses to physical memory addresses, page tables are used. A page table consists of numerous page table entries (PTE).

A memory page referenced by a PTE contains data structures made up of ‘words’ of different sizes. Each type of word contains multiple bytes of data: WORD (16 bits/2 bytes), DWORD (32 bits/4 bytes), and QWORD (64 bits/8 bytes). Executing a translation for every possible word, or virtual memory page, into a physical memory address is not very efficient, as this could require billions of PTEs. We need PTEs to find the physical address space in the system’s global memory, so there is no way around them.
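To see where the billions come from, a quick back-of-the-envelope calculation helps. It assumes 4 KB pages and a 48-bit virtual address space, which is common on x86-64 but is an assumption on my part rather than something stated above.

```python
# Back-of-the-envelope: how many 4 KB pages fit in a 48-bit virtual address space?
PAGE_SIZE = 4 * 1024        # 4 KB per page (assumed)
ADDRESS_SPACE = 2 ** 48     # 48-bit virtual addressing (typical on x86-64, assumed)

num_pages = ADDRESS_SPACE // PAGE_SIZE
print(f"{num_pages:,} possible pages")  # 68,719,476,736 -> tens of billions of PTEs
```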

To make memory translations more efficient, we use page tables to group chunks of memory addresses into a single mapping. Taking a DWORD entry of 4 bytes as an example: a single page-table mapping covers 4 kilobytes instead of just the 4 bytes of data in one entry. Using a page table, we can translate virtual address space 0 to 4095 and say that it is found in physical address space 4096 to 8191. We no longer need to map every word separately, which makes translation far more efficient.
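The sketch below shows that grouping in action as a single-level lookup, reusing the same 0–4095 to 4096–8191 example. Real page tables are multi-level hardware structures walked by the MMU, not Python dictionaries; this is purely conceptual.

```python
PAGE_SIZE = 4096  # 4 KB pages, as in the example above

# Hypothetical single-level page table: virtual page number -> physical frame number.
# Virtual page 0 (addresses 0..4095) maps to frame 1 (addresses 4096..8191).
page_table = {0: 1}

def translate(virtual_address: int) -> int:
    """Translate a virtual address to a physical address via the page table."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    frame_number = page_table[page_number]   # the PTE lookup
    return frame_number * PAGE_SIZE + offset

print(translate(100))  # virtual address 100 -> physical address 4196
```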

(more…)

Read More

Hot and Cold Migrations; Which Network is Used?

–originally authored and posted by me at https://blogs.vmware.com/vsphere/2020/03/how-is-virtual-memory-translated-to-physical-memory.html–

A common question arises when customers migrate workloads between ESXi hosts, clusters, vCenter Servers, or data centers: what network is used when a hot or cold migration is initiated? This blog post explains the difference between the various services and network stacks in ESXi, and which one is used for a specific type of migration.

How do we define a hot or cold migration? A cold migration moves a virtual machine that is powered off for the entire duration of the migration. A hot migration means that the workload and its application remain available during the migration.

Both hot and cold migrations can be initiated through the vCenter Server UI or in an automated fashion using, for example, PowerCLI. To understand which network is used for a migration, we first need to understand the various enabled services and network stack options that are available in vSphere.

Enabled Services

In vSphere, we define the following services that can be enabled on VMkernel interfaces:

  • vMotion
  • Provisioning
  • Fault Tolerance logging
  • Management
  • vSphere Replication
  • vSphere Replication NFC (Network File Copy)
  • vSAN

When looking specifically at workload migrations, three services play an important role: the vMotion, Provisioning, and Management enabled networks.
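As a conceptual sketch only, the idea of services tagged on VMkernel interfaces can be modelled as below. The selection rule shown (hot migrations prefer a vMotion-enabled network, cold migrations prefer Provisioning and fall back to Management) is a simplified rule of thumb on my part; the precise behaviour is what the rest of the post walks through, and none of these objects reflect actual ESXi internals.

```python
from dataclasses import dataclass, field

@dataclass
class VMkernelInterface:
    name: str
    services: set[str] = field(default_factory=set)  # services enabled on this vmk

# Hypothetical ESXi host configuration.
vmkernel_interfaces = [
    VMkernelInterface("vmk0", {"Management"}),
    VMkernelInterface("vmk1", {"vMotion"}),
    VMkernelInterface("vmk2", {"Provisioning"}),
]

def pick_interface(migration_type: str) -> VMkernelInterface:
    """Simplified rule of thumb: hot migrations prefer the vMotion network;
    cold migrations prefer Provisioning and fall back to Management."""
    preference = ["vMotion"] if migration_type == "hot" else ["Provisioning", "Management"]
    for service in preference:
        for vmk in vmkernel_interfaces:
            if service in vmk.services:
                return vmk
    raise LookupError(f"no VMkernel interface found for a {migration_type} migration")

print(pick_interface("hot").name)   # vmk1
print(pick_interface("cold").name)  # vmk2
```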

Enabling a service on a specific VMkernel interface indicates that this network can now be used for that service. While the Management service is enabled by default on the first VMkernel interface, the other VMkernel interfaces and services are typically configured after the installation of ESXi. If you want vMotion or Provisioning traffic to use a specific VMkernel interface, you can configure it accordingly. (more…)

Read More