A Closer Look at the M7i Instance on VMware Cloud on AWS

During AWS re:Invent 2023, a new instance type for VMware Cloud on AWS was announced, and today it is generally available: the disaggregated M7i.metal-24xl instance type! This means customers benefit from the latest compute innovations and freedom of choice when it comes to storage options.

The new M7i.metal-24xl instances are a perfect fit for entry-level SDDCs (2-4 hosts). We see a lot of use cases that make sense for this new option, including typical general-purpose workloads, but also database backends and AI/ML applications because of the accelerators embedded in the 4th generation Intel Xeon CPU. This blog post covers the technical details, along with benchmark results and quick demos showcasing the new instance type.

Update May ’24: VMware by Broadcom has discontinued the M7i instance for VMware Cloud on AWS.

Storage Options

A major difference from current VMware Cloud on AWS instance types is that M7i.metal-24xl doesn’t include local NVMe devices, meaning vSAN is not part of this instance type. For customers and workloads better suited for vSAN, the I3en or I4i nodes are the go-to choices. This new instance type is simply another option for our customers and partners.

VMware Cloud Flex Storage and Amazon FSx for NetApp ONTAP are supported with M7i.metal-24xl. Both are NFS datastores that are presented to the ESXi hosts and appear like ‘regular’ datastores in the vSphere Client, so familiar capabilities such as Storage vMotion, or storage policies for tagging purposes, can be used. VMware does not provide managed storage policies for external NFS workload datastores; customers should configure workload datastores themselves to fit their needs. Initially, M7i instances can only be deployed in a single availability zone (AZ) per cluster, meaning no stretched clusters.

See the demo below for an overview of the provisioning process. The latest SDDC release, version 1.24, comes with key updates that benefit external NFS storage. Also, while vTGW is supported, it is highly recommended to use the new VPC Peering method for connecting Amazon FSx for NetApp ONTAP to VMware Cloud on AWS. This comes with some prerequisites that are discussed in the linked article.

Do note that customers can still mix and match instance types in an SDDC, meaning one cluster can be running M7i instances, while another is using vSAN with I4i instances. The cluster is the boundary for instance type options.

As with all VMware Cloud on AWS instance types, storage capacity is required to store the management plane (vCenter and NSX appliances, and potentially integrated services appliances like HCX). As part of the SDDC provisioning process, a managementDatastore is created. The managementDatastore is VMware-managed and part of the first cluster in an SDDC. The logical capacity is 100 TiB. On the backend, a minimum of 6 TiB is reserved for management appliances. If more storage capacity is required, for example with Hybrid Cloud Extension (HCX) configurations, more reserved capacity should be requested. Customers’ storage requirements for HCX deployments vary based on the scope of the services they plan to run. The managementDatastore cannot be used for customer workloads.

Compute Specs

The M7i.metal-24xl instance uses Intel Sapphire Rapids CPU packages. Sapphire Rapids is the codename for Intel’s fourth-generation Xeon Scalable CPUs. Each host comes with 48 physical cores; with Hyper-Threading enabled, this results in 96 logical processors. The cores have a base frequency of 3.2 GHz with an all-core Turbo frequency of up to 3.8 GHz. Memory capacity is 384 GB per host.
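To get a feel for what these per-host numbers add up to in an entry-level SDDC, here is a minimal Python sketch that totals the raw resources for a given host count. The `HostSpec` and `cluster_totals` names are made up for this illustration, and the totals are raw figures only: they do not account for management appliances, hypervisor overhead, or any capacity held back for availability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    """Per-host figures quoted above for M7i.metal-24xl (illustrative helper)."""
    physical_cores: int = 48
    threads_per_core: int = 2   # Hyper-Threading enabled
    memory_gb: int = 384

    @property
    def logical_processors(self) -> int:
        return self.physical_cores * self.threads_per_core


def cluster_totals(host_count: int, spec: HostSpec = HostSpec()) -> dict:
    """Return raw (pre-overhead) totals for a cluster of identical hosts."""
    if host_count < 1:
        raise ValueError("host_count must be at least 1")
    return {
        "hosts": host_count,
        "physical_cores": host_count * spec.physical_cores,
        "logical_processors": host_count * spec.logical_processors,
        "memory_gb": host_count * spec.memory_gb,
    }


# Print the raw totals for the 2-4 host range mentioned in this post.
for hosts in (2, 3, 4):
    print(cluster_totals(hosts))
```

Treat the output as an upper bound for sizing conversations, not as usable capacity.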

Accelerate AI/ML

One of the benefits of running workloads on the 4th generation Intel Xeon CPU package is its Intel Advanced Matrix Extensions (AMX) accelerator, which accelerates matrix multiplication operations for deep learning (DL) inference and training workloads. With the 1.24 SDDC version release for VMware Cloud on AWS, the vSphere 8 Update 2 bits are used with VM hardware version 20. Be aware that today a VM is created with VM hardware version 19 by default, so you need to explicitly configure it with version 20 as shown in the demo.

VM hardware version 20 exposes the Intel AMX instructions to the guest OS as seen in the following screenshot. This helps to improve the performance of AI/ML, analytics, and HPC workloads. A great example of this is found in this compelling blog post.

Together with Intel, we have tested PyTorch-based training and inference for a general-purpose AI/ML use case. The latest version of PyTorch combined with the Intel Extension for PyTorch (IPEX) was used to test training and inference performance on a general-purpose dataset.
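If you want to verify from inside a Linux guest that the AMX instructions are actually exposed after moving to VM hardware version 20, one option is to look at the CPU flags the kernel reports. The sketch below assumes a Linux guest with a kernel recent enough to report the `amx_tile`, `amx_bf16`, and `amx_int8` flags in `/proc/cpuinfo`; the function names are my own and not part of any VMware or Intel tooling.

```python
from pathlib import Path
from typing import Dict

# CPU flag names the Linux kernel uses for AMX in /proc/cpuinfo.
AMX_FLAGS = ("amx_tile", "amx_bf16", "amx_int8")


def amx_flags_present(cpuinfo_text: str) -> Dict[str, bool]:
    """Report which AMX flags appear in the 'flags' lines of cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return {name: name in flags for name in AMX_FLAGS}


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Return the cpuinfo contents, or an empty string if the file is absent."""
    p = Path(path)
    return p.read_text() if p.exists() else ""
```

Calling `amx_flags_present(read_cpuinfo())` in the guest returns a dictionary with one boolean per flag. On a VM that is still on hardware version 19, I would expect all three to be `False`.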

Brain Floating Point format (bfloat16, or BF16) instructions are used in systolic arrays to accelerate matrix multiplication operations, such as those in image recognition or Large Language Models (LLMs). Intel AMX enables BF16, which provides a 2X improvement over traditional FP32 (single-precision floating-point format) for training and about a 15-20% improvement for inference.
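For readers less familiar with BF16: it keeps the sign bit and the 8 exponent bits of FP32 but only 7 of the 23 mantissa bits, so it covers the same numeric range at half the storage with reduced precision. The sketch below illustrates the format itself using only the Python standard library. It is not how AMX or PyTorch perform the conversion internally, and the helper names are made up for this example.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Convert a Python float to 16 bfloat16 bits (round to nearest even)."""
    if x != x:  # NaN: return a canonical quiet NaN pattern
        return 0x7FC0
    fp32_bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a rounding bias so that dropping the low 16 bits rounds to nearest even.
    rounding_bias = 0x7FFF + ((fp32_bits >> 16) & 1)
    return ((fp32_bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(bits: int) -> float:
    """Expand 16 bfloat16 bits back to a float by zero-filling the low mantissa."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


# Round-trip a few values to see how much precision BF16 keeps.
for value in (1.0, 3.14159, 1000.5):
    bf16 = bf16_bits_to_float(float_to_bf16_bits(value))
    print(f"{value!r} -> bits 0x{float_to_bf16_bits(value):04X} -> {bf16!r}")
```

The round trip shows why BF16 works well for deep learning: large and small magnitudes survive thanks to the FP32-sized exponent, while each value keeps only about two to three significant decimal digits.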

Using a single VM configured first with VM hardware version 19 (no Intel AMX) and then with version 20 (Intel AMX), we trained the app against the MIT Indoor Scenes dataset. We measured the training time to see the benefit of using VM hardware version 20 with Intel AMX:

Intel AMX provides an approx. 50% reduction in training time here. No additional devices are required; the built-in accelerators alone help improve performance efficiency. Once trained, the inference app also showed improvements using Intel AMX. Both the inference time and average images/sec were measured, showing up to a 20% performance gain with Intel AMX.
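Because speedup factors and percentages are easy to mix up, here is a small sketch of how the different ways of expressing these results relate: a 2X training speedup and a 50% shorter training time are the same thing, while the inference gain is expressed as throughput. The input numbers below are hypothetical placeholders chosen only to make the arithmetic visible; they are not the measured values from our tests.

```python
def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """How many times faster the accelerated run is (2.0 means 2X)."""
    return baseline_seconds / accelerated_seconds


def time_reduction(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Fraction of the baseline time that was saved (0.5 means 50%)."""
    return 1.0 - accelerated_seconds / baseline_seconds


def throughput_gain(baseline_rate: float, accelerated_rate: float) -> float:
    """Relative increase in a rate such as images/sec (0.2 means 20%)."""
    return (accelerated_rate - baseline_rate) / baseline_rate


# Hypothetical example values, NOT measurements from this post.
baseline_train_s, amx_train_s = 600.0, 300.0
baseline_ips, amx_ips = 100.0, 120.0

print(f"training speedup:        {speedup(baseline_train_s, amx_train_s):.1f}X")
print(f"training time reduction: {time_reduction(baseline_train_s, amx_train_s):.0%}")
print(f"inference gain:          {throughput_gain(baseline_ips, amx_ips):.0%}")
```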

SQL Server Benchmark

SQL Server, or any other database backend, is a perfectly viable use case for M7i.metal-24xl. Together with our performance team, we ran several tests, including SQL Server running on M7i and I4i, to get insights into how they compare. There are nuances here: we used the same VM configurations, even though I4i has more compute resources than M7i, because we wanted a comparison that is as close to apples-to-apples as possible. Another big difference is that I4i uses vSAN and M7i uses external NFS storage, Amazon FSx for NetApp ONTAP in this scenario. Using DVD Store 3.5, Orders per Minute (OPM) is measured alongside cluster CPU utilization. The results show that SQL Server runs flawlessly on M7i when it fits within its compute resource boundaries. More comprehensive insights are found in the Performance Team Blog here.

Demo Provision M7i with Amazon FSx for NetApp ONTAP using VPC Peering

The following demo provides a walkthrough of an M7i.metal-24xl SDDC provisioning process and the configuration of Amazon FSx for NetApp ONTAP using VPC Peering. Once in place, a new workload is instantiated and updated to VM hardware version 20 (the default is version 19) to support Intel AMX:

Demo using VMware Cloud Flex Storage with M7i

This demo focuses on how easy it is to use the VMware Cloud on AWS M7i instance with VMware Cloud Flex Storage.

 

–originally authored and posted by me at https://vmc.techzone.vmware.com/closer-look-m7i-instance-vmware-cloud-aws–
