Understanding the ESXi Network IOChain

In this blog post, we go into the trenches of the (Distributed) vSwitch with a focus on the vSphere ESXi network IOChain. It is important to understand the core constructs of the vSphere networking layers when, for example, troubleshooting connectivity issues. In a second blog post on this topic, we will look closer at virtual network troubleshooting tooling.

IOChain

The vSphere ESXi network IOChain is a framework that provides the capability to insert functions into the network data-path, regardless of whether a vSphere Standard Switch (VSS) or a vSphere Distributed Switch (VDS) is used. The IOChain is a group of functions that provides connectivity between ports and the vSwitch. A port has two IOChains, one for each direction of traffic to and from the vSwitch. This means that each port in a set has an input and an output IOChain associated with it. This allows for a modular approach: optional elements are only included in an IOChain when configured by the user.
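To make the idea of a per-port, per-direction chain of functions more concrete, here is a minimal sketch in Python. It is a conceptual model only and not how ESXi implements IOChains: the names `Port`, `run_chain`, `vlan_filter` and `build_port` are hypothetical, and a packet is simply modelled as a dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A packet is modelled as a plain dict. A stage returns the packet to pass
# it on, or None to drop it. All names in this sketch are hypothetical.
Packet = dict
Stage = Callable[[Packet], Optional[Packet]]


def run_chain(chain: List[Stage], packet: Packet) -> Optional[Packet]:
    """Pass a packet through each stage in order; stop if a stage drops it."""
    for stage in chain:
        packet = stage(packet)
        if packet is None:
            return None
    return packet


@dataclass
class Port:
    """A switch port with one chain per traffic direction."""
    name: str
    input_chain: List[Stage] = field(default_factory=list)   # port -> vSwitch
    output_chain: List[Stage] = field(default_factory=list)  # vSwitch -> port


def vlan_filter(allowed_vlan: int) -> Stage:
    """Build a stage that drops packets not carrying the allowed VLAN ID."""
    def stage(packet: Packet) -> Optional[Packet]:
        return packet if packet.get("vlan") == allowed_vlan else None
    return stage


def build_port(name: str, vlan: Optional[int] = None) -> Port:
    """Only features that are configured end up in the chains."""
    port = Port(name)
    if vlan is not None:
        port.input_chain.append(vlan_filter(vlan))
        port.output_chain.append(vlan_filter(vlan))
    return port
```

A port built without a VLAN ends up with empty chains, so packets pass through untouched. This mirrors the modular approach described above, where unconfigured elements add nothing to the path.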

Examples of optional elements in an IOChain are VLAN support, NIC teaming, and traffic shaping. Looking at the high-level components in an ESXi network IOChain, we differentiate between the port group, the vSwitch (VSS or VDS) and the uplink level.

Port group level

This is where an optionally configured VLAN is interpreted by the VLAN filter, allowing for VLAN dot1q tags on your port group. The security settings Promiscuous mode, MAC address changes, and Forged transmits are also set at the port group level. The user can also optionally configure traffic shaping: egress only when using a VSS, or bi-directional when using a VDS.
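For readers less familiar with dot1q, the sketch below shows what an 802.1Q tag looks like on the wire: a 4-byte field inserted after the destination and source MAC addresses, made up of the TPID 0x8100 followed by a 3-bit priority, a 1-bit DEI and a 12-bit VLAN ID. The helper names `add_dot1q_tag` and `read_vlan_id` are hypothetical and purely illustrative; this is not ESXi code.

```python
import struct
from typing import Optional

TPID_8021Q = 0x8100  # EtherType value that identifies an 802.1Q tag


def add_dot1q_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert an 802.1Q tag after the destination and source MAC addresses."""
    if not 0 <= vlan_id <= 4095:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    # TCI layout: priority (3 bits), DEI (1 bit, left at 0), VLAN ID (12 bits)
    tci = (priority << 13) | vlan_id
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


def read_vlan_id(frame: bytes) -> Optional[int]:
    """Return the VLAN ID of a tagged frame, or None for an untagged frame."""
    if len(frame) < 16:
        return None
    tpid, tci = struct.unpack("!HH", frame[12:16])
    return tci & 0x0FFF if tpid == TPID_8021Q else None
```

The original EtherType of the frame is preserved and simply moves four bytes further into the frame.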

vSwitch (VSS or VDS) level

Incoming packets at the vSwitch level are forwarded to their destination using the forwarding engine. The forwarding engine contains port information paired with MAC address information. Its job is to send the traffic to its proper destination. That can be either a VM residing on the same ESXi host or an external host.
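The pairing of port information with MAC address information can be pictured as a lookup table. The following is a simplified sketch under stated assumptions: the class `ForwardingTable` and the `UPLINK` label are hypothetical, and real forwarding logic also has to deal with cases such as broadcast, multicast and VLANs, which are left out here.

```python
from typing import Dict

UPLINK = "uplink"  # hypothetical label for "leave the host via a physical NIC"


class ForwardingTable:
    """Maps MAC addresses of local virtual NICs to their switch ports."""

    def __init__(self) -> None:
        self._mac_to_port: Dict[str, str] = {}

    def register(self, mac: str, port: str) -> None:
        """Record that a MAC address lives behind a local port."""
        self._mac_to_port[mac.lower()] = port

    def lookup(self, dst_mac: str) -> str:
        """Return the local port for a known MAC, else send to the uplink."""
        return self._mac_to_port.get(dst_mac.lower(), UPLINK)
```

A destination that is registered locally stays on the host, while any other unicast destination is assumed to be an external host and is handed to the uplink.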

The teaming engine is responsible for balancing network packets over the uplink interfaces. The way it does so depends on the teaming configuration chosen by the user. The traffic shaper module is added to the IOChain if enabled at the port group level.
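As an illustration of one teaming option, routing based on the originating virtual port, the sketch below pins each virtual port to one uplink. The modulo mapping is an assumption made for illustration and not necessarily the exact algorithm ESXi uses; `select_uplink` and the `vmnic` names are hypothetical.

```python
from typing import Sequence


def select_uplink(port_id: int, active_uplinks: Sequence[str]) -> str:
    """Pin a virtual port to one uplink, spreading ports across all uplinks."""
    if not active_uplinks:
        raise ValueError("no active uplinks available")
    # Assumed mapping: the same port ID always lands on the same uplink
    # for as long as the list of active uplinks does not change.
    return active_uplinks[port_id % len(active_uplinks)]
```

When an uplink fails and drops out of the active list, calling the function again with the shorter list moves the affected ports to the remaining uplinks.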

Uplink level

At this level, the traffic sent from the vSwitch to an external host finds its way to the driver module. This is where all the hardware offloading takes place. The supported hardware offloading features depend strongly on the physical NIC in combination with a specific driver module. Hardware offloading functions that are typically supported by NICs are TCP Segmentation Offload (TSO), Large Receive Offload (LRO) and Checksum Offload (CSO). Offloading for network overlay protocols like VXLAN and Geneve, as used in NSX-v and NSX-T respectively, is widely supported on modern NICs.

Next to hardware offloading, the buffer mechanisms come into play at the uplink level. For example, when processing a burst of network packets, ring buffers are used to queue them. Finally, the bits are handed to the DMA controller, to be handled by the CPU and physical NIC onwards to the Ethernet fabric.
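To show why ring buffers matter during a burst, here is a minimal fixed-size ring buffer sketch. It is a generic illustration and not the ESXi or driver implementation; the class `RingBuffer`, its `dropped` counter and the sizes used are hypothetical.

```python
from typing import Any, List, Optional


class RingBuffer:
    """Fixed-size FIFO that drops new packets once every slot is in use."""

    def __init__(self, size: int) -> None:
        if size <= 0:
            raise ValueError("size must be positive")
        self._slots: List[Optional[Any]] = [None] * size
        self._head = 0    # index of the oldest queued packet
        self._count = 0   # number of packets currently queued
        self.dropped = 0  # packets rejected because the ring was full

    def enqueue(self, packet: Any) -> bool:
        """Queue a packet; return False and count a drop if the ring is full."""
        if self._count == len(self._slots):
            self.dropped += 1
            return False
        tail = (self._head + self._count) % len(self._slots)
        self._slots[tail] = packet
        self._count += 1
        return True

    def dequeue(self) -> Optional[Any]:
        """Remove and return the oldest packet, or None if the ring is empty."""
        if self._count == 0:
            return None
        packet = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        self._count -= 1
        return packet
```

With a ring of four slots, a burst of six packets that arrives before anything is dequeued results in two drops. A larger ring can absorb a larger burst at the cost of more memory.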

Standard vSwitch

The following diagram puts all components together to form the IO chain for vSphere networking using a standard vSwitch:

Distributed Switch

The magic happens when the vSphere VDS lets us span network switch configurations over multiple ESXi hosts, which can even be grouped over multiple clusters. The VDS is available with the Enterprise Plus license or ships with your vSAN license. It allows network information to be distributed over all member ESXi hosts, thus providing network configuration consistency in your vSphere environment.

An extensive feature set is provided, including quality control features like Network I/O Control (NIOC), ingress and egress traffic shaping, and traffic flow monitoring options. Next to that, it allows for additional teaming options like LBT (Load Based Teaming) and LACP (Link Aggregation Control Protocol). Review all features that are listed on the vSphere Distributed Switch page.

When we closely examine the diagram above, we see the additional DVfilter components. The DVfilter is an API framework that is available for VDS and required for NSX. When NSX installs, it introduces additional kernel modules in vSphere ESXi. The summarize-dvfilter command in an ESXi shell shows the loaded DVfilter agents and the filters per port.

This example screenshot shows a typical fresh vSphere 6.7U1 installation without NSX or other 3rd party integrations. To give you an example of additional agents, NSX-T installs the nsxt-switch-security, nsxt-vsip and nsxt-vdrb modules.

We divide the DVfilter components into Fastpaths, Slowpaths and Filters:

  • Fastpaths – Traffic filter kernel modules.
  • Slowpaths – Used for 3rd party integrations.
  • Filters – Filter placement in a slot for each applicable vNIC.

The ESXi network path is modular in nature. Both IOChain functions and DVfilters are only added to the network path when configured in order to keep the communication path as lean and mean as possible.

Improvements in 6.7

We vastly improved the internals of the ESXi networking foundation, the VSS and VDS, compared to previous ESXi releases. In the latest ESXi 6.7 version, we changed the VSS and VDS switch module architecture to support new features like MAC learning and IGMP snooping over a VNI (Virtual Network Interface), and to be ready for future features. The current version of VDS, version 6.6.0, allows you to benefit from these enhancements. Look at the ‘Upgrading vSphere Networking’ article to see how you can upgrade!

Now that we have more insight into how the ESXi network utilizes IOChains to bolt features and elements onto the network path, we can take a closer look at troubleshooting connectivity issues in an upcoming blog post. Stay tuned!

 

Thanks to Broc Yanda for the invaluable input.
