vSphere 6: mClock scheduler & reservations

“Storage IO Controls – New support for per Virtual Machine storage reservations to guarantee minimum service levels.” is listed as one of the new features of vSphere 6.

The new mClock scheduler was introduced with vSphere 5.5 and, as you might have guessed, it remains the default IO scheduler in vSphere 6 (don’t mind the typo in the description).

[Screenshot: mClock advanced setting]

Besides limits and shares, the scheduler now supports reservations. Let’s do a quick recap on resource management.

SHARES

[Screenshot: Disk shares]

“Shares is a value that represents the relative metric for controlling disk bandwidth to all virtual machines. The values are compared to the sum of all shares of all virtual machines on the server.”

As with all ‘VMware shares’, they only come into play when there’s resource contention. The share value is also used by SIOC to calculate the IO entitlement relative to the total number of shares across all hosts accessing the datastore.
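To make the share arithmetic concrete, here is a minimal sketch in Python. The VM names are made up for illustration, and the function only models the proportional split under contention, not what the scheduler actually does internally.

```python
def io_entitlement(shares):
    """Return each VM's fraction of the disk bandwidth under contention.

    `shares` maps a (hypothetical) VM name to its disk share value.
    A VM's entitlement is its own shares divided by the sum of all shares.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total shares must be positive")
    return {vm: value / total for vm, value in shares.items()}


# Hypothetical example: one VM with High (2000) and two with Normal (1000) shares.
example = io_entitlement({"vm1": 2000, "vm2": 1000, "vm3": 1000})
for vm, fraction in example.items():
    print(f"{vm}: {fraction:.0%} of the available IO")
```

Without contention these fractions are irrelevant: every VM simply gets the IO it asks for.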

[Screenshot: SIOC]

LIMITS

[Screenshot: Disk limit]

“Limit specifies an upper bound for disk IO that can be allocated to a virtual machine.”

If you’ve worked with disk limits prior to vSphere 5.5, you might know that “all limit values are consolidated per virtual machine per LUN”, as described in KB1038241. The mClock scheduler applies limits on a per virtual disk basis.

 

During my tests I noticed the mClock scheduler behaves a bit weirdly when there’s only one VM issuing IO. The same thing happened in vSphere 5.5, if I remember correctly. Let’s say I have 1 host and 1 datastore, with 1 VM running IOmeter. The disk IOmeter uses is limited to 500 IOPS.

[Screenshot: IOmeter]

Now let’s fire up a second IOmeter in the background, so I have 2 VMs running on the same host/datastore. Let’s have another look at the IOmeter of the first VM.

[Screenshot: IOmeter]

As you can see, both the IOPS and the latency stabilized. I’m not sure what causes this; my best guess is that the scheduler only kicks in when a certain amount of IO switching takes place (Disk.SchedQControlVMSwitches?).

If you would like to reproduce this in your lab, make sure you set the IOmeter value ‘# of Outstanding I/Os per target’ to at least 2 on the first VM (the VM you are limiting). Also, make sure you use an IOmeter ‘Access Specification’ of 16K or smaller. mClock uses a 32K IO size, so selecting anything higher in IOmeter will cut your total IOPS in half.
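A rough way to picture the IO size effect is sketched below. The charging rule is my assumption (every started 32K chunk of a request counts as one IO against the limit), not documented behaviour, and `effective_iops` is just a name I picked for the illustration.

```python
import math

MCLOCK_IO_UNIT = 32 * 1024  # the 32K IO size mentioned above, in bytes


def effective_iops(limit_iops, io_size_bytes, unit=MCLOCK_IO_UNIT):
    """Estimate the IOPS a workload sees under an IOPS limit.

    Assumption: each started 32K chunk of a request is charged as one IO,
    so a 64K request costs 2 and a 16K request costs 1.
    """
    cost = max(1, math.ceil(io_size_bytes / unit))
    return limit_iops / cost


for size_kb in (4, 16, 32, 64):
    iops = effective_iops(500, size_kb * 1024)
    print(f"{size_kb}K IOs with a 500 IOPS limit: ~{iops:.0f} IOPS")
```

Under this assumption a 64K access specification halves the IOPS you measure against a 500 IOPS limit, which is why a small IO size keeps the numbers easy to read.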

 

RESERVATIONS

On to the new stuff. I got excited when I read the following in the new ‘vSphere Resource Management’ (beta) document: “Click the Virtual Hardware tab and expand Hard disk… …Under Reservations, click the drop-down menu and select the reservation to allocate to the virtual machine…”

Alas, there’s no such option!

[Screenshot: Virtual disk properties]

In the Google quest for the VM advanced setting that followed (I couldn’t find any official VMware documentation), I came across a comment by Cormac Hogan. The setting I was looking for is sched.scsi0:0.reservation. I was unable to add the value through the Web Client, so I had to hack it manually into the .vmx file.
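For anyone who would rather script the edit than use a text editor, here is a minimal sketch in Python. It relies only on the fact that a .vmx file is plain text with one `key = "value"` pair per line. The helper name `set_vmx_option` and the sample fragment are hypothetical, and I’m assuming the VM is powered off while the file is changed.

```python
def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with `key = "value"` set.

    An existing entry for `key` is replaced; otherwise the entry is appended.
    """
    new_line = f'{key} = "{value}"'
    lines = vmx_text.splitlines()
    for i, line in enumerate(lines):
        # Compare only the part before the first '=' with the requested key.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical .vmx fragment; a real file contains many more entries.
sample = 'scsi0:0.fileName = "vm1.vmdk"\nsched.scsi0:0.shares = "normal"\n'
updated = set_vmx_option(sample, "sched.scsi0:0.reservation", "1500")
print(updated)
```

The value 1500 is the IOPS reservation used in the test that follows. Work on a copy of the .vmx file first, so a typo doesn’t leave you with a VM that won’t register.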

[Screenshot: ESXi CLI]

I will now spin up 3 VMs with the same IOmeter settings (10 outstanding IOs). The first VM has an IOPS reservation of 1500.

[Screenshots: IOmeter on each of the 3 VMs]

OK, this doesn’t seem right! All 3 VMs seem to receive the same amount of IOPS!

Only when I really start hitting the IO queue, by increasing the number of outstanding IOs even further (from 10 to 30), do I see the reservation on my first VM being fulfilled.

[Screenshots: IOmeter on each of the 3 VMs]

Conclusion

Judging from these tests, it seems like the IOPS reservation isn’t always on and only kicks in at a certain level of contention. It also doesn’t seem to be wasteful, much like a CPU reservation. Do note this post is based on the vSphere 6 RC version. At this time, a lot is unclear about the workings of the mClock reservation. Perhaps the synthetic workloads of IOmeter are messing things up, or maybe I’m using the wrong settings and parameters. We’ll just have to wait for VMware to publish further documentation on the matter.
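One way to picture a reservation that only matters under contention is to treat it as a floor on top of a share-based split. The sketch below is a toy model under that assumption, not the mClock algorithm itself. The capacities of 6000 and 3000 IOPS are invented. Only the three VMs and the 1500 IOPS reservation come from my test.

```python
def allocate_iops(capacity, vms):
    """Split `capacity` IOPS by shares, treating reservations as a floor.

    `vms` maps a VM name to a (shares, reservation) tuple; shares must be
    positive. Toy model for illustration only.
    """
    if sum(res for _, res in vms.values()) > capacity:
        raise ValueError("reservations exceed capacity")
    pinned = {}       # VMs held at exactly their reservation
    free = dict(vms)  # VMs still competing by shares
    while True:
        remaining = capacity - sum(pinned.values())
        total_shares = sum(shares for shares, _ in free.values())
        alloc = {vm: remaining * shares / total_shares
                 for vm, (shares, _) in free.items()}
        # VMs whose share-based slice falls below their reservation.
        short = [vm for vm, (_, res) in free.items() if alloc[vm] < res]
        if not short:
            return {**pinned, **alloc}
        for vm in short:
            pinned[vm] = free.pop(vm)[1]


vms = {"vm1": (1000, 1500), "vm2": (1000, 0), "vm3": (1000, 0)}
print(allocate_iops(6000, vms))  # plenty of capacity: the reservation changes nothing
print(allocate_iops(3000, vms))  # scarce capacity: vm1 is lifted to its 1500 IOPS floor
```

In this model all three VMs get the same IOPS as long as an equal split already exceeds the reservation, and the reserved VM only pulls ahead once capacity gets tight. Whether that is what really happens inside the scheduler is exactly what remains unclear.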
