A while ago I did a write-up about PernixData FVP and their new 2.0 release. In the blog post “Part 2: My take on PernixData FVP2.0” I ran a couple of tests which were based on a Max IOPS load using I/O Analyzer.
This time ’round, I wanted to run some more ‘real-life’ workload tests in order to show the difference between a non-accelerated VM, an FVP-accelerated VM using SSD and an FVP-accelerated VM using RAM. So I’m not per se in search of mega-high IOPS numbers, but looking to give a more realistic view of what PernixData FVP can do for your daily workloads. While testing I proved to myself it’s still pretty hard to simulate a real-life workload, but I had a go at it nonetheless… 🙂
As stated in previous posts, it is important to understand I ran these tests on a homelab, so it does not represent decent enterprise server hardware. That said, it should still be able to show the differences in performance gain using FVP acceleration. Our so-called ‘nano-lab’ consists of:
|3x||Intel NUC D54250WYB (Intel Core i5-4250U / 16GB 1.35V 1600MHz RAM)|
|3x||Intel DC S3700 SSD 100GB (one per NUC)|
|3x||Dual NIC Gbit mini PCIe expansion (3 GbE NIC per NUC)|
|1x||Synology DS412+ (4x 3TB)|
|1x||Cisco SG300-20 gigabit L3 switch|
Note the 1.35V: this is low-voltage memory! While perfect for keeping power consumption down in my homelab, it comes at the cost of lower performance compared to 1.5V memory. Since we are testing FVP in combination with RAM it’s good to keep this in mind.
Pre-build the lab looked something like this:
I updated my FVP installation (host extension and management server) to the newest version, which contains further enhancements to the new FVP 2.0 features.
It felt like I fooled around with pretty much every ICF (IOmeter Configuration File) out there. Eventually I customized an ICF which was based on a ‘bursty OLTP (Online Transaction Processing)’ workload. OLTP database workloads seemed like a legit IO test as they are a good example of a workload that needs low latency and high data availability rather than high throughput.
So, the IO test consists of 2 workers with IO Analyzer using a raw VMDK residing on an iSCSI LUN using the default vSphere iSCSI software adapter. The VMDK has a size of 10GB representing the working set of my fictional application. I made sure my Synology was pretty much idle when performing the tests.
FVP is configured with policy ‘Write Back (Local host and 1 peer)‘ in order to meet the data availability ‘requirement’. I did test with the FVP policy set to write back with zero peers and noticed an improvement, because no additional latency is created by writing cache data to the network peer(s). However, I believe this isn’t a configuration which will be used when accelerating an application in an enterprise environment.
The 2 IO workers are configured with the specifications as listed below. The workers are run simultaneously during tests.
|Access specification||Transfer size||Access pattern||Transfer delay||Burst length|
|Constant Write||8KB||100% random write||1ms||4 IOs|
|Bursty Write Seq||8KB||100% sequential write||0ms||1 IO|
|Constant Read||8KB||100% random read||1ms||32 IOs|
|Bursty Read Seq||8KB||100% sequential read||0ms||1 IO|
I used the numbers given by esxtop, filtered out the useful numbers and did some Excel work to create these graphs. The contents of these graphs can be compared to the PernixData FVP ‘VM observed‘ numbers.
I could, as I did in previous FVP posts, use the much slicker-looking FVP graphs… But this time I did not want to take the FVP graphs for granted. Next to that, I wanted to be able to compare the FVP modes within the graphs. The concession of not using the FVP graphs is losing the ability to see the network peer latency, so we’ll keep the focus on VM observed latency.
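The filtering step described above can also be sketched in Python instead of Excel. This is a minimal sketch, not the procedure actually used for the graphs: it assumes esxtop was run in batch mode (`esxtop -b`) so that it writes a CSV, and the host name `esx01`, the VM name `ioanalyzer`, the counter names and the sample values are all illustrative assumptions.

```python
import csv
import io
from statistics import mean


def average_counters(csv_text, vm_name, wanted):
    """Average selected esxtop batch-mode columns for one VM.

    csv_text: contents of an esxtop batch-mode CSV file
    vm_name:  substring identifying the VM in the column headers
    wanted:   counter names to keep (assumed names, check your own CSV)
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Keep only columns that mention the VM and end with a wanted counter.
    keep = {
        index: counter
        for index, column in enumerate(header)
        for counter in wanted
        if vm_name in column and column.endswith(counter)
    }
    samples = {counter: [] for counter in keep.values()}
    for row in reader:
        for index, counter in keep.items():
            samples[counter].append(float(row[index]))
    return {counter: mean(values) for counter, values in samples.items()}


# Made-up sample in the shape of esxtop batch output (not measured data).
SAMPLE = "\n".join([
    r'"Time","\\esx01\Virtual Disk(ioanalyzer)\Reads/sec",'
    r'"\\esx01\Virtual Disk(ioanalyzer)\Average MilliSec/Read",'
    r'"\\esx01\Virtual Disk(other)\Reads/sec"',
    '"12:00:00","100.0","2.0","7.0"',
    '"12:00:02","300.0","4.0","9.0"',
])

result = average_counters(
    SAMPLE, "ioanalyzer", ["Reads/sec", "Average MilliSec/Read"]
)
print(result)
```

Matching on the end of the column header keeps the sketch independent of the host name, which esxtop puts at the front of every column.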
First let us have a look at the latency graphs:
The most important thing to notice in the graphs above is that the latency peaks are flattened and consistent when accelerated by FVP, next to of course being dramatically lowered! Lower and more consistent latency is a game changer for your customers’ user experience: their applications will be more responsive and, again, consistent in performance!
Now check the IOPS graphs:
When comparing the IO performance there is a vast improvement noticeable when being accelerated by FVP. I guess I don’t have to point out that a higher number of IOPS is preferred.
Although boiling the graphs down to averages hides detail, the average numbers are useful to indicate the difference in performance between the non-accelerated and the accelerated modes.
|No(!) FVP acceleration||184||1520||23.04||5.77|
|FVP2.0 SSD acceleration||1876||2028||1.20||1.28|
|FVP2.0 RAM acceleration||4544||2262||0.27||0.33|
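To put those averages in perspective, the improvement factors can be worked out with a few lines of Python. This is a rough sketch under one assumption: the table above carries no column labels, so the first two columns are treated as average IOPS and the last two as average latency in ms, judging by their magnitudes. Which column is read and which is write is left open.

```python
# Averages copied from the table above. Column meaning is an assumption:
# the first two values are treated as IOPS, the last two as latency in ms.
AVERAGES = {
    "No FVP": (184, 1520, 23.04, 5.77),
    "FVP 2.0 SSD": (1876, 2028, 1.20, 1.28),
    "FVP 2.0 RAM": (4544, 2262, 0.27, 0.33),
}


def improvement(baseline, accelerated):
    """Return per-column improvement factors versus the baseline.

    IOPS columns: accelerated / baseline (higher is better).
    Latency columns: baseline / accelerated (lower is better).
    """
    iops = [a / b for a, b in zip(accelerated[:2], baseline[:2])]
    latency = [b / a for a, b in zip(accelerated[2:], baseline[2:])]
    return [round(factor, 1) for factor in iops + latency]


base = AVERAGES["No FVP"]
for mode in ("FVP 2.0 SSD", "FVP 2.0 RAM"):
    print(mode, improvement(base, AVERAGES[mode]))
```

On these numbers the first IOPS column rises roughly 10x with SSD and roughly 25x with RAM, while the second IOPS column improves far less.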
Again I’m impressed by FVP! From your customers’ point of view, they will notice a great deal of performance improvement and performance consistency while using their applications running on your VMs!
PernixData’s view (‘de-couple performance from capacity’) is a very interesting one. When adopting this kind of technology, we consultants/architects should rethink our current design principles and building blocks for storage performance.
As always, it fully depends on your workloads and what your current experiences are when it comes to storage performance. When you are using an enterprise-range array with FC connections to your hosts, you are probably more used to sort of acceptable latency numbers than when you’re using a mid-range NFS array with Ethernet connections to your hosts.
But even when using that enterprise array with acceptable latency/performance: what do you choose when your storage and/or host assets are financially depreciated or are running out of support? Will you still go for a traditional storage array? Or will you rethink your design principles and building blocks by designing your array to deliver data services only, while your performance layer resides at your hosts?
Food for thought… We can state that PernixData FVP does a great job once you’ve chosen it as your performance layer, and I’m glad to see it adopted by customers.