This is part 1 of the VMware Stretched Cluster on IBM SVC blogpost series.
PART 1 (intro, SVC cluster, I/O group, nodes)
PART 2 (split I/O group, deployment, quorum, config node)
PART 3 (HA, PDL, APD)
Last year I was the primary person responsible for implementing a new storage environment based on IBM SVC and V7000, and for building a VMware Stretched Cluster (a.k.a. vSphere Metro Storage Cluster) on top of that. I would like to share some of the experience I gathered, caveats I encountered and other points of interest. This is by no means a complete implementation guide (go read the Redbook 😉 ). I’ll discuss some of the implementation options as well as failure scenarios, advanced settings and some other topics I think are interesting. Based on the content, this will be a multi-part (probably 3) blog post.
Stretched Cluster versus Site Recovery Manager
If you’re unfamiliar with the concepts of Stretched Cluster and SRM, I suggest you read the excellent whitepaper “Stretched Clusters and VMware vCenter Site Recovery Manager“, which explains which solution best suits your business needs. Another good resource is VMworld 2012 session INF-BCO2982, with the catchy title “Stretched Clusters and VMware vCenter Site Recovery Manager: How and When to Choose One, the Other, or Both“; however, you’ll only be able to access this content if you attended VMworld (or simply paid for a subscription).
I copied this table from the whitepaper. Let’s briefly go through each option.
- Disaster Recovery: SRM is better here because of its advanced recovery plans, whereas Stretched Cluster only leverages HA, check.
- Downtime Avoidance: Stretched Cluster is better because of the option to vMotion; SRM always restarts, check.
- Active Site Balancing: Stretched Cluster can leverage vMotion, check.
- Disaster Avoidance: This entry puzzled me for quite some time. Why would SRM be rated higher for Disaster Avoidance? Given the deliberate distinction between Downtime Avoidance and Disaster Avoidance, I can only assume this rating is based on SRM’s ability to cover longer distances (asynchronous replication), whereas Stretched Cluster is limited to synchronous replication.
Depending on the vendor, Stretched Cluster storage solutions can span up to 100-300 km, although most vendors recommend staying below the 100 km mark. The Netherlands only measures roughly 200×300 km. So depending on your situation and your DR site requirements, Stretched Cluster can be just as good at Disaster Avoidance as SRM.
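To get a feel for why synchronous replication is distance-limited, here's a back-of-the-envelope latency calculation. It's a minimal sketch with illustrative assumptions (light travels through fibre at roughly 200,000 km/s; real links add switch and protocol overhead on top), not figures from any vendor spec:

```python
# Best-case round-trip propagation delay added by link distance.
# Assumption: signal speed in fibre ~200,000 km/s (refractive index ~1.5).
# Real-world latency is higher due to switches and protocol overhead.

SPEED_IN_FIBRE_KM_PER_MS = 200.0  # 200,000 km/s = 200 km per millisecond

def round_trip_latency_ms(distance_km: float) -> float:
    """Propagation delay for one round trip over the inter-site link."""
    return 2 * distance_km / SPEED_IN_FIBRE_KM_PER_MS

for km in (10, 100, 300):
    print(f"{km:>3} km -> {round_trip_latency_ms(km):.1f} ms round trip")
```

Since every synchronous write pays at least one such round trip before the host gets its ACK, a 300 km stretch already adds milliseconds to each write, which is why most vendors draw the line around 100 km.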
IBM SVC & V7000
I really like the IBM SVC (and the V7000 for that matter; they share a lot of code and functionality), I like it a lot! It’s one of the most robust, scalable and management-friendly storage systems I’ve worked with. Oh, and also… it’s FAST! Back in the day it even held the number 1 position on the SPC-1 performance benchmark for some time, before the charts got dominated by SSD arrays. Nowadays it’s still listed at number 7, as the highest-ranked disk-based array, so not too shabby (if you’re into numbers).
Before we can fully comprehend the workings of the Stretched Cluster, we’ll have to understand how the SVC works. The SAN Volume Controller provides a virtualization layer that you put on top of your storage. The SVC contains all the intelligence: functions like tiering (Easy Tier), compression, snapshots (FlashCopy), mirroring and storage pool migrations (basically Storage vMotion at the SVC level). The storage arrays underneath the SVC are dumb and only provide RAID sets. A wide variety of IBM and non-IBM arrays is supported by the SVC.
The SVC is a symmetric active/active array, supports ALUA (vSphere 5.5) and is based on IBM System x hardware. The next-generation SVC/V7000 was recently announced. They were finally able to get rid of that nasty UPS, sweet!
Let’s dive into some SVC hardware specs (model 2145-CG8). Each individual 1U server is called a node and contains one quad-core Intel Xeon 5600 series processor, 24 GB of memory and 4x 8 Gb Fibre Channel ports. There’s a CPU and memory expansion kit available, but the extra hardware will only be used for the (licensed) Real-time Compression feature. Another optional expansion is a 10 GbE card, but keep in mind the SVC isn’t listed in VMware’s HCL for iSCSI or FCoE vMSC. It will probably ‘work’ (I’ve tested iSCSI), but it’s officially unsupported.
An SVC cluster contains a minimum of 2 and a maximum of 8 nodes. Nodes are always paired; a pair forms an I/O group, so a cluster can contain a maximum of 4 I/O groups. Got it?
Each LUN is owned by one I/O group. In some specific cases (Windows, Linux) you are able to transfer LUNs between I/O groups nondisruptively; however, this isn’t supported with vSphere. Also, you could pass through I/O group X to reach a LUN owned by I/O group Y, but this undoubtedly carries a performance penalty.
Each LUN has a preferred node and a non-preferred node; a node can fulfill both roles for different LUNs. LUNs are divided evenly among the nodes.
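To make the node/I/O group/LUN relationship concrete, here's a toy Python model. The class, names and the alternating assignment are my own illustration of the layout described above, not SVC internals:

```python
# Toy model: paired nodes form an I/O group; each LUN is owned by one
# I/O group and has one node of the pair as its preferred node. The
# alternating (round-robin) assignment illustrates "LUNs are divided
# evenly among the nodes" and is an assumption for this sketch.

from dataclasses import dataclass, field
from typing import List

@dataclass
class IOGroup:
    name: str
    nodes: List[str]                              # always a pair of nodes
    luns: List[str] = field(default_factory=list)

    def preferred_node(self, lun: str) -> str:
        # Alternate preferred ownership so LUNs spread evenly over the pair.
        return self.nodes[self.luns.index(lun) % 2]

    def non_preferred_node(self, lun: str) -> str:
        return self.nodes[(self.luns.index(lun) + 1) % 2]

iogrp = IOGroup("io_grp0", ["node1", "node2"])
iogrp.luns.extend(["LUN0", "LUN1", "LUN2", "LUN3"])

for lun in iogrp.luns:
    print(lun, "preferred:", iogrp.preferred_node(lun))
```

Note how `node1` is the preferred node for half the LUNs and the non-preferred node for the other half, exactly the dual role mentioned above.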
Write I/O is synced between the nodes. As soon as the I/O is safely in cache on the partner node, an ACK is sent to the host. The preferred node is subsequently responsible for destaging the cache to the disk subsystem. Read cache is always node-local, so it’s preferable to consistently use the preferred node, to maintain a high cache hit ratio. Here’s an example of a write I/O flow.
- Host sends write I/O to SVC
- Node sends I/O to partner node
- Partner node sends ACK
- Node sends ACK to host
- Preferred node for the LUN destages the I/O to disk
Keep in mind that if one node fails, the caching mechanism on the remaining node will go into write-through mode.
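The ordering of those steps is the important part: the host is acknowledged once the write is mirrored to the partner's cache, and destaging happens afterwards; lose the partner and the ACK waits for disk instead. Here's a minimal sketch of that ordering (function and event names are my own, purely illustrative):

```python
# Sketch of the write-ACK ordering: ACK follows the cache mirror, not the
# disk write; destaging is done later by the preferred node. With the
# partner node down, the cache falls back to write-through and the write
# must reach disk before the host is acknowledged.

def handle_write(data, preferred_cache, partner_cache, disk, partner_up=True):
    events = []
    preferred_cache.append(data)
    events.append("cached on preferred node")
    if partner_up:
        partner_cache.append(data)            # mirror to partner cache
        events.append("mirrored to partner node")
        events.append("ACK sent to host")     # host acknowledged here
        disk.append(preferred_cache.pop())    # destaged later, asynchronously
        partner_cache.pop()
        events.append("destaged to disk")
    else:
        disk.append(preferred_cache.pop())    # write-through: disk first
        events.append("write-through to disk")
        events.append("ACK sent to host")     # only after the disk write
    return events

print(handle_write("io1", [], [], [], partner_up=True))
print(handle_write("io2", [], [], [], partner_up=False))
```

The second run shows why a node failure hurts write latency: in write-through mode the ACK is gated on the disk subsystem instead of on cache.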
That’s all for now. In PART 2 we’ll have a closer look at implementing the SVC in a split I/O group setup (required for Stretched Cluster), the quorum site and actual volume mirroring.