VMware Virtual SAN (vSAN) Day 2 Operations – Scaling Your vSAN

The number of vSAN customer references has just passed 5,500. This is quite a milestone, and it shows how much attention hyper-converged infrastructure (HCI) and software-defined storage (SDS) are getting. There is definitely a range of options in the market, with different functionalities and architectures, but they all share the same pitch: start small and grow linearly in scale as more physical hosts are added to the environment. Sometimes, though, you may have multiple instances of HCI deployed in the environment and want to consolidate, move or wipe them. In this blog, I am going to walk through the steps and considerations you should take care of when performing those actions on VMware Virtual SAN (vSAN).


vSAN is radically easy: the functional modules are already inside the ESXi you installed. What you need on top is 1) a network for carrying vSAN traffic and 2) SSDs for the cache tier plus HDDs for the capacity tier. VMware offers vSAN Ready Nodes, EVO:RAIL and VxRail to simplify the deployment even more. Trust me, if you want to test it in a lab and you have every necessary prerequisite ready, it will not take you more than 15 minutes to set it up.

"Start small" is the slogan of many HCI products. Even so, when someone asks me about VMware Virtual SAN (vSAN), I definitely recommend four hosts as the bare minimum. Yes, you could say that 2-node, ROBO and even 3-node deployments are officially supported. But from a BAU point of view, a 4-node vSAN is way better than those, because it automatically rebuilds components (I mean the replicas) on the remaining nodes when one of the nodes goes down. For the other deployment options mentioned, recovery is not painful, but it does require manual rebuilding steps.

[Screenshot: the "Repair Objects Immediately" option]

However, reality is reality: some of my customers still prefer a 2-node or 3-node vSAN for trial, pilot or development purposes. Sooner or later they will either have to expand the vSAN once they are (more than likely) happy with it, or destroy the vSAN (sadly, but you still need to migrate your data out first).

Expanding a vSAN

When we are expanding a vSAN, there are of course two ways. First, scaling up by adding more disks to the existing servers; and second, scaling out by adding more nodes to the existing vSAN cluster. Usually, one scales up the vSAN before scaling out. Well, that should be legitimate, since disks are cheaper than a server node, right?

Scaling UP

So when you are scaling up your vSAN nodes, you can accomplish this by adding extra:

  1. Magnetic disks to a disk group (which increases capacity but may degrade performance, because it dilutes the cache-to-capacity ratio; see the sizing sketch after this list). You can add as many as seven magnetic disks to each disk group
  2. Disk groups to the vSAN cluster (which increases capacity and also the cache tier, so performance is maintained). Remember that vSAN assumes all disk groups are homogeneous, and all disk groups on the hosts contribute to one single vSAN datastore
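
To see why the cache-to-capacity ratio matters, here is a quick back-of-the-envelope check in Python. It assumes VMware's general sizing guideline that the cache tier should be at least 10% of the anticipated consumed capacity (applied to raw capacity here for simplicity); the disk sizes are made up for illustration.

```python
# Back-of-the-envelope vSAN cache sizing check for one hybrid disk group.
# Assumption: cache should be >= 10% of capacity (VMware's rule of thumb,
# applied to raw capacity here for simplicity).

def cache_ratio(cache_gb: float, capacity_disks: int, disk_gb: float) -> float:
    """Cache:capacity ratio of a single disk group."""
    return cache_gb / (capacity_disks * disk_gb)

SSD_GB = 400      # one 400 GB cache SSD per disk group (illustrative)
HDD_GB = 1200     # 1.2 TB magnetic capacity disks (illustrative)

for disks in (3, 5, 7):   # growing the disk group towards the 7-disk maximum
    ratio = cache_ratio(SSD_GB, disks, HDD_GB)
    verdict = "OK" if ratio >= 0.10 else "below the 10% guideline"
    print(f"{disks} capacity disks -> cache ratio {ratio:.1%} ({verdict})")
```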

Again, because vSAN is designed to be simple and compatible with a wide range of hardware, you will not actually be warned if you keep adding magnetic disks while performance degrades, and you also will not be warned if you mix an all-flash disk group with a slower hybrid disk group, which definitely causes mismatched performance among the data blocks inside one vSAN datastore.

The steps to upgrade the capacity tier are trivial: simply insert more disks into your server and claim the disks from the vSAN interface in the vSphere Web Client. You can see the effective capacity there too. It is not very difficult, and you can refer to the sample steps below.

Expand vSAN by Adding More Disks

  1. Perform the scaling-up tasks on all vSAN nodes one by one; for each node, carry out the following actions to increase the capacity (a scripted sketch follows this list)
  2. Put the vSAN node into Maintenance Mode
  3. Select "Ensure Accessibility" when prompted
  4. Shut down the host (skip this if your server supports hot plug)
  5. Add a new hard disk to the server
  6. Power on the host (skip this if your server supports hot plug)
  7. Claim the new disk into your existing disk group
  8. Exit Maintenance Mode
  9. Check the vSAN capacity to ensure the disk is managed by vSAN
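
If you prefer to script this against the vSphere API, here is a minimal pyVmomi sketch of steps 2, 3, 7 and 8: entering maintenance mode with the vSAN "Ensure Accessibility" decommission mode, then claiming an eligible unclaimed disk. The vCenter address, credentials and hostnames are placeholders, and the sketch assumes the new disk is already physically present and visible to ESXi.

```python
import ssl
import time
from pyVim.connect import SmartConnect
from pyVmomi import vim

def wait_for(task):
    """Poll a vCenter task until it finishes."""
    while task.info.state not in ("success", "error"):
        time.sleep(2)

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
host = si.content.searchIndex.FindByDnsName(None, "esxi01.lab.local", False)

# Steps 2 and 3: maintenance mode with "Ensure Accessibility"
spec = vim.host.MaintenanceSpec(vsanMode=vim.vsan.host.DecommissionMode(
    objectAction="ensureObjectAccessibility"))
wait_for(host.EnterMaintenanceMode_Task(timeout=0, maintenanceSpec=spec))

# Step 7: claim the first eligible unclaimed disk; vSAN places it
# into an existing disk group on this host
vsan = host.configManager.vsanSystem
eligible = [r.disk for r in vsan.QueryDisksForVsan() if r.state == "eligible"]
if eligible:
    wait_for(vsan.AddDisks_Task(disk=[eligible[0]]))

# Step 8: exit maintenance mode
wait_for(host.ExitMaintenanceMode_Task(timeout=0))
```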

Expand vSAN by Adding a New Disk Group

  1. Perform the scaling-up tasks on all vSAN nodes one by one; for each node, carry out the following actions to increase the capacity
  2. Put the vSAN node into Maintenance Mode
  3. Select "Ensure Accessibility" when prompted
  4. Shut down the host (skip this if your server supports hot plug)
  5. Add the new hard disks, including the SSD and HDDs, to the server
  6. Power on the host (skip this if your server supports hot plug)
  7. Create a new disk group and claim the SSD and HDDs you added (see the sketch after this list)
  8. Exit Maintenance Mode
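
Step 7 can be scripted as well. A minimal pyVmomi sketch, reusing the connection, host object and wait_for helper from the previous example; it builds one new disk group from an eligible SSD and up to seven eligible magnetic disks:

```python
from pyVmomi import vim

vsan = host.configManager.vsanSystem
eligible = [r.disk for r in vsan.QueryDisksForVsan() if r.state == "eligible"]
ssds = [d for d in eligible if d.ssd]
hdds = [d for d in eligible if not d.ssd]

# A disk group is one cache SSD plus up to seven capacity disks
if ssds and hdds:
    mapping = vim.vsan.host.DiskMapping(ssd=ssds[0], nonSsd=hdds[:7])
    wait_for(vsan.InitializeDisks_Task(mapping=[mapping]))
```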

So what if you have already filled up all 5 (disk groups) x 7 (capacity tier disks) = 35 disks in your vSAN nodes? You can still scale up by replacing the old disks with new, larger ones. This is a bit more complicated, but it can still be done through the steps above: remove a disk with full data migration, then claim its bigger replacement.
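
One more note on the "effective capacity" you gain: raw capacity is not what your VMs can actually consume. Here is a purely illustrative calculation, assuming the default RAID-1 storage policy with FailuresToTolerate = 1 (every object is written twice) and roughly 30% of space kept free as slack for rebuilds and rebalancing; all the sizes are made up.

```python
# Rough usable-capacity estimate for a hybrid vSAN cluster.
# Assumptions: RAID-1 mirroring with FTT=1 (two copies of each object)
# and ~30% slack space reserved for rebuilds and rebalancing.

NODES = 4
DISK_GROUPS_PER_NODE = 2
CAPACITY_DISKS_PER_GROUP = 7
DISK_TB = 1.2

raw_tb = NODES * DISK_GROUPS_PER_NODE * CAPACITY_DISKS_PER_GROUP * DISK_TB
after_slack_tb = raw_tb * 0.70   # keep ~30% free as slack
usable_tb = after_slack_tb / 2   # FTT=1 mirroring halves the usable space

print(f"Raw capacity: {raw_tb:.1f} TB")
print(f"Usable with FTT=1 and 30% slack: {usable_tb:.1f} TB")
```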

Scaling OUT

If performance and data availability also matter when you are upgrading the vSAN capacity, then you should consider the scaling-out approach. The beauty of HCI is linear scalability, meaning that adding one more node/block to the environment not only scales the network and storage capacity but also scales the performance along the same line. Some vendors brand this as a "web-scale" characteristic; VMware did not define a special term for it. Scaling out a vSAN simply means adding a new vSAN node to the vSAN cluster. To make it a bit more interesting, I am expanding a vSAN cluster from a 2-node deployment to 3 nodes. The steps are as follows (a scripted sketch of steps 2 and 3 comes after the list):

  1. Install and set up a new ESXi host
  2. Add the ESXi host to the existing vSAN cluster
  3. Configure the vSAN network of the ESXi host
  4. Remove the vSAN Witness Appliance
  5. Remove the vSAN fault domains
  6. Resynchronise the vSAN objects from vSAN Health
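
Steps 2 and 3 are plain vSphere API calls (removing the witness and fault domains is easier through the Web Client wizard). A minimal pyVmomi sketch with placeholder names, reusing the si connection and wait_for helper from the earlier sketches; it joins the freshly installed host to the cluster and tags an existing VMkernel interface for vSAN traffic:

```python
from pyVmomi import vim

cluster = si.content.searchIndex.FindByInventoryPath("DC/host/vSAN-Cluster")

# Step 2: join the new host to the vSAN cluster; in a real environment
# supply the host's SSL thumbprint instead of force-adding it
spec = vim.host.ConnectSpec(hostName="esxi03.lab.local", userName="root",
                            password="password", force=True)
wait_for(cluster.AddHost_Task(spec=spec, asConnected=True))

# Step 3: tag vmk1 (assumed to sit on the vSAN subnet already)
# for vSAN traffic on the new host
host = si.content.searchIndex.FindByDnsName(None, "esxi03.lab.local", False)
host.configManager.virtualNicManager.SelectVnicForNicType("vsan", "vmk1")
```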

Shrinking a vSAN

This is just the opposite of the section above: here we are reducing a vSAN deployment. I certainly do not wish this on you, but there are still valid reasons for doing it, e.g. when you want to increase the vSAN capacity by removing the old, small disks from the existing cluster to make room for bigger ones. The following captures the actions we are going to perform.

Scaling Down

Shrink vSAN by Removing Disks

  1. Perform the scaling-down tasks on all vSAN nodes one by one; for each node, carry out the following actions to reduce the capacity (a scripted sketch follows this list)
  2. Put the vSAN node into Maintenance Mode
  3. Select "Ensure Accessibility" when prompted
  4. Remove the disk from your existing vSAN disk group
  5. Select "Full Data Migration" when prompted
  6. Shut down the host (skip this if your server supports hot plug)
  7. Remove the unclaimed disk from the server
  8. Power on the host (skip this if your server supports hot plug)
  9. Exit Maintenance Mode
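
Steps 4 and 5 map to a single API call. A minimal pyVmomi sketch, reusing the connected host object and wait_for helper from the earlier examples; the disk's canonical name is a placeholder, and the "evacuateAllData" decommission mode corresponds to the "Full Data Migration" prompt:

```python
from pyVmomi import vim

vsan = host.configManager.vsanSystem

# Locate the capacity disk to retire by its canonical name (placeholder)
target = None
for mapping in vsan.config.storageInfo.diskMapping:
    for disk in mapping.nonSsd:
        if disk.canonicalName == "naa.5000c5008c0d4e2f":
            target = disk

# Steps 4 and 5: remove the disk with "Full Data Migration"
spec = vim.host.MaintenanceSpec(vsanMode=vim.vsan.host.DecommissionMode(
    objectAction="evacuateAllData"))
if target:
    # the evacuation can take a while on a well-filled disk
    wait_for(vsan.RemoveDisk_Task(disk=[target], maintenanceSpec=spec))
```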

Shrink vSAN by Removing Disk Groups

  1. Perform the scaling-down tasks on all vSAN nodes one by one; for each node, carry out the following actions to reduce the capacity (see the sketch after this list)
  2. Put the vSAN node into Maintenance Mode
  3. Select "Ensure Accessibility" when prompted
  4. Remove the whole disk group from your existing vSAN cluster
  5. Select "Full Data Migration" when prompted
  6. Shut down the host (skip this if your server supports hot plug)
  7. Remove the disks of that disk group from the server
  8. Power on the host (skip this if your server supports hot plug)
  9. Exit Maintenance Mode
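
Removing a whole disk group is almost identical in the API; the call takes the disk mapping instead of individual disks. A minimal sketch under the same assumptions as above:

```python
from pyVmomi import vim

vsan = host.configManager.vsanSystem
mappings = vsan.config.storageInfo.diskMapping

spec = vim.host.MaintenanceSpec(vsanMode=vim.vsan.host.DecommissionMode(
    objectAction="evacuateAllData"))

# Retire the last disk group on this host with full data migration
if mappings:
    wait_for(vsan.RemoveDiskMapping_Task(mapping=[mappings[-1]],
                                         maintenanceSpec=spec))
```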

So effectively you will see the vSAN capacity shrink under the Monitor tab.

Scaling IN

Another way to shrink a vSAN is to retire and remove a vSAN node from the vSAN cluster. Again and again, this is not something I would like to see, but do remember that when capacity is reduced, performance is also degraded (more nodes, more performance).

Scaling in from 5 nodes to 4, or from 4 nodes to 3, is trivial: you just put the ESXi host into maintenance mode with Full Data Migration and then remove it from the cluster. So instead I would like to show the steps for downgrading a vSAN from 3 nodes to 2 (a scripted sketch of steps 4 and 5 follows the list):

  1. Install and set up a new vSAN Witness Appliance
  2. Join the vSAN Witness Appliance to the existing vSAN cluster
  3. Configure the vSAN network of the vSAN Witness Appliance
  4. Put the existing node you want to remove into maintenance mode with "Ensure Accessibility"
  5. Move the host out of the vSAN cluster and disable its vSAN network
  6. Reconfigure the vSAN to include the witness host
  7. Build the fault domains according to the wizard
  8. Resynchronise the vSAN objects to ensure vSAN health
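
The witness and fault domain steps (6 and 7) are best done through the wizard, since they rely on the vSAN stretched-cluster management APIs, but steps 4 and 5 are plain vSphere API calls. A minimal pyVmomi sketch with placeholder names, reusing the si connection and wait_for helper:

```python
from pyVmomi import vim

host = si.content.searchIndex.FindByDnsName(None, "esxi03.lab.local", False)

# Step 4: maintenance mode with "Ensure Accessibility"
spec = vim.host.MaintenanceSpec(vsanMode=vim.vsan.host.DecommissionMode(
    objectAction="ensureObjectAccessibility"))
wait_for(host.EnterMaintenanceMode_Task(timeout=0, maintenanceSpec=spec))

# Step 5: untag the vSAN VMkernel interface, then remove the host
# from the inventory (it must already be in maintenance mode)
host.configManager.virtualNicManager.DeselectVnicForNicType("vsan", "vmk1")
wait_for(host.Destroy_Task())
```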

Consolidating a vSAN

It also happens that some customers have two or more vSAN clusters deployed under one vSphere environment, so you may have to consolidate the smaller vSAN clusters into a bigger one. No worries, you actually know the steps already. Where? Just above this section, because:

Consolidating vSAN clusters = expanding one vSAN cluster + shrinking the other vSAN cluster

Of course, you need to Storage vMotion the VMs across the vSAN clusters during the consolidation process.
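
Here is a minimal pyVmomi sketch of that cross-cluster move, with placeholder inventory paths and hostnames. Because a vSAN datastore is only accessible within its own cluster, the relocation has to move compute (host and resource pool) and storage together:

```python
from pyVmomi import vim

# VM living on the cluster being shrunk (placeholder inventory path)
vm = si.content.searchIndex.FindByInventoryPath("DC/vm/app01")

# Any host in the target cluster; its vSAN datastore is the destination
target_host = si.content.searchIndex.FindByDnsName(None, "esxi01.lab.local",
                                                   False)
target_ds = next(ds for ds in target_host.datastore
                 if ds.summary.type == "vsan")

# Move compute and storage in one relocation
spec = vim.vm.RelocateSpec(datastore=target_ds, host=target_host,
                           pool=target_host.parent.resourcePool)
wait_for(vm.RelocateVM_Task(spec=spec))
```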

Removing a vSAN

Likewise, if you would like to completely remove a vSAN cluster, you can keep following the Shrinking a vSAN steps to reduce the size of the cluster step by step. You may ask why not just vMotion all the VMs out of the vSAN cluster and disable vSAN directly. Well, technically you can, but then you need to spend some time removing the vSAN disk partitions from the local disks if you want to reuse them as normal local datastores.
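
For completeness, here is a minimal pyVmomi sketch of that "disable vSAN directly" path, to be run only after every VM has been migrated off the vsanDatastore. Clearing the leftover disk partitions afterwards is a separate per-host task (for example from the Web Client or esxcli):

```python
from pyVmomi import vim

cluster = si.content.searchIndex.FindByInventoryPath("DC/host/vSAN-Cluster")

# Turn vSAN off at the cluster level; the vsanDatastore disappears
# once the reconfiguration completes
spec = vim.cluster.ConfigSpecEx(
    vsanConfig=vim.vsan.cluster.ConfigInfo(enabled=False))
wait_for(cluster.ReconfigureComputeResource_Task(spec=spec, modify=True))
```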

So I think I have covered quite a lot on scaling your vSAN cluster, and I hope this is helpful for you.

 

