Cost Reduction for Home Lab… Instant clone ESXi for vSAN testing :)

Since the release of VMware vSAN (5.5), testing it in your lab has required three hosts. With version 6.1, the ROBO deployment option lets you run vSAN on just two nodes. Some of my friends actually buy physical hosts for their labs. Of course you can use VMs as your vSAN hosts, but the constraint is usually the amount of memory in your lab. My friends thus upgraded from Intel NUCs, which maxed out at 16GB, to SuperMicro servers which support 128GB. Well, I would love to have that too, but I have a limited budget and limited room at home for such a lab. That is why I had to find my own way to test vSAN, and what I want to test is not ROBO or a 3-node cluster. I want to deploy a production-ready vSAN, which I think should be composed of 4 nodes or more. So, my solution is to leverage instant clone.

Instant Clone

Instant Clone technology, code-named project Fargo, is a cloning technology introduced in vSphere 6.0. It is a hidden feature that has not been widely used, since you cannot access it from the GUI (vSphere C# client or Web Client). But the beauty of this cloning technology is that it saves a lot of memory by using a Copy-on-Write (COW)-like mechanism on memory. So theoretically, when you clone 100 child VMs from a parent VM with 4GB of memory through Instant Clone, if no memory pages change in the child VMs, you only need the same 4GB of memory (compared with 404GB).
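To make that memory math concrete, here is a tiny arithmetic sketch (Python, illustrative only) of the best case where no child ever writes to its memory:

```python
# Best-case memory footprint: full clones vs instant clones (illustrative only)
parent_gb = 4
child_count = 100

# Full clones: the parent plus every child carries its own copy of the 4GB
full_clone_total = parent_gb * (1 + child_count)

# Instant clones with COW: all VMs share the parent's pages until a child
# writes to one, so the ideal footprint stays at the parent's 4GB
instant_clone_total = parent_gb

print(full_clone_total, instant_clone_total)  # 404 4
```

In practice the children do dirty some pages, so real usage lands somewhere between the two numbers.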

And this is why I would like to try cloning nested ESXi hosts in my lab to test vSAN! Before that, I need to clarify that this feature is officially supported and used only in selected solutions, such as Horizon View 7, Big Data Extensions and vSphere Integrated Containers. This is because, as we are using COW on memory, it may not be suitable for traditional workloads that need to be restarted at the OS level now and then.

I have a vCenter 6.5 in my lab and a physical host equipped with 32GB of memory, which I would like to leverage for building my vSAN cluster. I will run 7 nested ESXi hosts on it in total (counting the parent VM in blue), with vSAN 6.2 enabled on the 6 child ESXi hosts.

The following image illustrates the simple architecture I would like to build:

Getting Started

As mentioned, since there is no GUI for performing an instant clone, we have to leverage a VMware Fling tool for it. You can find more information on the VMware Blog; the tool we use, which can be downloaded from HERE, is actually an additional module for PowerCLI. As prerequisites for forking out an instant clone VM, you need the following in your environment:

  1. vSphere 6.0 or later
  2. PowerCLI 6.0 or later

By the way, I'm not the first person to try instant cloning ESXi, and I was following William Lam's blog to do it. You can refer to his blog HERE, but perhaps due to version changes I had to edit some scripts to make them actually work in my environment. We will follow these steps to deploy it:

  1. Install Instant-Clone PowerCLI Module
  2. Deploy and Edit the Parent Nested ESXi VM
  3. Prepare the Parent Nested ESXi VM for Instant Clone
  4. Instant Clone 6 Child Nodes
  5. Connect Cloned ESXi to vCenter
  6. Configure the ESXi for VSAN Deployment
  7. Create VSAN Cluster

Install Instant-Clone PowerCLI Module

So let's start preparing your environment for the test. As instant clone is built into vCenter and vSphere, we don't need to change anything on the server side to enable it. Instead, we need to prepare the PowerCLI environment and install the Fling Instant Clone module. It is named "PowerCLI Extensions"; do download it HERE.

You can follow the instructions from the Fling, but I would rather use a simpler method. You can just:

  1. Download the zip file from the URL
  2. Unzip the Package
  3. Drag and drop the module "VMware.VimAutomation.Extensions" into the following directory:

C:\Program Files (x86)\VMware\Infrastructure\PowerCLI\Modules\
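With the module in place, a quick sanity check is to load it and list what it provides (a PowerShell sketch; the module name is taken from the Fling package above):

```powershell
# Load the Fling module and list the instant clone cmdlets it provides
Import-Module VMware.VimAutomation.Extensions
Get-Command -Module VMware.VimAutomation.Extensions
```

If `Enable-InstantCloneVM` shows up in the output, you are good to go.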

Your machine is now equipped with the instant clone cmdlets! Simple enough?

Deploy and Edit the Parent Nested ESXi VM

You can deploy your own new nested ESXi by simply setting up a new VM on top of an ESXi host. But an easier approach is to download one from William Lam's post HERE; he has prepared nested ESXi appliances for 5.5, 6.0 and 6.5. I will use the 6.0 Update 2 version for building my vSAN 6.2 cluster. I will not reproduce the steps for deploying the OVA file, as they are quite trivial. When running the instant clone script later, we need to refer to the name of this parent VM, so do record it, though you can always change it.

Do check the checkbox for enabling SSH; we need it.

But here is the key point: before you power on the nested ESXi, configure more memory, say 8GB, for the host. The default 6GB would make the vSAN enablement fail. The OVA prepared by William equips the nested ESXi with 3 disks: one for installing ESXi and the other two for vSAN. So we won't need to add hard disks to the cloned nested ESXi hosts later for creating the vSAN cluster.

Once this step is done, note that we don't need to add the nested ESXi parent VM into vCenter, because we are going to remove its networking information for instant cloning.

Prepare the Parent Nested ESXi VM for Instant Clone

After the parent VM has been powered on, do not change configurations such as the ESXi Shell and SSH settings. The network configuration should already be in place if the OVA deployment completed successfully.

Then we need to prepare the parent VM for instant cloning. William has prepared some sample scripts for instant cloning the ESXi VM, which you can download from GitHub. In this step, we just need the "prep-esxi60.sh" script.

As mentioned, prep-esxi60.sh prepares the parent VM for instant cloning. The script removes host-specific configuration such as networking and UUID information. To be specific, it disables the hostd daemon, unloads the network and disk drivers, and removes the VMkernel adapter. This is why I said you don't have to add the parent VM to vCenter: it would show as disconnected after running this script.

As I said, I amended some of the scripts to make instant cloning work in my environment, namely:

  1. vmfork-esxi60.ps1
  2. post-esxi60.sh

So you can just upload and run this "prep-esxi60.sh" on the parent VM without editing. Be aware that this only works for the ESXi 6.0 OVA; for ESXi 6.5 the module and the method for preparing the host are not quite the same.

You should see a similar screen after running the script. As vmk0 is removed, you will also notice that you can no longer ping the parent VM. Your parent VM is now prepared for instant clone.

Instant Clone 6 Child Nodes

Now we can instant clone the 6 child nodes from this prepared parent VM. This time you need vmfork-esxi60.ps1. As I said, I originally wanted to use the script directly after editing the file paths and vCenter connection information, but I could not get it to pass. It looks like the cmdlet has changed with a version update (of the Fling PowerCLI module), so the script has to be modified a bit. Please refer to my edited script below:

The green columns have to be changed for your environment. The red-circled ones are the lines you need to change if you are using the latest PowerCLI and the latest VMware Fling module.

Changing it from the original (William's version) one-line script:

$quiesceParentVM = Enable-InstantCloneVM -VM $parentvm -GuestUser $parentvm_username -GuestPassword $parentvm_password -PreQuiesceScript $precust_script -PostCloneScript $postcust_script -Confirm:$false

To two lines:

Enable-InstantCloneVM -VM "$parentvm" -GuestUser "$parentvm_username" -GuestPassword "$parentvm_password" -PreQuiesceScript "$precust_script" -PostCloneScript "$postcust_script" -Confirm:$false

$quiesceParentVM = Get-InstantCloneVM

From the script above you can see the "-PreQuiesceScript" and "-PostCloneScript" attributes; these two define what tasks are done before and after the instant clone. We keep "-PreQuiesceScript" running the "prep-esxi60.sh" written by William, while again I needed to change some lines in the "-PostCloneScript" post-esxi60.sh to make instant cloning work better.

I changed the following part of William's script:

# setups VMK0
localcli ${RESOURCE_GRP} network vswitch standard portgroup add -p "Management Network" -v "vSwitch0"
localcli ${RESOURCE_GRP} network ip interface add -i vmk0 -p "Management Network" -M ${mac}
localcli ${RESOURCE_GRP} network ip interface ipv4 set -i vmk0 -I ${ipaddress} -N ${netmask} -t static
localcli ${RESOURCE_GRP} system hostname set -f ${hostname}
localcli ${RESOURCE_GRP} network ip route ipv4 add -g ${gateway} -n default

To the following. William's script tries to recreate vmk0, but I found that although it worked, the interface did not take up a new MAC address, so inter-host network connections failed. Thus, I ignore vmk0 (which has already been deleted) and create a new VMkernel adapter (vmk1) instead.

# setups VMK1
localcli ${RESOURCE_GRP} network vswitch standard portgroup add -p "Management Network" -v "vSwitch0"
localcli ${RESOURCE_GRP} network ip interface add -i vmk1 -p "Management Network" -M ${mac}
localcli ${RESOURCE_GRP} network ip interface ipv4 set -i vmk1 -I ${ipaddress} -N ${netmask} -t static
localcli ${RESOURCE_GRP} system hostname set -f ${hostname}
localcli ${RESOURCE_GRP} network ip route ipv4 add -g ${gateway} -n default

After editing both scripts, open a PowerCLI console and run the edited script to instant clone the nested ESXi hosts. On successful cloning, you should see the following: 6 nested ESXi hosts cloned out and visible in the vSphere Client.

You should be able to ping all the nested ESXi hosts and even connect to configure them like normal ESXi hosts. The hostnames will be in place as well.

Connect Cloned ESXi to vCenter

Of course, we need to add the cloned nested ESXi hosts into vCenter for management. You can definitely use the GUI to create the cluster and add the hosts one by one, but I'm a bit lazy, so I use PowerCLI instead:

1..6 | foreach { $esxi = "vsanesx0$_"; Get-Cluster -Name VSAN | Add-VMHost -Name $esxi -Force -RunAsync -User root -Password abcd1234 }
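The `1..6 | foreach` pipeline simply expands into the six host names and adds each one; the naming part can be sketched in a plain shell loop (illustrative only, using the same names as above):

```shell
# Generate the six nested ESXi host names used by the PowerCLI one-liner above
for i in 1 2 3 4 5 6; do
  echo "vsanesx0$i"
done
# prints vsanesx01 through vsanesx06
```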

This adds all the hosts into vCenter. By the way, check the memory consumption of the 32GB ESXi host: we have 7 x 8GB nested ESXi hosts running on top, yet the memory usage is far from 56GB. Cool enough?

Configure the ESXi for VSAN Deployment

One last step before we can deploy vSAN: we need to enable vSAN traffic among the nested ESXi hosts. To keep things simple, I will reuse the management network for both vMotion and vSAN traffic, which I think is good enough for a POC. Again, you can use the GUI to do this, but I prefer the PowerCLI way:

Get-VMHost vsan* | Get-VMHostNetworkAdapter -Name vmk1 | Set-VMHostNetworkAdapter -VMotionEnabled $true -VsanTrafficEnabled $true -Confirm:$false

With that, everything is good to go for the vSAN deployment!

Create VSAN Cluster

This part is simple: just enable vSAN on the new VSAN cluster we created and connected the hosts to.
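If you prefer to stay in PowerCLI, enabling vSAN on the cluster can also be scripted (a sketch; the `-VsanEnabled` and `-VsanDiskClaimMode` parameters of `Set-Cluster` are assumed to be available in your PowerCLI version):

```powershell
# Enable vSAN on the VSAN cluster with automatic disk claiming (sketch only)
Get-Cluster -Name VSAN | Set-Cluster -VsanEnabled:$true -VsanDiskClaimMode Automatic -Confirm:$false
```

With automatic claiming, vSAN picks up the two extra disks that came with the nested ESXi OVA.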

BOOM! VSAN 6.2 running on 6 Instant Clone ESXi Hosts.

Conclusion

Well, this is definitely not something supported. But again, testing vSAN does not have to cost $$$: I now have 6 ESXi hosts (actually I did a 9-node cluster too) running on a 32GB ESXi host. Many laptops and PCs already run with 32GB of memory these days, so you can also test this on your machine. I hope this is helpful for you!

P.S. Actually my 32GB Memory ESXi Host, is another Nested ESXi Host too 🙂
