vRealize Operations Manager, Who Needs It?

In recent years, virtual machines have appeared on top of physical hosts, virtual LUNs on top of physical storage, and virtual OSes and applications (okay, Docker containers) on top of VMs and physical hosts. We have even built virtual networks over physical ones. The trend is to stack abstraction layers one over another. If you have been exploring Cloud Native Application or container architectures like me, you may be dizzy from all the abstraction layers (well… at least I was).

containerarch31

But I am not going to talk about Cloud Native Application stuff in this blog. I mention the above because whenever a new abstraction layer appears, I see most people simply stop paying attention to the layer(s) underneath. Life is good as long as you don’t abuse or exhaust the underlying layer(s), e.g. you can always thin provision more and more until the underlying storage is used up. But if the underlying storage dies… none of the abstraction layers above it will stay alive.


And thus, in this blog, I want to stress the increasing importance of monitoring as we build and stack abstraction layers. This is fairly obvious: in the age of physical machines, networks and storage, monitoring was comparatively trivial. When the VM age came, monitoring the VM abstraction layer on top of the physical layer became important, which is why solutions like vFoglight, TurboVM, etc. came out. The story continues when one adopts virtualised storage, network, OS and applications: you need corresponding tools for monitoring those too. It is definitely good to have one monitoring tool keeping an eye on every abstraction layer and the components inside. But this leads to an explosion of log messages and alerts, which can be unorganised and messy. So what do you mostly do? Well… I can sympathise with the approach taken by many of my customers, which is to:

  1. Drop the monitoring tools
  2. Set up an email rule to delete all monitoring alerts

Either way, we just ignore every warning and alert message in the environment. Of course everyone knows this is not proper, but any system administrator or operator who receives thousands of such messages every day will eventually be forced to do so.

all-the-things

This is why I think how good a monitoring tool is should not be defined by how many things it can receive messages from, but by how useful every single message it sends out is. Of course, some tools in the field are highly flexible, which means you can, and thus have to, do some customisation and development before they give you meaningful alert messages. Other tools provide out-of-the-box intelligence and extend through plugins. vRealize Operations Manager is one of the latter type.

vRealize Operations Manager (vROps)

This is, of course, the default monitoring tool VMware recommends for vSphere environments. People usually compare vROps with SolarWinds, TurboVM, vFoglight or, lately, Elasticsearch, but I think we are better off understanding the nature and objectives of vROps than doing a technical feature-by-feature comparison. To make it explicit, from field experience, if you are looking for:

  1. Real time monitoring (I mean <1 minute interval)
  2. Unstructured logs monitoring (You know, Splunk stuffs)

Then you can skip this blog post and vRealize Operations Manager, because vROps is NOT designed to handle the above requirements. If you really need them, you will need other tools, say VMware Log Insight for unstructured data analysis. What vROps does, or wants to do, for you is the following instead:

  1. Alerts Reduction
  2. Proactive Environment Monitoring
  3. Capacity Planning
  4. Environment Reporting

Well, the BAU tasks above are tedious, yet they actually consume most of my customers’ time in their normal daily work. For a system administrator or operator, the most valuable tasks are not fixing issues or following up service requests. Their talent should go into planning and deploying future projects that fulfil user requirements, which is more revenue driven and therefore more valuable. So if you belong to one of the categories below, you need vROps to help you out:

  • Category 1: Furious vSphere User, tired of using vSphere Client for troubleshooting
  • Category 2: Panicked vSphere User, scared of unexpected issues happening in the environment
  • Category 3: Confused vSphere User, lost in reading performance metrics in vSphere Client for capacity planning
  • Category 4: Tired vSphere User, worn out from copying and pasting VM information into daily/weekly/monthly reports

matrix-morpheus

Yes, if you are one of the tired, scared or confused users of vSphere, you need vROps. It can offload and solve a lot of the problems you face day by day. So instead of listing all the functions and features one by one, which you can already read in a lot of blogs, I would like to illustrate how vROps can help a system administrator every day.

Give Meaningful Health Status And Alerts

First of all, a quite routine daily task for a system administrator or operator is health checking. I have blogged before about a daily health check method using RvTools and vCheck PowerCLI scripts. Here, though, we are talking about more ad-hoc, close to real time health status monitoring.

I don’t know if you agree, but most operations people I have met do NOT drill down or deep dive into an environment to dig out problems as a daily routine. That is not something expected to be done daily anyway. Instead, what they need is something comprehensive yet simple: an overview of the environment they are monitoring. This does not require lots of metrics, messages or logs. On the contrary, it needs holistic diagrams, charts or maps that let them spot abnormalities and issues in the environment.

Moreover, the monitoring pattern has to change from reactive to proactive, in order to prevent things from happening and to discover problems before your end users tell you, which is what usually happens today. This is why the analytic components in a monitoring tool have to be intelligent enough to discover future issues. By the way, do not think of this as a Minority Report feature. Think of it as a stock market analytics tool: we just need a mathematical formula to calculate what a performance metric is expected to be, and if the actual metric differs from it by too much, that indicates a possible issue. I think the band indicators in stock price tools work in the same way.
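To make the stock band analogy concrete, here is a minimal sketch in Python. It is only an illustration of the idea and NOT the algorithm vROps actually uses: it assumes the expected range of a metric is the rolling mean of the previous few samples plus or minus a multiple of their standard deviation. The function name, window size, multiplier and sample values are all made up for the example.

```python
from statistics import mean, stdev

def band_anomalies(samples, window=5, k=2.0):
    """Flag samples that fall outside a rolling mean +/- k * stdev band.

    The band for each sample is computed from the `window` samples
    before it, so a spike cannot widen its own band.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        centre = mean(history)
        spread = stdev(history)
        lower, upper = centre - k * spread, centre + k * spread
        if not lower <= samples[i] <= upper:
            anomalies.append((i, samples[i], lower, upper))
    return anomalies

# Hypothetical CPU usage (%) collected at a fixed interval.
cpu_usage = [41, 43, 42, 40, 44, 42, 43, 41, 78, 42]
flagged = band_anomalies(cpu_usage)
print([(index, value) for index, value, _, _ in flagged])  # [(8, 78)]
```

Only the jump to 78% is flagged, while the normal wobble between 40% and 44% stays inside the band. A fixed threshold of, say, 80% would have missed this spike entirely.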

vROps is one of the tools that give you health status based on the above design principles. This was a unique feature back when vROps was still called vCOps (vCenter Operations Manager), i.e. a long, long time ago. So what makes vROps still unique now? Two things:

  1. A large ecosystem for extensibility
  2. Out-Of-The-Box Configuration

These features make monitoring ridiculously simple; everything is straightforward right from the initial deployment. The following are the high level (and detailed) steps:

  1. Download the OVA/OVF based vROps virtual appliance from vmware.com
  2. Import it into your vCenter
  3. Power it on
  4. Run the initial wizard to set up vROps (5 pages in total)
  5. Add the vCenter Entry you want to monitor
  6. Start Monitoring

With VMware’s own knowledge of VMware products built in, VMware vRealize Operations Manager (vROps) provides recommended threshold definitions, resolution suggestions and resource optimisation indices, so in most cases you don’t have to spend time defining your own monitoring policies. You also don’t have to redefine or reconfigure anything when you add another ESXi host to your vCenter, or add or remove a workload in your environment.

screen-shot-2016-10-28-at-9-58-27-am

Provide End to End Visibility

So what if you also want to monitor other things like hardware storage, network or applications running in the guest OS? As said, vROps can be extended with management packs to cover those, and the setup is also intuitively easy. But I would like to stress that it is not enough to monitor those as separate objects. vROps has the intelligence to link physical components with virtual objects and show you the relationships among them. This gives you better visibility of your environment, say if you want to know:

  1. Which VMs will be affected if a storage port fails
  2. What the capacity usage is of a thin provisioned LUN backing your datastores

Well, these can be done without vROps, but it is very tedious and could cost you a lot of time to figure out the mapping. With vROps, the mapping is simply displayed in a single dashboard.
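To give a feel for what “figuring out the mapping” means, here is a small Python sketch. It assumes you have already collected the relationships by hand into a dictionary; every object name below is hypothetical, and the sketch ignores multipathing, so it really answers “which VMs sit behind this port”, not “which VMs will definitely go down”.

```python
from collections import deque

# Hypothetical inventory: each key supports the objects listed as its values.
SUPPORTS = {
    "storage-port-A": ["lun-01", "lun-02"],
    "storage-port-B": ["lun-03"],
    "lun-01": ["datastore-gold"],
    "lun-02": ["datastore-silver"],
    "lun-03": ["datastore-bronze"],
    "datastore-gold": ["vm-sql01", "vm-app01"],
    "datastore-silver": ["vm-web01"],
    "datastore-bronze": ["vm-test01"],
}

def affected_vms(failed_object, supports=SUPPORTS):
    """Walk down the dependency map and return every VM behind failed_object."""
    seen = {failed_object}
    queue = deque([failed_object])
    while queue:
        current = queue.popleft()
        for dependant in supports.get(current, []):
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return sorted(name for name in seen if name.startswith("vm-"))

print(affected_vms("storage-port-A"))  # ['vm-app01', 'vm-sql01', 'vm-web01']
```

The walk itself is trivial. The tedious part is keeping a map like `SUPPORTS` accurate by hand as the environment changes, which is exactly the work a tool that discovers the relationships takes off your plate.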

screen-shot-2016-10-28-at-9-59-37-am

Scenario Based Capacity Planning

Many of my customers have their own capacity planning standard for their virtual environment, and most of those are quite static. Say they fix the CPU and memory consolidation ratios at magic numbers like 1:4 and 1:1.5. Technically, this is not good, as it can waste a lot of resources and it ignores other metrics such as storage and network capacity. To be specific, I mean the contention that arises when you keep adding hosts to the environment without upgrading the SHARED storage, fabric and network. vROps gives you a comprehensive capacity overview of all these factors, which really should be considered.

But I know most companies’ capacity planning policies won’t simply change because of that. Still, I would recommend at least using the scenario based capacity planning feature in vROps to visualise how the environment will be improved or impacted when you plan capacity for new projects that add or remove VMs, ESXi hosts or storage.
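For comparison, the static “magic number” approach boils down to arithmetic like the following Python sketch. The cluster size, current usage and per-VM sizing are hypothetical figures, and the what-if here is deliberately naive: it only knows about CPU and memory ratios, which is precisely the limitation described above.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    hosts: int
    cores_per_host: int
    mem_gb_per_host: int
    vcpu_ratio: float = 4.0   # vCPUs allowed per physical core (the 1:4 rule)
    mem_ratio: float = 1.5    # GB of vRAM allowed per GB of physical RAM (the 1:1.5 rule)

    def vcpu_capacity(self):
        return self.hosts * self.cores_per_host * self.vcpu_ratio

    def mem_capacity_gb(self):
        return self.hosts * self.mem_gb_per_host * self.mem_ratio

def headroom(cluster, used_vcpu, used_mem_gb, extra_vms=0, vcpu_per_vm=2, mem_gb_per_vm=8):
    """Return (vCPUs left, GB of vRAM left) after adding extra_vms new VMs."""
    vcpu_left = cluster.vcpu_capacity() - used_vcpu - extra_vms * vcpu_per_vm
    mem_left = cluster.mem_capacity_gb() - used_mem_gb - extra_vms * mem_gb_per_vm
    return vcpu_left, mem_left

cluster = Cluster(hosts=4, cores_per_host=16, mem_gb_per_host=256)
print(headroom(cluster, used_vcpu=180, used_mem_gb=1200))                # (76.0, 336.0)
print(headroom(cluster, used_vcpu=180, used_mem_gb=1200, extra_vms=30))  # (16.0, 96.0)
```

On paper, the project with 30 new VMs still “fits”. Nothing in this calculation tells you whether the shared storage, fabric or network can take the extra load, which is why a scenario view that includes those factors is worth having.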

screen-shot-2016-10-28-at-10-03-41-am

Environment Reporting

Last but not least, another clumsy task an administrator/operator has to perform is reporting. As a BAU team, we have to create reports on capacity, usage, compliance, efficiency, configuration… etc. every now and then. Again, this can be done without vROps, but you need some scripting skills, like running and customising the vCheck PowerCLI report, or SQL reporting skills to extract information directly from the database. If that sounds like too much effort, then you should again let vROps help you. Out of the box, there are some twenty reports you can leverage, and you get even more when you install 3rd party management packs. If these are still not enough, there is a very intuitive report customisation wizard that lets you create your own reports.

screen-shot-2016-10-28-at-10-01-39-am

Conclusion

So, to align with the topic of this blog: vRealize Operations Manager, who needs it? I think if you care about your environment and you want a useful monitoring tool back, you need it. If you care about what you are doing and you want to do things efficiently and keep your edge, you need it. Let vROps help you and win back your time!

one-does-not-simply


NSX in Real Life

NSX has been around for years and is commonly known as a network virtualization solution. Of course, many people think of it as an “enhanced” version of vCloud Networking and Security (VCNS), the solution VMware previously provided through the vShield stack. But it actually aims higher. Having used and deployed both VCNS and NSX, from my viewpoint they differ in that:

  1. vCloud Networking and Security (VCNS) is more of a VNF (Virtual Network Function) solution
  2. NSX instead aims to establish an NFVI (Network Function Virtualization Infrastructure)

So here come some terms, including VNF, NFV and NFVI, and you may also hear about SDN (Software Defined Networking) and NV (Network Virtualization). It is honestly very easy to mix these terms up, so I want to take the chance to explain them. None of these terms is defined by a vendor or tied to any product. They are instead defined by standards frameworks, and they should be understood through the following relationships:

  1. In an NV design, SDN covers the L2-L3 network while NFV covers the L3-L7 network functions
  2. NFVI is the platform set up for running VNFs, which in turn provide the NFV functions

The following diagrams visualise the above two statements.

SDN and NFV Relationship

vcd-and-nfv-overview-v0-1

NFV, VNF and NFVI Relationship

vcd-and-nfv-overview-v0-2

Remember again that the above definitions are NOT vendor specific: any solution that adheres to the framework and definitions mentioned above can contribute to this SDN + NFV + VNF + NFVI + NV ecosystem. VMware vSphere + VSAN + NSX is one of the solutions that aims to build the NFVI platform, which allows 3rd party solution providers to run their VNF solutions on top of it. Yes, you may notice that these VNF solutions have to be x86 VMs if you are running your NFVI on an x86 platform (which is currently the CPU architecture that makes the most sense). The idea of running VNF solutions on x86 is to avoid the high cost of ASIC based network equipment.

vcd-and-nfv-overview-v0-3

The NFVI platform highlighted in red is what VMware solutions can provide.

After talking a bit about the whole network virtualization framework, I would like to come back to NSX in particular. As mentioned in the previous vSAN blog, I am really pleased to see some actual deployments of NSX in my region, and honestly I think this one is a very successful use of NSX. Just like VSAN, NSX is definitely NOT designed to complicate your environment. As for how I define “successful use”, I am biased towards how well the solution fits the requirements of my customer, who wanted to:

  1. Gain ease in provisioning network functions to serve application users
  2. Create identical network environments for each application development stage
  3. Run the networks across two physical datacenters

All of this was for a new project with a really tight timeline and limited resources. So, to stay coherent with the topic of this blog, “NSX in Real Life”, let me illustrate this customer’s NSX journey and how NSX turned out to benefit them in real life daily use.

Design And Deployment

The first and foremost thing we had to do was to produce a proper NSX design. We already mentioned that “Multiple Overlapping IP Network Environments Across Two Datacenters” have to be built for the application deployment stages. Of course, there are many ways of doing this other than NSX, e.g. OTV, VPLS… etc. So why NSX? Because it sets a very low bar for physical network requirements and, more importantly, makes expanding and reconfiguring the networks easy.

nsx

The diagram above is a high level design I worked out for my customer. You can spot some more requirements and constraints that I didn’t detail before, including:

  1. Requirement of VDI deployment
  2. Constraint of using metro storage across two datacenters
  3. Two types of hosts, one for running SQL Servers and another for everything else

In order to implement a solution that fulfils all of the above, three clusters are used: one (1) management cluster and two (2) resource clusters. This allows a clean separation of workloads, so that management machines are securely isolated from the desktop workloads and user workloads running in the resource clusters. You should be aware that this is actually a best practice VMware recommends for VDI and cloud deployments.

Due to the deployment size, we don’t have the luxury of a separate VDI environment, so we co-locate the VDI workloads with the “Normal” workloads in Resource Cluster 2 but secure their resources with a dedicated resource pool. As all the vSphere clusters are deployed on top of the metro storage, all the workloads are protected across sites.

After handling all the compute considerations in the design, we should think about how the networks are to be designed. As mentioned, the user workloads running in the resource clusters have to run in overlapping networks. And we should definitely not segregate the networks between Resource Cluster 1, which runs the SQL Servers, and Resource Cluster 2, which runs the normal servers. This is why we set up one VDS (vds01) for the NSX networks across both resource clusters, while the DMZ networks run in a separate VDS (vds02) in Resource Cluster 2.

You can see how traditional networks and NSX networks are used together here. As always, it is not an “Either Or” choice between SDN and traditional networking; the two can be used together. Here, the traditional network extends the management, vMotion and DMZ networks, while the SDN builds the overlapping IP network environments, as in the following diagram:

vdi

We still need a non-overlapping network for the VDI jumpbox hosts, which let users access the different environments, i.e. the application stages. But this makes application deployment comparatively easy, since we don’t have to change anything (IP or hostname) when we migrate code from one environment to another.
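The addressing rule can be written down as a quick sanity check. The Python sketch below uses the standard `ipaddress` module; the stage names and subnets are hypothetical and are not the customer’s real addressing.

```python
import ipaddress

# Hypothetical addressing: every application stage reuses the same subnet
# behind its own logical network, while the jumpbox network stays unique.
STAGE_NETWORKS = {
    "dev": ipaddress.ip_network("10.10.0.0/24"),
    "sit": ipaddress.ip_network("10.10.0.0/24"),
    "uat": ipaddress.ip_network("10.10.0.0/24"),
}
JUMPBOX_NETWORK = ipaddress.ip_network("172.16.50.0/24")

def stages_identical(stages):
    """True when every stage uses exactly the same subnet."""
    return len(set(stages.values())) == 1

def jumpbox_conflicts(jumpbox, stages):
    """Names of the stages whose subnet overlaps the jumpbox network."""
    return [name for name, net in stages.items() if jumpbox.overlaps(net)]

print(stages_identical(STAGE_NETWORKS))                    # True
print(jumpbox_conflicts(JUMPBOX_NETWORK, STAGE_NETWORKS))  # []
```

Identical stage subnets are what let the same IPs and hostnames move from stage to stage untouched, and an empty conflict list means the jumpbox hosts can still tell the environments apart by which logical network they connect to.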

Daily Usage

After building the overlapping IP network environments, my customer deployed honestly a lot of VNFs (NSX Edges) on the NSX platform. It is very nice to see that they can now handle application user requirements very quickly. For example, a user may require:

  1. A Network Load Balancer for their Servers
  2. VPN Connection from Datacenter to Other environment
  3. L2 Connection from Datacenter to a Remote Office/ Branch Office

The following is a summary of the NSX usage in my customer’s environment. Again, I want to stress that their environment is not a very big one (only 12 ESXi hosts), but they comprehensively leverage NSX functions to fulfil most of the network requirements from their application users.

nsx-usage

Finally, many people worry that operation and monitoring will be difficult after adopting NSX. Well, to be fair, if you monitor an SDN network in the traditional way, it may take a lot of effort. But if you use a suitable way to monitor an SDN solution, you will see the magic and ease in monitoring. This is another real life example: my customer picked up vRealize Operations Manager and vRealize Log Insight to monitor the environment, and they can now troubleshoot promptly and save a lot of time in log diagnosis.


Actually, VMware has another new product named vRealize Network Insight, which provides even better and more comprehensive visibility of both the physical and virtual network environment. But what I want to state again is that there is no best tool for everything; instead, use the most suitable tool for your requirements.


vSAN 2 Nodes ROBO Setup

VMware Virtual SAN has become a hot topic lately. I see many bloggers comparing it with similar Hyper-Converged Infrastructure (HCI) solutions, e.g. Nutanix, at the performance level. Well, that is not my point in writing this blog. Instead of competing and arguing over which technology is better, I am personally more pleased that the whole HCI world is growing. SAN has done a great job over the past decade, but in this fast growing world I believe simplicity and scalability will be the key elements of an infrastructure.

screen-shot-2016-10-13-at-9-03-21-pm

To branch out a bit, I have a customer happily using VMware NSX in their environment. Trust me, they do not have a huge farm; to be precise, they only have 12 ESXi hosts. But their IT team gets a lot of on-demand requests from users for new networks, routing, firewalls, VPNs or load balancers. NSX helps them a great deal in fulfilling these requests with speed and ease. And when they expand their environment, well… they just add a new host to the same NSX cluster and it is all DONE.

diagrams-v1-6

Virtual SAN (VSAN) is developed with the same idea of simplifying the infrastructure. The same customer recently had a new use case: they are setting up a new remote office away from their datacenter. To keep things simple while maintaining the availability of the VMs, a VSAN ROBO setup is a very good fit for them. They immediately started the evaluation, and I was the one helping to design and configure it. As it turned out, it took me less than a day from setting up the hardware to migrating VMs onto it and running them. I would like to provide a very quick overview of the design and deployment for your reference, to show you that VSAN really is that simple.


Design:

When preparing a VSAN 2 Node ROBO deployment, as always you should of course try to study these two guides first: 1) Solution Overview, 2) Design Guide. But instead of making you go through so many lines, I would like to share a summary of the design blueprint as follows:

vsan-design-v2-1

To create a VSAN ROBO deployment, we need VSAN 6.1 onward. This means we need vCenter 6.0 Update 1 onward; in this blog, I am using vCenter 6.0 Update 2, which leverages VSAN 6.2. As per the design, we aim to create the VSAN 2 Node ROBO deployment at Remote Office 1. It is important that the two ESXi hosts are on the VSAN support list, which you can check from here. Afterwards, remember that you need a witness server at a remote location, for which you have two options:

  1. A dedicated physical ESXi host
  2. Witness VM (which is a nested ESXi Host)

For my deployment, and I think even more so in an actual deployment, the 2nd option is far more feasible in terms of cost and ease of deployment, and even offers higher availability, as a VM can always be protected by vSphere HA if it runs in a vCenter environment (which is actually recommended). A dedicated host, by contrast, means a more static configuration, higher cost, lower availability and a longer lead time for deployment.

In the network design, there are more considerations to be aware of to ensure the configuration follows VMware’s recommendations and is supported. Again, the Design Guide mentioned above provides a more detailed explanation of the different deployment approaches. But as a summary again, the following are the highlights:

  1. Non-Routed VSAN Network Between Two Physical Nodes
  2. Routed VSAN Network Between Witness VM and Two Physical Nodes
  3. Both Routed and Non-Routed management Network is Okay for All the VSAN Nodes
  4. MTU 9000 is NOT a must

While points 1, 3 and 4 are fairly trivial, point 2 is the tricky one. The VSAN network does NOT support a separate TCP/IP stack yet, which means it does NOT have its own default gateway. So, to make the VSAN networks routable among the nodes, we need to set up static routes on the ESXi hosts to direct the VSAN traffic.

esxcli network ip route ipv4 add --gateway IPv4_address_of_router --network IPv4_address

Do remember that this command creates a persistent route only on ESXi 5.1, 5.5 and 6.0 onward. This is fine for a VSAN ROBO deployment, which is only supported from 6.0 Update 1, but if you want to implement static routes on ESXi 5.0 hosts you will need some extra steps.
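Since the same kind of route has to be added on the two data nodes and on the witness, it can help to generate the command lines instead of typing them. The Python sketch below only builds the command string and sanity checks the addresses with the standard `ipaddress` module; it does not connect to any host. The helper name and the example subnets are assumptions for illustration.

```python
import ipaddress

def static_route_command(network, gateway):
    """Build the esxcli static route command for one destination network.

    Raises ValueError if either address is malformed, if the network has
    host bits set (e.g. 192.168.110.5/24), or if the IP versions differ.
    """
    net = ipaddress.ip_network(network)
    gw = ipaddress.ip_address(gateway)
    if net.version != gw.version:
        raise ValueError("network and gateway must use the same IP version")
    family = "ipv4" if net.version == 4 else "ipv6"
    return f"esxcli network ip route {family} add --gateway {gw} --network {net}"

# Hypothetical example: the data nodes reach the witness VSAN network
# 192.168.110.0/24 through the router at 192.168.15.1.
print(static_route_command("192.168.110.0/24", "192.168.15.1"))
# esxcli network ip route ipv4 add --gateway 192.168.15.1 --network 192.168.110.0/24
```

Remember that routing has to work in both directions, so the witness needs a matching route back to the VSAN network of the data nodes.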

Deployment:

After preparing the necessary networks and host configuration, the VSAN deployment itself is almost trivial. There is a step by step deployment guide in the Design Guide. Or, for simplicity, you can refer to the VMware blog or the following clip.

What I want to highlight is something after the deployment: you may see some error messages after configuring VSAN. Errors or alerts are usually raised because the VSAN health check tests run periodically by default. You can find more detail in the following tab.

vsan-health-retest

Usually, for a ROBO deployment, you will receive stretched cluster latency alerts. This is a bug in the health check utility, which is NOT aware of ROBO deployments. The network requirements VMware supports are as follows:

  1. Between VSAN Physical Nodes – 1G for Hybrid and 10G for All Flash, latency < 5ms RTT
  2. Between Witness Server to VSAN Nodes – 2Mbps and < 250ms RTT

But the current health check utility applies a much tighter latency requirement to point 2, which causes false alerts.

Also, if you are using newer hardware for your physical VSAN nodes, the HCL DB installed with vCenter 6.0 Update 2, which was released in March 2016, may not be up to date enough. If your vCenter has internet access, you can simply update it on the fly. Otherwise, which is more likely as your vCenter is probably offline, you have to download an updated JSON file according to the VMware KB and upload it manually. This should clear the HCL alerts you are getting.

So what’s next? I would recommend better monitoring of the VSAN nodes. This is definitely necessary, as your ESXi hosts are not just running VMs anymore; they also hold your data, which is definitely valuable. There are many blogs and white papers telling you how to monitor performance issues, but what I want to emphasise is ESXi health monitoring through logs. This can easily be done with VMware Log Insight. However, that is too much for this blog; I will create another dedicated blog for VMware Log Insight and VMware Operations Manager.
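If you want to judge your own links against the supported figures rather than against the health check alert, the comparison is simple enough to script. The Python sketch below hard-codes the limits quoted in the list above; the measured values passed in are hypothetical, and how you measure them (ping, vmkping, a monitoring tool) is up to you.

```python
# Limits quoted in the requirement list above.
DATA_NODE_MAX_RTT_MS = 5          # between the two physical VSAN nodes
WITNESS_MAX_RTT_MS = 250          # between the witness and the VSAN nodes
WITNESS_MIN_BANDWIDTH_MBPS = 2    # between the witness and the VSAN nodes

def check_robo_links(node_rtt_ms, witness_rtt_ms, witness_bandwidth_mbps):
    """Return a list of human readable problems; an empty list means OK."""
    problems = []
    if node_rtt_ms >= DATA_NODE_MAX_RTT_MS:
        problems.append(f"node-to-node RTT {node_rtt_ms} ms is not below {DATA_NODE_MAX_RTT_MS} ms")
    if witness_rtt_ms >= WITNESS_MAX_RTT_MS:
        problems.append(f"witness RTT {witness_rtt_ms} ms is not below {WITNESS_MAX_RTT_MS} ms")
    if witness_bandwidth_mbps < WITNESS_MIN_BANDWIDTH_MBPS:
        problems.append(f"witness bandwidth {witness_bandwidth_mbps} Mbps is below {WITNESS_MIN_BANDWIDTH_MBPS} Mbps")
    return problems

# Hypothetical measurements: a witness 120 ms away is within the supported
# range even if the health check raises a latency alert for it.
print(check_robo_links(node_rtt_ms=0.4, witness_rtt_ms=120, witness_bandwidth_mbps=10))  # []
```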


vSphere Daily Environment Checking

vSphere is a renowned hypervisor platform and needs no explanation. Some time ago (about 2 years), VMware started building management stacks on top of vSphere to make sure vSphere stays manageable and usable. Of course, vRealize Operations Manager (formerly vCenter Operations Manager) is a perfect tool for capacity planning, holistic monitoring and reporting for your vSphere environment. The following is the home page of Operations Manager, in case you have never come across it.

screen-shot-2016-10-13-at-9-29-23-am-2

This is a separately licensed product which is definitely worth it. Yet this blog post is not focusing on that product; we are doing something more down and dirty. This was driven by one of my customers who wanted a daily health check of their vSphere environment. Instead of a comprehensive monitoring solution, what they asked for was a configuration and status dump every day. So, with the help of VMware PowerCLI and RvTools, I implemented a sort of daily report for their environment. I believe this may be useful for many of you, so let me share the steps for implementing it.

Let’s have a look at the result first. I actually provide the customer with two dumps for each of the vCenter Servers in their environment. One is a configuration dump and the other a status dump, and they are provided by:

  1. RvTools – Configuration Dump
  2. vCheck Reports – PowerCLI based custom report for Point of Time Status Dump

Pre-requisites:

We need one Windows machine for running both of the tools mentioned above. If you are using a Windows based vCenter Server, you can actually use that server itself to host this health check mechanism. But if you have multiple vCenter Server instances in the environment, a dedicated Windows server could be more useful. The following are the two free tools we need:

RvTools from http://www.robware.net

screen-shot-2016-10-13-at-10-03-50-am-2

PowerCLI from VMware WebSite

screen-shot-2016-10-13-at-10-03-52-am-2

Under the hood, what I did is simple: I set up scripts for running the RvTools export and the PowerCLI based vCheck report, and kicked off the scripts with Windows Scheduled Tasks.

Procedures:

After setting up the pre-requisites, we can start setting up the scripts and scheduled jobs as follows:

Configuration Dump Scripts

If you have used RvTools before, you know it is a very simple but useful tool for dumping out the environment configuration. It can capture VM, host, virtual switch, port group, datastore… etc. objects. The configuration dumps are detailed enough even for the event that you need to rebuild something from scratch.

021111_0307_afreetoolr2

The screen above shows the GUI output of RvTools, but it can also be exported into either CSV or Excel format. RvTools also supports command line execution rather than being GUI driven only, and this is the way I export the configuration dump daily. All you have to do is save the following script as a batch file and run it with Windows Scheduled Tasks.

rem =====================================
rem Include robware/rvtools in searchpath
rem =====================================
set path=%path%;c:\program files (x86)\robware\rvtools

rem =========================
rem Set environment variables
rem =========================
set $VCServer=<The IP/Hostname of Your vCenter>
set $SMTPserver=<The IP/Hostname of Your Mail Server>
set $SMTPport=25
set $Mailto=<somebody@yourdomain>
set $Mailfrom=<somebody@yourdomain>
set $Mailsubject=RvTools Daily Export
rem Build YYYYMMDD; the offsets assume a US-style DATE value such as "Thu 10/13/2016"
set $MYDATE=%DATE:~10,4%%DATE:~4,2%%DATE:~7,2%
set $AttachmentDir=C:\RvTools
set $AttachmentFile=<File Prefix>_%$MYDATE%.xls
rem ===================
rem Start RVTools batch
rem ===================
rvtools.exe -u <vCenter Username> -p <vCenter Password> -s %$VCServer% -c ExportAll2xls -d "%$AttachmentDir%" -f "%$AttachmentFile%"

rem =========
rem Send mail
rem =========
rvtoolssendmail.exe /smtpserver %$SMTPserver% /smtpport %$SMTPport% /mailto %$Mailto% /mailfrom %$Mailfrom% /mailsubject "%$Mailsubject%" /attachment "%$AttachmentDir%\%$AttachmentFile%"

rem =====================
rem Move Files to Archive
rem =====================
if not exist "%$AttachmentDir%\Archive" mkdir "%$AttachmentDir%\Archive"
move "%$AttachmentDir%\%$AttachmentFile%" "%$AttachmentDir%\Archive"

The script above generates an Excel based export and also sends it out as an attachment to the target recipient. By setting up a corresponding Windows Scheduled Task, we get daily reporting. The following is a sample setup.
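One thing to watch in the batch script is the date stamp: slicing the DATE variable only yields YYYYMMDD when the Windows regional format is the US one, so on a server with different regional settings the file name comes out scrambled. If that bites you, the stamp can be generated independently of locale. The Python sketch below only shows the naming logic and is not part of my actual setup; the prefix is a hypothetical example.

```python
from datetime import date

def export_filename(prefix, day=None, extension="xls"):
    """Return '<prefix>_YYYYMMDD.<extension>' regardless of OS locale."""
    day = day or date.today()
    return f"{prefix}_{day:%Y%m%d}.{extension}"

print(export_filename("vcenter01", date(2016, 10, 13)))  # vcenter01_20161013.xls
```

A YYYYMMDD stamp also keeps the files in the Archive folder sorted chronologically by name.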

screen-shot-2016-10-13-at-3-17-04-pm-2

screen-shot-2016-10-13-at-3-17-06-pm-2

screen-shot-2016-10-13-at-3-17-08-pm-2

Environment Status Dump Scripts

Besides a configuration dump, more often than not, customers would also like a status dump reporting things like snapshot usage, host usage status, NTP status… etc. This information is not retrievable from RvTools as mentioned above. Not to mention, we cannot get a nicely formatted report out of RvTools. This is why, besides the configuration dump above, I also created another scheduled task for running a PowerCLI based report script daily. The following is a sample output of the vCheck report we leveraged. It is developed and shared by VMware at this link.

vCheck619

It is actually a bunch of scripts developed in PowerCLI that export a lot of information, including the following:

  • General Details
  • Number of Hosts
  • Number of VMs
  • Number of Templates
  • Number of Clusters
  • Number of Datastores
  • Number of Active VMs
  • Number of Inactive VMs
  • Number of DRS Migrations for the last days
  • Snapshots over x Days old
  • Datastores with less than x% free space
  • VMs created over the last x days
  • VMs removed over the last x days
  • VMs with No Tools
  • VMs with CD-Roms connected
  • VMs with Floppy Drives Connected
  • VMs with CPU ready over x%
  • VMs with over x amount of vCPUs
  • List of DRS Migrations
  • Hosts in Maintenance Mode
  • Hosts in disconnected state
  • NTP Server check for a given NTP Name
  • NTP Service check
  • vmkernel warning messages over the last x days
  • VC Error Events over the last x days
  • VC Windows Event Log Errors for the last x days with VMware in the details
  • VC VMware Service details
  • VMs stored on datastores attached to only one host
  • VM active alerts
  • Cluster Active Alerts
  • If HA Cluster is set to use host datastore for swapfile, check the host has a swapfile location set
  • Host active Alerts
  • Dead SCSI Luns
  • vSphere check: Slot Sizes
  • vSphere check: Outdated VM Hardware (Less than V7)
  • VMs in Inconsistent folders (the name of the folder is not the same as the name of the VM)
  • VMs with high CPU usage
  • Guest disk size check
  • Host over committing memory check
  • VM Swap and Ballooning
  • ESXi hosts without Lockdown enabled
  • ESXi hosts with unsupported mode enabled
  • General Capacity information based on CPU/MEM usage of the VMs
  • vSwitch free ports
  • Disk over commit check
  • Host configuration issues
  • VCB Garbage (left snapshots)
  • HA VM restarts and resets
  • Inaccessible VMs
  • Much, Much more…….

More often than not, the above covers the most common health problems of an environment. But you can always add more scripts on top of the vCheck report to enrich its content, which is something I have done for my customer. The vCheck report also supports being sent out as a mail attachment, but I edited it to export and store the report locally in parallel.

Yet, instead of showing how to edit the PowerCLI scripts, I think it is more important to show how the scheduled tasks are created, as we need some tweaks to run a PowerCLI script from Windows Task Scheduler. To achieve this, we need to create a batch file that the scheduled task can run and that in turn kicks off the PowerCLI script. Below is the one line batch file you need to prepare.
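Real vCheck plugins are PowerCLI scripts, so the following is not a plugin. It is only a Python sketch of the kind of logic a check such as “Snapshots over x Days old” performs, working on a hypothetical list of snapshot records; the field names and sample data are made up.

```python
from datetime import datetime, timedelta

def old_snapshots(snapshots, max_age_days, now):
    """Return the snapshot records created more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    return [snap for snap in snapshots if snap["created"] < cutoff]

# Hypothetical inventory data; a real check would query vCenter for this.
snapshots = [
    {"vm": "vm-sql01", "name": "before-patch", "created": datetime(2016, 9, 1, 22, 0)},
    {"vm": "vm-web01", "name": "pre-upgrade", "created": datetime(2016, 10, 12, 9, 30)},
]

report_time = datetime(2016, 10, 13, 6, 0)
for snap in old_snapshots(snapshots, max_age_days=14, now=report_time):
    print(snap["vm"], snap["name"])  # vm-sql01 before-patch
```

Most of the checks in the list follow the same pattern: pull a list of objects, filter it against a threshold, and print whatever is left.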

powershell.exe -ExecutionPolicy Unrestricted -NonInteractive -NoProfile -File C:\vCheck\vCheck-vSphere-master\vCheck.ps1

This batch script calls vCheck.ps1, which runs and exports the vCheck report. The scheduled task in Windows can be configured as follows:

screen-shot-2016-10-13-at-3-36-09-pm-2

So, with these two tasks set up, every day you have both your configuration dump and your environment status dump on hand, which makes it far easier to manage the environment as your vSphere keeps running. It also gives you a kind of daily backup of your environment configuration in case an incident breaks out and something has to be rebuilt from scratch. Again, this is definitely not a comprehensive monitoring solution or reporting tool. But it makes management somewhat easier if you don’t have vRealize Operations Manager in the environment yet.


What’s Next?

I’ve been working in the IT field for years. “What’s Next?” is a question I always ask myself, colleagues and customers. I think it is critically important to know what the next step, or the next few steps, you would like to take will be.

As for myself, I actually had a blog before this one, but I didn’t keep blogging as I was not blogging with a goal or a mission. Learning from that previous failure, I would like to reboot the blog with a whole new theme. And “What’s Next?” will be the main theme I share in this blog.

next