Wednesday, June 29, 2016

VMware vROps - EPOps Multi-vCenter Monitoring Dashboard

Since the EPOps vCenter Monitoring Dashboard is one of the most popular posts on the blog, I thought it would be a good idea to update this highly demanded dashboard to work with multiple vCenters. A lot of sizable enterprises out there have multiple vCenter instances with thousands of hosts and tens of thousands of VMs. Also, with the introduction of vSphere 6, more customers have begun their migration journey away from Windows-based vCenter servers to vCenter Server Appliance (VCSA) and everything in-between, including a mix of both solutions. Since vROps enables you to have a unified single pane of glass view across your entire environment regardless of locations or versions, you can leverage it to create a dashboard to keep an eye on all of your vCenter servers. This can include both Windows and VCSA versions of vCenter, as well as SSO and PSC if they have been separated out. Heck, you can even include the MS SQL server hosting the vCenter database if you want.

In this post, we will look at one example of a dashboard that can be used to keep an eye on a vCenter server population. But first let's look at some of the prerequisites that have to be met before we start working on the dashboard:
  1. Adequate vRealize Operations Manager Advanced or Enterprise Licensing
  2. Install the latest version of EPOps Agent on your vCenter servers. I will not regurgitate the same instructions you can find in the official documentation or in other blogs as it does not create any new value for the community.
    1. For Windows-based vCenter, follow these steps.
    2. For VCSA, follow these steps (unfortunately installing EPOps Agent on VCSA is not supported by VMware's Global Support organization, so do this at your own risk or in a demo/lab environment only).
    3. A number of other bloggers have also covered various agent installation options with detailed screenshots, so happy googling.
  3. If you have a distributed vCenter instance where you separated and installed various vCenter components on different servers, you will have to install the EPOps Agent on all of the servers that are part of the solution. For example, if in a Windows based vCenter deployment you installed vCenter Service, Web Client, Inventory Service, and SSO all on different servers then you need to install the agent on each of those servers to have complete visibility into the solution. Same goes for VCSA if you deployed a separate Virtual Appliance (VA) for PSC.
  4. Follow these steps to configure the EPOps agent after installation to point at your Load Balanced vROps Analytics Cluster, or better yet the Remote Collector Group (this may be a good topic for another post).
  5. Install the vCenter Plug-in for vROps. This will enable vROps to understand the vCenter application stack. Download vCenter Self-Monitoring Solution for vRealize Operations Manager.
  6. Optionally, if you want to monitor the MS SQL server hosting your vCenter database, you need to install EPOps Agent on that server as well.
  7. If you went through the trouble of installing the EPOps Agent on your MS SQL server, then you also need to install the MS SQL plug-in in vROps. This will enable vROps to understand the MS SQL schema. Download MS SQL Plug-in.
  8. Same goes for the embedded PostgreSQL on VCSA. Download and install PostgreSQL Plug-in.
  9. After the plug-ins and agents have been installed, configured and registered with vROps, some adapter types require credentials in order to collect data. You can find those additional configuration details for vCenter App Server, MS SQL and PostgreSQL in the official vCenter Solution Guide or in this VMTN post by Thomas Baublys.
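
To make step 4 a little more concrete, here is a minimal Python sketch that renders the server connection settings an agent needs. The property names are my best recollection of the EPOps agent's agent.properties template, so treat them as assumptions and verify them against the documentation for your agent version. The host name, service account, and thumbprint are hypothetical placeholders.

```python
def render_agent_properties(server, user, thumbprint, port=443):
    """Return agent.properties-style text pointing an agent at a vROps endpoint.

    The key names are assumptions based on the agent.properties template;
    confirm them before using this on a real agent.
    """
    settings = {
        "agent.setup.serverIP": server,              # load balancer VIP or remote collector
        "agent.setup.serverSSLPort": str(port),
        "agent.setup.serverLogin": user,
        "agent.setup.serverCertificateThumbprint": thumbprint,
    }
    # The password is left out of this sketch on purpose.
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(render_agent_properties("vrops-vip.example.local", "epops-svc", "<thumbprint>"))
```

Generating the text from one function makes it easier to keep every vCenter, PSC, and SQL server pointed at the same endpoint.
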
Now that we have met all of the prerequisites and vROps is collecting vCenter metrics from all of the EPOps Agents, let's move on to the actual dashboard design and layout. As you may have already noticed, this new vCenter dashboard is very different from the old one. The main difference is in the number of vCenter objects we're trying to display. This forces us to address several challenges and therefore dictates a different layout and set of widgets used to solve the use case.



Starting at the top of the first column (upper left corner of the content pane) is an Object List widget (1) configured to display all of the vCenter VM objects. As stated before, this includes both Windows and Linux instances as well as SSO and PSC if they are present and running on dedicated servers. This requires creation of a Custom Group that includes all of those VMs in order to make it selectable in the widget filter. To make it more useful, I also included several VM KPIs in additional columns, which enables us to see any performance issues at a glance and to sort on them. The additional columns include metrics such as Health, number of vCPUs, CPU Contention, CPU Demand, CPU Usage, CPU Ready, Memory Contention, and Snapshot Size. The initial object list widget is set to Auto Select First Row, which makes the first VM object selection for us. This automatically loads the remaining widgets on the dashboard and provides a more dynamic user experience.

Next, in the middle column, a Sparkline Chart widget (2) shows a lot more detail than the object list. The sparkline provides a list of important VM KPIs in four food groups: CPU, Memory, Disk, and Network. The metrics in the sparkline chart are preset using an XML interaction file, which will be covered in future posts. The sparkline chart also supports adjusting the time range, enabling us to change how far back we want to look and spot trends over time.

The third column has a Top Alerts widget (3) showing alerts for the selected vCenter VM and its descendant objects. Right below top alerts is a Heatmap widget (4) displaying all disks and their usage % as reported by VMware Tools running in the VM. This way we can sweep from left to right and get a very clear picture of a vCenter VM's status.

Moving on, the second row of widgets pertains to the Windows or Linux Guest OS running in the VM and, from this point on, all of the remaining dashboard data is provided by the EPOps Agents running in the VMs. Again, starting from the left is an Object List widget (5) showing the Guest OS. Of course the widget is using Auto Select First Row to select the OS instance and load the remaining widgets. Additional columns show Health, Availability, and collection status.

Going with the flow, in the middle column we have the star of the show - a Sparkline Chart widget (6) showing Guest OS KPIs reported by the EPOps Agent. Of particular interest is the Memory Used % metric, as it reports the actual memory usage by all applications as seen by the OS and not as reported by the vSphere host. This is a much more accurate measurement than what you're used to seeing in vCenter Performance Charts or even naked vROps without the EPOps Agent. In the right side column we again have the Top Alerts widget (7). This time it's configured to show us the Guest OS alerts.

Advancing to the third row is an Object List widget (8) showing various application services reported by the EPOps Agent. Since we're dealing with a VCSA, you should see some familiar items here such as PostgreSQL, Inventory Service, and so on. Additional columns provide health, availability, and collection status.

As usual, the center column has a Sparkline Chart widget (9) listing KPIs for the selected item in the object list. Remember that this interaction is completely dynamic, so the KPIs will change depending on which service you select in the object list. The list of KPIs is provided via the XML interaction file. In the example screenshot, I selected PostgreSQL to show metrics relevant to a Database Management System (DBMS) and to set up the next row of widgets.
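
As a rough illustration of what such an interaction file might contain, the Python sketch below generates a small metric list as XML. The element and attribute names (AdapterKinds, AdapterKind, ResourceKind, Metric, attrkey, label, unit) are assumptions modeled on the vROps metric configuration format, and the metric keys are examples only, so check both against a file exported from your own environment. This is not the file used for the dashboard in the screenshot.

```python
import xml.etree.ElementTree as ET


def build_metric_config(adapter_kind, resource_kind, metrics):
    """Build an XML metric list for one resource kind.

    `metrics` is an iterable of (attribute key, label, unit) tuples.
    Element and attribute names are assumptions, not a verified schema.
    """
    root = ET.Element("AdapterKinds")
    adapter = ET.SubElement(root, "AdapterKind", adapterKindKey=adapter_kind)
    resource = ET.SubElement(adapter, "ResourceKind", resourceKindKey=resource_kind)
    for attr_key, label, unit in metrics:
        ET.SubElement(resource, "Metric", attrkey=attr_key, label=label, unit=unit)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Example VM metrics; the keys are illustrative.
    vm_metrics = [
        ("cpu|usage_average", "CPU Usage", "%"),
        ("mem|usage_average", "Memory Usage", "%"),
    ]
    print(build_metric_config("VMWARE", "VirtualMachine", vm_metrics))
```

One ResourceKind block per object type is what lets a single sparkline widget show different KPIs depending on what is selected in the object list.
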

We have finally reached the last row of object list, sparkline chart, and top alerts widgets (11-13). Depending on what you selected in the services Object List widget (8) in the previous row, this last row may be completely blank. This is because certain services don't have any subcomponents. However, in our example I purposefully selected PostgreSQL in the previous row so we would get a listing of databases hosted by the DBMS engine. I also selected the "Database VCDB" object so we would get some useful metrics in the sparkline chart. Database VCDB is where vCenter data is stored.

As you can see, this dashboard relies heavily on the Object List widget and the underlying relationships among the objects we're interested in interrogating. I have previously explained the importance of vROps relationships in the Applications VMs Performance Dashboard post in case you missed it. Additionally, you may have noticed that a lot of my dashboard designs look fairly similar lately: they start with an object list, follow up with a sparkline chart, and end with top alerts. Each object list feeds off the previous one, creating a completely dynamic experience. I think this is the most effective layout and it is well suited to multiple use cases based on a number of different relationships. Obviously this is not perfect by any means, but considering the limitations, it seems to provide the most bang for the buck in the available screen real estate. I agree it may look a little busy, but by providing a lot of KPIs, we can minimize the navigation and searching, which tend to be aggravating and time consuming. I will get deeper into the actual dashboard design and layout topics in an upcoming post in the existing two-part series.
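
The cascading behavior can be modeled in a few lines of Python. This is only a toy model of the parent/child relationships described above, not vROps code: the Obj class, the function names, and the object names are all hypothetical. It shows how Auto Select First Row walks from a VM to its Guest OS, to a service, and finally to a database, and why a service without subcomponents leaves the last row empty.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Obj:
    """A monitored object with its child objects (hypothetical model)."""
    name: str
    children: List["Obj"] = field(default_factory=list)


def child_rows(selected):
    """Rows shown by an object list in child mode for the selected parent."""
    return [child.name for child in selected.children]


def cascade_first_row(selected):
    """Follow Auto Select First Row down the relationship chain."""
    path = [selected.name]
    current = selected
    while current.children:
        current = current.children[0]  # the first row is selected automatically
        path.append(current.name)
    return path


if __name__ == "__main__":
    postgres = Obj("PostgreSQL", [Obj("Database VCDB")])
    inventory = Obj("Inventory Service")  # no subcomponents
    guest_os = Obj("Linux", [postgres, inventory])
    vm = Obj("vcsa01", [guest_os])

    print(cascade_first_row(vm))
    print(child_rows(inventory))  # empty list, so the last row stays blank
```

If a child-mode object list stays blank on a real dashboard, the first thing to check is whether the parent/child relationship actually exists between the two objects.
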

In conclusion, the multi-vCenter dashboard provides a more convenient way of keeping an eye on your management infrastructure. Stay tuned for more VMware SDDC stack dashboards.

For more information about vROps, see the following resources:

vROps Extensibility Options:
You can extend vROps functionality by installing additional management packs:
VMware Management Packs include options for vRA, NSX, vRO, Log Insight, AWS, etc.
Blue Medora Management Packs include options for NetApp, Oracle, SAP, UCS, Citrix, etc.

Books:
VMware Performance and Capacity Management - Second Edition by Iwan 'e1' Rahabok
VMware vRealize Operations Managers Essentials by Matthew Steiner
Mastering vRealize Operations Manager by Scott Norris
VMware vRealize Operations Manager Capacity and Performance Management by Iwan 'e1' Rahabok

Official VMware:

VMware Professional Services
Official vROps Documentation
VMware Operations Management White Papers
vROps product page

Blogs:
VMignite by Lan Nguyen
vXpress by @Sunny_Dua 
virtual red dot by @e1_ang
Virtualise Me by @auScottNorris
Elastic Sky Labs by @JAGaudreau
i'm all vIRTUAL by @LiorKamrat

6 comments:

  1. This Information was so helpful! Exactly what I needed, I truly appreciate the step by step presentation. Thank you for sharing this valuable article.

  2. This dashboard looks great but pretty complicated to replicate manually. Can you post an export of the dashboard configuration?

  3. Can you provide some details on the Object list for Widget 1, I have it as a custom group of Virtual Machines, but when I set the relationship to feed Widget 5 it always come blank. I cannot figure out how the object relationship works so after selecting a Virtual Machine Object widget 5 gets populated with the respective EPOps Agent information. I have tried different object types but none of them seem to be right.

    Thanks.

    Replies
    1. For Object List widget 1 you need to create a Custom Group that has all of your vCenter VMs in it. Then in the widget filter select that Custom Group. This will only display the vCenter VMs you have in your environment.
      Set the Object List widget 5 to Child mode. If you have the EPOps Agent installed on your vCenter and the relationship was created between the VM and the Guest OS then that widget will display the OS object.
      Same thing for widgets 8 and 11.
      Good luck!
