Self-Healing Datacenter Application Monitoring

Written By Nathan Bennett

These days there are a lot of different application monitoring solutions. I know this because I’m in the middle of looking through them for a good fit. The hardest thing for an IT operations engineer is being woken up at 3 AM to fix a “Server Down” situation. However, it’s even more detrimental for a company to have a customer or a critical application go down with no one alerted.

[Image: Application troubleshooting]

I always think about the old adage, “If a tree falls in the woods but no one hears it, does it actually fall?” Well, of course it does. The difference is that no one knows or cares about said tree. But if that tree is a tier 1 app, you had better know what’s going on.

One possible solution that spans from vSphere to the cloud is vRealize Operations. It starts in your private cloud, pulling in all of the attached resources. Out of the box it gives you recommendations on resources and adjustments to make the environment run smoother. It also automatically pulls errors and issues from VMware and publishes them as errors to be fixed. Additionally, it can provide notifications for multiple sources. All of this runs natively in VMware; however, new iterations of vRealize Operations have brought in new capabilities that extend from the private cloud to the public cloud. You can now see the resources within the public cloud and manage them as needed.


One part of the solution that a lot of users don’t know about or use is the custom application solution. This is a simple connection to a package manager repository. Once that is in place, you deploy a remote collector to start pulling data from the OS that is deployed. This includes an agent that runs on the OS and can be customized to check against error states. The fun part is pairing this with another vRealize product, vRealize Automation. vRealize Automation is another addition to the vRealize Operations tool belt that brings automation into the picture. So on top of your operations on the infrastructure layer, and now the application layer, you can add self-healing through vRealize Automation and Orchestration to fix those issues.
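To make the idea of "checking against error states" a little more concrete, here is a minimal Python sketch. It is not the vRealize Operations agent or its configuration format. The class, the state names, the metric names, and the thresholds are all hypothetical, chosen only to show what a customized error-state check might look like in principle.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ErrorState:
    """A hypothetical error-state definition (not a vRealize Operations object)."""
    name: str         # label for the error state
    metric: str       # metric name an agent might report (assumed)
    threshold: float  # the state is active at or above this value


def active_error_states(sample: dict[str, float], states: list[ErrorState]) -> list[str]:
    """Return the names of error states whose metric meets or exceeds its threshold."""
    return [s.name for s in states if sample.get(s.metric, 0.0) >= s.threshold]


# Illustrative definitions only; real checks depend on the application.
STATES = [
    ErrorState("disk_nearly_full", "disk_used_pct", 90.0),
    ErrorState("service_down", "service_down_flag", 1.0),
]

if __name__ == "__main__":
    sample = {"disk_used_pct": 93.5, "service_down_flag": 0.0}
    print(active_error_states(sample, STATES))
```

A metric that is missing from the sample is treated as 0.0 here, so it never triggers a state. Whether that is the right default depends on the application, since a missing metric can itself be a sign of trouble.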

This all brings a word that we all love to our environment: self-healing. By setting error states in Operations and automation in vRealize Automation/Orchestration, you now have a setup where, when an error is raised, the automation to fix said error runs. On top of all of this, you still have the power of the individual products: the self-service automation of vRealize Automation/Orchestration and the operations management of vRealize Operations.
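The error-triggers-fix pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea and does not use the vRealize Operations or vRealize Automation/Orchestration APIs. The function names, error-state names, and host name are hypothetical stand-ins for alert definitions and workflows you would build in those products.

```python
from __future__ import annotations

from typing import Callable


# Hypothetical remediation steps; in practice these would be workflows
# defined in the automation/orchestration tool, not local functions.
def restart_service(target: str) -> str:
    return f"restarted service on {target}"


def clean_temp_files(target: str) -> str:
    return f"cleaned temp files on {target}"


# Map each (assumed) error state to the automation that should fix it.
REMEDIATIONS: dict[str, Callable[[str], str]] = {
    "service_down": restart_service,
    "disk_nearly_full": clean_temp_files,
}


def handle_alert(error_state: str, target: str) -> str:
    """Run the mapped remediation, or fall back to notifying a human."""
    action = REMEDIATIONS.get(error_state)
    if action is None:
        return f"no automated fix for {error_state} on {target}; notify on-call"
    return action(target)


if __name__ == "__main__":
    print(handle_alert("service_down", "app01"))
    print(handle_alert("certificate_expired", "app01"))
```

The fallback branch matters: self-healing only covers the errors you have written automation for, so anything unmapped should still reach a person.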

Now, with these solutions all built in, you can run this self-healing for your environment and add additional composition to your infrastructure.