The Blue Ocean of Performance Monitoring

One of the areas that is of keen interest to many of the organizations I speak with is how to monitor performance accurately within a VM, as compared to a physical counterpart.

The obvious answer is to take what the hypervisor sees and use that for your metrics collection.  While that certainly works, it’s not a true representation of exactly what is happening within the VM, especially if you want to monitor specific threads or transactions.  Because the hypervisor does not look inside the guest, you only get a broad overview.  That overview is accurate, though, since it is based on the resources the hypervisor is actually granting.

Then you have to fall back on the traditional ways of monitoring performance within a guest, physical or virtual: proc nodes, top, perfmon, WMI or some other agent-based collection tool.  While these give far more granular detail on what is happening within the guest OS, in a virtual world they have no understanding of what is happening outside of it, which can make those tools inaccurate.
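To make the in-guest side concrete, here is a minimal sketch of how an agent might turn two samples of the aggregate "cpu" line from /proc/stat into a busy percentage on a Linux guest.  The function names are my own, and the numbers it produces are only the guest's view: the counters know nothing about how much physical CPU time the hypervisor really granted.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counters.

    Field order is: user nice system idle iowait irq softirq steal ...
    (older kernels report fewer fields).
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def busy_percent(before, after):
    """Guest-visible CPU busy percentage between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Only user..steal; the trailing guest fields are already counted in user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    idle = sum(deltas[3:5])  # idle + iowait
    return 100.0 * (total - idle) / total


def read_cpu_sample(path="/proc/stat"):
    """Read the current counters (Linux only)."""
    with open(path) as handle:
        return parse_cpu_line(handle.readline())
```

In practice you would call read_cpu_sample() twice, a few seconds apart, and pass both results to busy_percent().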

So what options are there for you to accurately monitor VMs, but still get granular detail on guest OS based metrics?  The answer today is unfortunately that you have to do the correlation between the metrics to derive the true values all on your own.
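As a rough idea of what doing that correlation yourself looks like, the sketch below pairs each guest sample with the hypervisor sample closest to it in time.  It assumes you have already exported both sets of metrics as (timestamp, value) pairs sorted by time; the function name, the sample data and the 10 second tolerance are all made up for illustration and not taken from any particular tool.

```python
from bisect import bisect_left


def correlate(guest, host, tolerance=10.0):
    """Pair each guest sample with the nearest-in-time hypervisor sample.

    guest, host: lists of (timestamp_seconds, value), sorted by timestamp.
    Returns a list of (timestamp, guest_value, host_value); host_value is
    None when no hypervisor sample lies within `tolerance` seconds.
    """
    host_times = [ts for ts, _ in host]
    rows = []
    for ts, guest_value in guest:
        i = bisect_left(host_times, ts)
        # Candidates are the hypervisor samples just before and just after ts.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(host)]
        best = min(candidates, key=lambda j: abs(host_times[j] - ts), default=None)
        if best is not None and abs(host_times[best] - ts) <= tolerance:
            rows.append((ts, guest_value, host[best][1]))
        else:
            rows.append((ts, guest_value, None))
    return rows


# Hypothetical CPU % samples: what the guest reports vs. what the hypervisor granted.
guest_cpu = [(0, 35.0), (20, 90.0), (40, 88.0)]
host_cpu = [(1, 30.0), (21, 55.0), (75, 50.0)]

for row in correlate(guest_cpu, host_cpu):
    print(row)
```

A row where the two values disagree widely, or where the hypervisor value is missing, is exactly the kind of case where neither view alone tells you what the VM is really doing.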

A lot of companies have started gathering data from both sides, but there is still no unified view or automated way of correlating it.  The hope is that these organizations can start to combine the data so we can get a more accurate representation of what is actually happening inside our VMs.  I would also like to see standardization on how this is done, so that the data remains consistent across the industry regardless of the tool or company.

Next up will be a look at different free tools that can help you better assess performance.  I will be looking both inside a guest OS and outside from the hypervisor, so you can correlate the data in a more automated fashion.

2 comments

    • Albert Widjaja on April 28, 2009 at 1:30 am

    Very great explanation indeed.

    Looking forward for the freeware for monitoring this VMs.

    Cheers,
    AWT

    • VMdoug on May 6, 2009 at 5:41 pm

    Guest/application performance correlation is indeed a difficult nut to crack. If VMware tools provided more insight to what was happening within the guest that would help but could we expect vCenter to store that data? It already stores enough with the metrics it collects.

    There are a number of EMS solutions available in the physical world, many of these are very mature and have been fine tuned over the years but still lack the visibility into the virtual layer. One suggestion would be take an established EMS system and extend it with the data from the virtual layer, correlating it all back into that EMS.

    This type solution provides great visibility for the operations team (already accustomed to their EMS solution) without them having to learn much about virtualization. The only downfall here is for the team managing the virtual infrastructure (and not the applications), they may be reluctant to adopt the chosen EMS since it’s “unknown” to them and complicated, they may be looking for a more focused solution just on the virtualization layer.
