WOPR12 was held in Indianapolis, Indiana, USA on April 16-18, 2009, and was hosted by Mobius Labs. Dan Downing was the Content Owner.
Henri Amistadi, Charlie Audritsh, Marcel Butz, Ross Collard, Dan Downing, Andy Hohenner, Paul Holland, Karen Johnson, Mike Kelly, Elena Lopez, Jeff Pickett, Alex Podelko, Eric Proegler, Raymond Rivest, Roland Stens
Theme: Resource Monitoring During Performance Testing
People involved in developing, testing, and delivering hardware, software, or internet-based applications must be able to ensure those solutions meet customer and user expectations. But what are those expectations – specifically around resolving performance bottlenecks – and how do you ensure they are being met?
WOPR12 will explore the topic of resource monitoring with seasoned professionals, including architects, designers, and performance and reliability testers.
The ultimate goal of performance testing is identifying and fixing performance bottlenecks before production deployment. While a large percentage of bottlenecks are ultimately traced to the application code, they may also occur in servers, system software, and network components. Critical to identifying their source and diagnosing their root cause is monitoring the “bellwether metrics”. Let us define “bellwether metrics” as the “key indicators of future developments or trends”, and thus the minimum set of system vitals that we should be measuring for each hardware and software component.
Some commonly used metrics for hardware components are CPU utilization, available memory, and memory paging rate. We may also be concerned with I/O subsystem activity, the bandwidth throughput of network interfaces, or a network’s hop latencies. Other bellwether metrics are associated with software components, such as a web server’s error rates, an app server’s JVM heap space utilization, a database server’s deadlocks and lock timeouts, or a load balancer’s load-balancing effectiveness.
A close look at Windows’ Perfmon, Unix’s vmstat, a Java app server’s system console, or Oracle’s Statspack reveals hundreds of data points that we could choose to log during a load test. Choose too many, and you’ll drown in information. Choose too few, and you’re flying blind – or you may miss a vital piece of diagnostic information from an expensive and time-consuming test. So which are the bellwether metrics for each component?
Then come three deeper questions: What tools can we use to collect these metrics, what are their “healthy” thresholds, and how do we interpret their impact on a system’s scalability and capacity? Choosing knowledgeably – and interpreting the results correctly – often requires close collaboration with DBAs, network engineers, app server specialists, system administrators, and developers.
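To make the collection question concrete, here is a minimal sketch of turning a vmstat data line into named counters. The column layout is an assumption based on a typical Linux vmstat (procs/memory/swap/io/system/cpu), and the sample line is fabricated for illustration; real output varies by platform and version.

```python
# Minimal sketch: parse one vmstat data line into the bellwether
# counters discussed above. Column order is an assumed typical Linux
# layout; adjust FIELDS for your platform's vmstat.

FIELDS = ("r", "b", "swpd", "free", "buff", "cache",
          "si", "so", "bi", "bo", "in", "cs",
          "us", "sy", "id", "wa")

def parse_vmstat_line(line):
    """Return a dict of vmstat counters keyed by column name."""
    values = [int(tok) for tok in line.split()[:len(FIELDS)]]
    return dict(zip(FIELDS, values))

# Fabricated sample data line:
sample = " 1  0      0 812340 120400 904112    0    0    12    48  310  620  25  5 68  2"
stats = parse_vmstat_line(sample)
print(stats["free"], stats["si"], stats["so"], stats["id"])
```

Logging each sample with a timestamp, rather than just the raw line, makes it far easier to correlate these counters with load-test events afterward.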
For this WOPR we are interested in your experiences of which bellwether metrics you’ve found useful – or not useful – in diagnosing performance bottlenecks. We are looking for specific information you can contribute towards compiling a glossary by system component, with definitions, tools used to measure them, sample graphs and outputs, healthy thresholds, and guidelines for interpretation.
We are interested in practical, first-person experience reports that address one or more of these questions:
1. In a given system context, how did you identify and prioritize the bellwether metrics to measure?
2. What hardware / software vitals did you measure, how did you measure them, how did you interpret the measurements, and how did these results help you identify a bottleneck?
3. Based on the location of your load-testing system relative to the system under test, what open firewall ports, system privileges, operating system utilities, or installed agents did you need in order to capture these measurements?
4. What top-down or cascading relationships between macro (higher-level) and micro (lower-level) measurements have you found that enable you to trace a bottleneck to its root cause?
5. What open source or commercial tools have you found useful, easy to implement, and cost-effective for measuring key vitals?
6. What sampling intervals and statistical aggregations have you found most meaningful?
7. What graphs and supporting data tables have you used that clearly convey bottlenecks to stakeholders?
8. What lists of measurement points by system tier have you developed to guide your instrumentation from project to project?
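On the aggregation question above, the choice of statistic matters as much as the sampling interval: a mean can hide a spike that a high percentile exposes. A minimal sketch, with fabricated 5-second CPU% samples and a nearest-rank 95th percentile chosen purely for illustration:

```python
# Sketch: compare the mean of interval-sampled CPU utilization with a
# high percentile. Samples and the 95th-percentile choice are
# illustrative assumptions, not recommendations.
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a list of numeric samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

# Fabricated CPU% samples taken every 5 seconds over one minute:
cpu = [22, 25, 24, 30, 28, 26, 90, 27, 25, 29, 24, 23]
mean = sum(cpu) / len(cpu)
p95 = percentile(cpu, 95)
print(round(mean, 1), p95)  # the single 90% spike barely moves the mean
```

Here the mean stays near 31% while the 95th percentile lands on the 90% spike, which is exactly the kind of transient a bottleneck investigation cares about.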
Our goal for this WOPR is to digest the group’s collective experience and publish a compendium of resource measurements that will contribute to the maturation of performance testing. Any documented inputs toward this goal provided as part of your abstract/outline will enhance your chance of being selected.