Performance Engineering and Analysis for Ultra-Large-Scale Software Systems

The rise of Ultra-Large-Scale (ULS) software systems (e.g., Amazon.com, Gmail and AT&T's telecommunication infrastructure) poses new challenges for the software engineering field. ULS systems require near-perfect uptime and support millions of concurrent connections and operations. Failures in such systems are typically associated with performance issues rather than with feature bugs. Load testing has therefore become essential in ensuring the problem-free operation of such systems. The goal of such testing is to examine how the system behaves under realistic workloads, to ensure that the system performs well in the field.

However, ensuring that load tests are ‘realistic’ (i.e., that they accurately reflect the current field workloads) and that the system is free of performance issues is a major challenge. Field workloads are based on the behaviour of thousands or millions of users interacting with the system. These workloads continuously evolve as the user base changes, as features are activated or disabled, and as user feature preferences change. Such evolving field workloads often lead to load tests that are no longer reflective of the field, even though these workloads have a major impact on the system’s performance. This has led to the emergence of ‘continuous load testing,’ where load test cases are continuously updated and the system’s performance under load is re-validated even after the system’s deployment.
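To make the notion of workload representativeness concrete, the following minimal sketch compares the event mix of a load test against the field using execution logs. It is an illustration only, not our actual approach: the log format (an event identifier as the first token of each line), the file names and the 0.9 similarity threshold are all assumptions made for the example.

    from collections import Counter

    def event_profile(log_path):
        """Relative frequency of each event type in an execution log.
        Assumes the first token of every line identifies the event; real
        logs would first need to be abstracted into event types."""
        with open(log_path) as log:
            counts = Counter(line.split()[0] for line in log if line.strip())
        total = sum(counts.values())
        return {event: n / total for event, n in counts.items()}

    def cosine_similarity(p, q):
        """Cosine similarity between two event-frequency profiles."""
        events = set(p) | set(q)
        dot = sum(p.get(e, 0.0) * q.get(e, 0.0) for e in events)
        norm_p = sum(v * v for v in p.values()) ** 0.5
        norm_q = sum(v * v for v in q.values()) ** 0.5
        return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

    # Hypothetical file names; flag the test when its event mix drifts
    # too far from the field (the 0.9 cut-off is illustrative only).
    field = event_profile("field.log")
    test = event_profile("load_test.log")
    if cosine_similarity(field, test) < 0.9:
        print("Load test no longer reflects the field; update test cases.")

In practice such profiles would be computed per user session or time window rather than over whole logs, but the underlying comparison is the same.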

To support the continuous load testing process, we are proposing, implementing and evaluating new approaches for the automated validation of load tests and load test suites, using data from multiple sources, including execution logs and performance counters. Our validation has focused on 1) ensuring that load test suites accurately reflect the current field workloads and 2) detecting and diagnosing performance issues.
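For the second goal, one widely used family of techniques treats performance counters from past, passing load tests as a baseline and flags a new run whose counters fall outside the baseline's control limits. The sketch below illustrates that idea only; the counter values, the 3-sigma band and the 20% violation threshold are illustrative assumptions, not our published approach.

    import statistics

    def control_limits(baseline, k=3.0):
        """Mean +/- k standard deviations of baseline counter samples
        (a classic control-chart style band)."""
        mu = statistics.mean(baseline)
        sigma = statistics.stdev(baseline)
        return mu - k * sigma, mu + k * sigma

    def violation_ratio(samples, limits):
        """Fraction of new observations falling outside the baseline band."""
        lo, hi = limits
        return sum(1 for x in samples if x < lo or x > hi) / len(samples)

    # Hypothetical CPU-utilisation samples (%): past passing runs of a
    # load test versus the current run of the same test.
    baseline_cpu = [41.0, 43.5, 40.2, 42.8, 44.1, 41.7]
    current_cpu = [48.9, 52.3, 47.5, 55.0, 50.1, 49.4]
    if violation_ratio(current_cpu, control_limits(baseline_cpu)) > 0.2:
        print("Counter deviates from past runs; possible performance issue.")

The appeal of this style of check is that it needs no performance model of the system: past runs of the same load test serve as the expected behaviour.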
