Week 1: Course Overview (Jan 4)




Week 2: Introduction to ultra-large-scale systems (Jan 11)

Guest lecture from SAIL researchers: Tse-Hsun (Peter) Chen and Mark Syer
Automated Root Cause Isolation of Performance Regressions during Software Development
Christoph Heger, Jens Happe, Roozbeh Farahbod
[ASSIGNMENT]
Week 3: Log Analysis (Jan 18)
Automatic Identification of Load Testing Problems
Zhen Ming Jiang, Ahmed E. Hassan, Parminder Flora, and Gilbert Hamann
Detecting Large-Scale System Problems by Mining Console Logs
Wei Xu, Ling Huang, Armando Fox, David Patterson, Michael Jordan
Analyzing Log Analysis: An Empirical Study of User Log Mining
S. Alspaugh, Beidi Chen, Jessica Lin, Archana Ganapathi, Marti A. Hearst, and Randy Katz
Characterizing Logging Practices in Open-Source Software
Ding Yuan, Soyeon Park, and Yuanyuan Zhou
[READING]
Where Do Developers Log? An Empirical Study on Logging Practices in Industry
Qiang Fu, Jieming Zhu, Wenlu Hu, Jian-Guang Lou, Rui Ding, Qingwei Lin, Dongmei Zhang, and Tao Xie
[READING]
Improving Software Diagnosability via Log Enhancement
Ding Yuan, Jing Zheng, Soyeon Park, Yuanyuan Zhou, and Stefan Savage
[READING]
Week 4: Performance Counters and Measurements (Jan 25)
Correlating instrumentation data to system states: a building block for automated diagnosis and control
Ira Cohen, Moises Goldszmidt, Terence Kelly, Julie Symons, Jeffrey S. Chase
Subsuming Methods: Finding New Optimisation Opportunities in Object-Oriented Software
David Maplesden, Ewan Tempero, John Hosking, John C. Grundy
Leveraging Performance Counters and Execution Logs to Diagnose Memory-Related Performance Issues
Mark D. Syer, Zhen Ming Jiang, Meiyappan Nagappan, Ahmed E. Hassan, Mohamed Nasser and Parminder Flora
Statistical Debugging for Performance Problems
Linhai Song and Shan Lu
[READING]
Catch Me if You Can: Performance Bug Detection in the Wild
Milan Jovic, Andrea Adamoli, Matthias Hauswirth
[READING]
Week 5: System monitoring (Feb 1)
Assignment update (10 min presentation)
Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining
Robbert Van Renesse, Kenneth P. Birman, Werner Vogels
Detecting failures in distributed systems with the FALCON spy network
Joshua B. Leners, Hao Wu, Wei-Lun Hung, Marcos K. Aguilera, Michael Walfish
AjaxScope: A Platform for Remotely Monitoring the Client-side Behavior of Web 2.0 Applications
Emre Kiciman and Benjamin Livshits
Lightweight, High-Resolution Monitoring for Troubleshooting Production Systems
Sapan Bhatia, Abhishek Kumar, Marc E. Fiuczynski and Larry Peterson
[READING]
Week 6: Assignment presentation (Feb 8)
Assignment DUE -- (30 mins presentation + 10 page IEEE report)
Week 7: Configuration (Feb 19)
Optimizing the Performance-Related Configurations of Object-Relational Mapping Frameworks Using a Multi-Objective Genetic Algorithm
Ravjot Singh, Cor-Paul Bezemer, Weiyi Shang and Ahmed E. Hassan
Automated Diagnosis of Software Configuration Errors
Sai Zhang and Michael D. Ernst
An Empirical Study on Configuration Errors in Commercial and Open Source Systems
Zuoning Yin, Xiao Ma, Jing Zheng, Yuanyuan Zhou, Lakshmi N. Bairavasundaram, Shankar Pasupathy
AutoBash: Improving Configuration Management with Operating System Causality Analysis
Ya-Yunn Su, Mona Attariyan, and Jason Flinn
[READING]
Project Proposal DUE (2 pages IEEE format)
Week 8: Project Proposal Presentations (Feb 22)
Project Proposal Presentation (15 mins + 10 mins questions)
Week 9: Debugging ultra-large-scale systems (Feb 29)
Debugging in the (Very) Large: Ten Years of Implementation and Experience
Kirk Glerum, Kinshuman Kinshumann, Steve Greenberg, Gabriel Aul, Vince Orgovan, Greg Nichols, David Grant, Gretchen Loihle, and Galen Hunt
Extrinsic Influence Factors in Software Reliability: A Study of 200,000 Windows Machines
Christian Bird, Venkatesh-Prasad Ranganath, Thomas Zimmermann, Nachiappan Nagappan, Andreas Zeller
Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems
Ding Yuan, Yu Luo, Xin Zhuang, Guilherme Renna Rodrigues, Xu Zhao, Yongle Zhang, Pranay U. Jain, and Michael Stumm
Performance Debugging in the Large via Mining Millions of Stack Traces
Shi Han, Yingnong Dang, Song Ge, Dongmei Zhang, and Tao Xie
[READING]
Week 10: Power (Mar 7)
Refactoring Android Java Code for On-Demand Computation Offloading
Ying Zhang, Gang Huang, Xuanzhe Liu, Wei Zhang, Hong Mei, Shunxiang Yang
Carat: Collaborative Energy Diagnosis for Mobile Devices
Adam J. Oliner, Anand P. Iyer, Ion Stoica, Eemil Lagerspetz, Sasu Tarkoma
Evaluating the Effectiveness of Model-Based Power Characterization
John C. McCullough, Yuvraj Agarwal, Jaideep Chandrashekar, Sathyanarayan Kuppuswamy, Alex C. Snoeren, and Rajesh K. Gupta
Green mining: A methodology of relating software change to power consumption
Abram Hindle
[READING]
Week 11: Project Presentations (Mar 28)
Project Presentation DUE (20 mins presentation)


Project Report DUE (10 page IEEE report) (Apr 16)