Week 1: Course Overview
|
|
Week 2: Introduction to ultra-large-scale systems
|
|
Week 3: Log Analysis
|
Automatic Identification of Load Testing Problems
Zhen Ming Jiang, Ahmed E. Hassan, Parminder Flora, and Gilbert Hamann
|
|
Detecting Large-Scale System Problems by Mining Console Logs
Wei Xu, Ling Huang, Armando Fox, David Patterson, Michael Jordan
|
|
Analyzing Log Analysis: An Empirical Study of User Log Mining
S. Alspaugh, Beidi Chen, Jessica Lin, Archana Ganapathi, Marti A. Hearst, and Randy Katz
|
|
Leveraging Existing Instrumentation to Automatically Infer Invariant-Constrained Models
Ivan Beschastnikh, Yuriy Brun, Sigurd Schneider, Michael Sloan, Michael D. Ernst
|
[READING]
|
Mining Invariants from Console Logs for System Problem Detection
Jian-Guang Lou, Qiang Fu, Shengqi Yang, Ye Xu, and Jiang Li
|
[READING]
|
Characterizing Logging Practices in Open-Source Software
Ding Yuan, Soyeon Park, and Yuanyuan Zhou
|
[READING]
|
Where Do Developers Log? An Empirical Study on Logging Practices in Industry
Qiang Fu, Jieming Zhu, Wenlu Hu, Jian-Guang Lou, Rui Ding, Qingwei Lin, Dongmei Zhang, and Tao Xie
|
[READING]
|
Improving Software Diagnosability via Log Enhancement
Ding Yuan, Jing Zheng, Soyeon Park, Yuanyuan Zhou, and Stefan Savage
|
[READING]
|
|
Week 4: Performance Counters and Measurements
|
Correlating instrumentation data to system states: a building block for automated diagnosis and control
Ira Cohen, Moises Goldszmidt, Terence Kelly, Julie Symons, Jeffrey S. Chase
|
|
Producing Wrong Data Without Doing Anything Obviously Wrong!
Todd Mytkowicz, Amer Diwan, Matthias Hauswirth, Peter F. Sweeney |
|
Automatic Detection of Performance Deviations during Load Testing of Large Scale Systems
Haroon Malik, Hadi Hemmati, Ahmed E. Hassan |
|
The Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services
Michael Chow, David Meisner, Jason Flinn, Daniel Peek, Thomas F. Wenisch
|
[READING]
|
Statistical Debugging for Performance Problems
Linhai Song and Shan Lu |
[READING] |
Catch Me if You Can: Performance Bug Detection in the Wild
Milan Jovic, Andrea Adamoli, Matthias Hauswirth
|
[READING]
|
X-ray: Automating Root-Cause Diagnosis of Performance Anomalies in Production Software
Mona Attariyan, Michael Chow and Jason Flinn |
[READING]
|
|
Week 5: System monitoring
|
Assignment update (10 min presentation)
|
Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining
Robbert Van Renesse, Kenneth P. Birman, Werner Vogels
|
|
Detecting failures in distributed systems with the FALCON spy network
Joshua B. Leners, Hao Wu, Wei-Lun Hung, Marcos K. Aguilera, Michael Walfish |
|
Understanding the behavior of database operations under program control
Juan M. Tamayo, Alex Aiken, Nathan Bronson, Mooly Sagiv |
|
AjaxScope: A Platform for Remotely Monitoring the Client-side Behavior of Web 2.0 Applications
Emre Kiciman and Benjamin Livshits
|
[READING]
|
Lightweight, High-Resolution Monitoring for Troubleshooting Production Systems
Sapan Bhatia, Abhishek Kumar, Marc E. Fiuczynski and Larry Peterson |
[READING]
|
|
Week 6: Assignment presentation
|
Assignment DUE -- (30 mins presentation + 10 page IEEE report)
|
|
Week 7: Configuration
|
Do Not Blame Users for Misconfigurations
Tianyin Xu, Jiaqi Zhang, Peng Huang, Jing Zheng, Tianwei Sheng, Ding Yuan, Yuanyuan Zhou, Shankar Pasupathy
|
|
Automated Diagnosis of Software Configuration Errors
Sai Zhang and Michael D. Ernst |
|
An Empirical Study on Configuration Errors in Commercial and Open Source Systems
Zuoning Yin, Xiao Ma, Jing Zheng, Yuanyuan Zhou, Lakshmi N. Bairavasundaram, Shankar Pasupathy
|
|
AutoBash: Improving Configuration Management with Operating System Causality Analysis
Ya-Yunn Su, Mona Attariyan, and Jason Flinn
|
[READING]
|
|
|
Project Proposal DUE (2 pages IEEE format)
|
Week 8: Project Proposal Presentations
|
Project Proposal Presentation (15 mins + 10 mins questions)
|
|
Week 9: Debugging ultra-large-scale systems
|
Debugging in the (Very) Large: Ten Years of Implementation and Experience
Kirk Glerum, Kinshuman Kinshumann, Steve Greenberg, Gabriel Aul, Vince Orgovan, Greg Nichols, David Grant, Gretchen Loihle, and Galen Hunt
|
|
Extrinsic Influence Factors in Software Reliability: A Study of 200,000 Windows Machines
Christian Bird, Venkatesh-Prasad Ranganath, Thomas Zimmermann, Nachiappan Nagappan, Andreas Zeller |
|
Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems
Ding Yuan, Yu Luo, Xin Zhuang, Guilherme Renna Rodrigues, Xu Zhao, Yongle Zhang, Pranay U. Jain, and Michael Stumm
|
|
Performance Debugging in the Large via Mining Millions of Stack Traces
Shi Han, Yingnong Dang, Song Ge, Dongmei Zhang, and Tao Xie |
[READING]
|
|
Week 10: Power
|
Refactoring Android Java code for on-demand computation offloading
Ying Zhang, Gang Huang, Xuanzhe Liu, Wei Zhang, Hong Mei, Shunxiang Yang |
|
Carat: Collaborative Energy Diagnosis for Mobile Devices
Adam J. Oliner, Anand P. Iyer, Ion Stoica, Eemil Lagerspetz, Sasu Tarkoma |
|
Evaluating the Effectiveness of Model-Based Power Characterization
John C. McCullough, Yuvraj Agarwal, Jaideep Chandrashekar, Sathyanarayan Kuppuswamy, Alex C. Snoeren, and Rajesh K. Gupta
|
|
Green mining: A methodology of relating software change to power consumption
Abram Hindle
|
[READING]
|
|
Week 11: Project Presentations
|
Project Presentation DUE (20 mins presentation)
|
|
Project Report DUE (10 page IEEE report)
|
|