Replication Data

A Large-Scale Empirical Study of the Relationship Between Build Technology and Build Maintenance

Shane McIntosh, Meiyappan Nagappan, Bram Adams, Audris Mockus, and Ahmed E. Hassan
Submitted to Empirical Software Engineering: An International Journal (EMSE).


Build systems specify how source code is translated into deliverables. They require continual maintenance as the system they build evolves. This build maintenance can become so burdensome that projects switch build technologies, rewriting potentially thousands of lines of build code. We aim to understand the relationship between build technology and build maintenance by analyzing version histories in a corpus of 843,976 repositories spread across four software forges, three software ecosystems, and four large-scale projects. We study low-level, abstraction-based, and framework-driven build technologies, as well as tools that automatically manage external dependencies. We find that modern, framework-driven technologies are associated with more churn and a tighter coupling with source code than low-level and abstraction-based ones. Technology migrations tend to reduce source-build coupling and shift build maintenance work to a build-focused team. Our findings raise important questions for research and practice (e.g., why are modern build technologies associated with more maintenance?) and provide an approach that we expect to help answer them.


(Under submission)

Data and Scripts