Workshop Program

Time                  Event
1:15 PM to 1:30 PM    Workshop Introduction (Organizers)
1:30 PM to 2:30 PM    Keynote by Rocco Oliveto
2:30 PM to 2:45 PM    Keynote Discussion Session
2:45 PM to 3:15 PM    Break
3:15 PM to 3:30 PM    Daqing Hou and Lingfeng Mo: Categorizing API Forum Discussions
3:35 PM to 3:50 PM    Bahar Sateli, Elian Angius, Srinivasan Rajivelu and René Witte: Can Text Mining Assistants Help to Improve Requirements
3:50 PM to 4:00 PM    Discussion of Papers
4:00 PM to 4:30 PM    Fishbowl Panel Discussion on Mining Unstructured Data
4:30 PM to 4:45 PM    Wrap-Up and "The Future of MUD"

Keynote by Rocco Oliveto (University of Molise)

Not Only Statements: The Role of Textual Analysis in Software Quality

Abstract: The source code lexicon (identifier names and comments) plays an essential role in program comprehension, especially when (i) high-level documentation is scarce or outdated, or (ii) the source code is complex enough that its lexicon tells developers more than the code semantics alone. Over the years, several researchers have analyzed the role of the lexicon in program comprehension. Furthermore, the software lexicon has been used, as an alternative or a complement to source code structure, to perform various kinds of analyses, for example traceability recovery, change impact analysis, clone detection, feature location, and cohesion and coupling computation. Textual analysis of source code has the advantage of being lightweight (it does not require parsing) and of providing information complementary to what traditional code analysis, e.g., the extraction of structural information, can provide. These successful applications have increased interest in recent years in using textual analysis to assess and improve the quality of a software system. In particular, textual analysis can be used to identify specific refactoring operations, or ambiguous identifiers that increase the program comprehension burden by creating a mismatch between the developers' cognitive model and the intended meaning of a term, ultimately increasing the risk of fault proneness. In addition, when used "on-line" during software development, textual analysis can guide programmers toward better identifiers, improving the quality of the source code lexicon. In this talk, I will give an overview of research on textual analysis for the assessment and improvement of software quality, and discuss our achievements to date, the challenges, and the opportunities for the future.