Workshop Program

Time                  Event
1:15 PM to 1:30 PM    Workshop Introduction (Organizers)
1:30 PM to 2:30 PM    Keynote by Rocco Oliveto
2:30 PM to 2:45 PM    Keynote Discussion Session
2:45 PM to 3:15 PM    Break
3:15 PM to 3:30 PM    Daqing Hou and Lingfeng Mo: Categorizing API Forum Discussions
3:35 PM to 3:50 PM    Bahar Sateli, Elian Angius, Srinivasan Rajivelu and René Witte: Can Text Mining Assistants Help to Improve Requirements
3:50 PM to 4:00 PM    Discussion of Papers
4:00 PM to 4:30 PM    Fishbowl Panel Discussion on Mining Unstructured Data
4:30 PM to 4:45 PM    Wrap-Up and "The Future of MUD"

Keynote by Rocco Oliveto (University of Molise)

Not Only Statements: The Role of Textual Analysis in Software Quality

Abstract: The source code lexicon (identifier names and comments) plays an essential role in program comprehension, especially when (i) high-level documentation is scarce or outdated, or (ii) the source code is complex enough that its lexicon tells developers more than the code semantics alone. Over the years, several researchers have analyzed the role of the lexicon in program comprehension. Furthermore, the software lexicon has been used, as an alternative or a complement to source code structure, to perform various kinds of analyses, for example traceability recovery, change impact analysis, clone detection, feature location, and cohesion and coupling computation. Textual analysis of source code has the advantage of being lightweight (it does not require parsing) and of providing information complementary to what traditional code analysis, e.g., the extraction of structural information, can provide. These successful applications have increased interest in recent years in using textual analysis to assess and improve the quality of a software system. In particular, textual analysis can be used to identify specific refactoring operations, or ambiguous identifiers that increase the program comprehension burden by creating a mismatch between the developers' cognitive model and the intended meaning of a term, ultimately increasing the risk of fault proneness. In addition, when used "on-line" during software development, textual analysis can guide programmers toward better identifiers, improving the quality of the source code lexicon. In this talk, I will give an overview of research on textual analysis for the assessment and improvement of software quality, and discuss our achievements to date, the challenges, and the opportunities for the future.