My Ph.D. thesis abstract

Developing analytics models for software project management 

Schedule and cost overruns constitute a major problem in software projects and have long been a source of concern for the software engineering community. Managing software projects to meet time and cost constraints is highly challenging due to the dynamic nature of software development. Software project management covers a range of complex activities such as project planning, progress monitoring, and risk management. These tasks require project managers to make accurate estimates and to foresee significant future events in their projects. However, there has been little work on providing automated support for software project management activities. Modern software projects mostly follow an iterative development process in which software products are incrementally developed and evolved over a series of iterations. Each iteration requires the completion and resolution of a number of issues such as bugs, improvements, or new feature requests. Since modern software projects require continuous delivery in every iteration of software development, it is essential to monitor the execution of iterations and the resolution of issues, and to make reliable predictions. There is thus a strong need to provide project managers, software engineers, and other stakeholders with predictive support at the level of iterations and issues. This thesis aims to leverage a large amount of data from software projects to generate actionable insights that are valuable for different software project management activities at the level of iterations and issues. Using cutting-edge machine learning technology (including deep learning), we develop a novel suite of data analytics techniques and models for predicting the delivery capability of ongoing iterations, predicting issue delays, and estimating the effort for resolving issues.
An extensive empirical evaluation using data from over ten large well-known software projects (e.g., Apache, Duraspace, Java.net, JBoss, JIRA, Moodle, and Mulesoft) demonstrates the high effectiveness of our approach.