Sep 24 '18

with Tianpei Xia, NC State University

Software analytics has been widely used in software engineering for many tasks, such as generating effort estimates for software projects. One of the magical parts of software analytics is tuning the parameters that control the prediction algorithms. Such hyperparameter optimization has been widely studied in some software analytics domains (e.g. defect prediction and text mining) but, so far, it has not been extensively explored for effort estimation. Accordingly, we seek simple, automatic, effective and fast methods for finding good hyperparameter options for automatic software effort estimation.

We introduce a hyperparameter optimization architecture called RATE (Rapid Automatic Tuning for Effort-estimation), a novel configuration tool for effort estimation based on combining regression tree learners (e.g. CART) with optimizers (e.g. Differential Evolution, Bayesian Optimization). We test RATE with a wide range of hyperparameter optimizers on data from 945 software projects, and compare our work against multiple baseline methods for software effort estimation, including the Automatically Transformed Linear Model (ATLM) and LP4EE. Results show that, after tuning, large improvements in effort estimation accuracy were observed (measured in terms of the magnitude of the relative error (MRE) and standardized accuracy (SA)) at relatively small computational cost.
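To make the idea concrete, here is a minimal sketch (not the paper's RATE implementation) of tuning CART for effort estimation with Differential Evolution, scoring candidate configurations by mean MRE. The dataset, feature names, and parameter ranges below are illustrative assumptions only.

```python
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
# Hypothetical project data: features = [size, team_exp, complexity], target = effort.
X = rng.uniform(1, 100, size=(100, 3))
y = 100 + 2.5 * X[:, 0] + 10 * X[:, 2] + rng.normal(0, 5, size=100)

def mre(actual, predicted):
    """Magnitude of relative error, averaged over all projects."""
    return np.mean(np.abs(actual - predicted) / np.abs(actual))

def objective(params):
    """Decode a candidate parameter vector, train CART, return mean MRE."""
    max_depth = int(round(params[0]))
    min_samples_leaf = int(round(params[1]))
    model = DecisionTreeRegressor(max_depth=max_depth,
                                  min_samples_leaf=min_samples_leaf,
                                  random_state=0)
    preds = cross_val_predict(model, X, y, cv=3)
    return mre(y, preds)

# Search ranges for CART's max_depth and min_samples_leaf (assumed, for illustration).
bounds = [(1, 12), (1, 20)]
result = differential_evolution(objective, bounds, maxiter=20, seed=1)
best_depth = int(round(result.x[0]))
best_leaf = int(round(result.x[1]))
print(f"best max_depth={best_depth}, min_samples_leaf={best_leaf}, mean MRE={result.fun:.3f}")
```

Standardized accuracy (SA), the other measure mentioned above, is conventionally reported as the percentage improvement of a method's mean absolute residual over that of random guessing; the sketch only computes MRE for brevity.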

From those results, we recommend using RATE as a standard component of software effort estimation in the future.

The full paper can be accessed here.