Sep 28 '18

with Amritanshu Agrawal, NC State

In recent years, hyperparameter optimization has been shown to be highly promising: it finds better solutions for various Software Engineering (SE) tasks, but at the expense of heavy runtimes. Two recently introduced frameworks, LDADE and SMOTUNED, which use DE (differential evolution, a simpler optimizer), achieve near-optimal solutions at far less runtime cost (only 100s, rather than 1,000s or 10,000s, of evaluations). Further, an even simpler method, FFT (Fast and Frugal Trees), achieved performance similar to SMOTUNED and LDADE in only about 16 evaluations. Can we say that the results space of learning effectively contains just a few small regions worth exploring? Such a small results space could be explored with very few samples, just as well as, or better than, with more complex methods.
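For readers unfamiliar with DE, the sketch below shows the generate-and-test loop that such tuners repeat for a modest number of evaluations. This is a minimal illustrative version, not the LDADE or SMOTUNED implementation; the toy objective and the two hyperparameters being tuned are hypothetical placeholders.

```python
import random

def de_optimize(objective, bounds, pop_size=10, f=0.75, cr=0.3, generations=10):
    """Minimal differential evolution loop (illustrative sketch only).

    objective: maps a candidate (list of floats) to a score (higher is better).
    bounds:    list of (low, high) pairs, one per hyperparameter.
    """
    dims = len(bounds)
    pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct candidates other than i.
            a, b, c = random.sample([j for j in range(pop_size) if j != i], 3)
            trial = []
            for d in range(dims):
                if random.random() < cr:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                else:
                    v = pop[i][d]
                lo, hi = bounds[d]
                trial.append(min(max(v, lo), hi))  # clip to the legal range
            trial_score = objective(trial)
            if trial_score > scores[i]:            # keep whichever candidate scores better
                pop[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]

if __name__ == "__main__":
    # Toy example: tuning two made-up SMOTE-style parameters (k neighbours, minority ratio).
    # Total budget is on the order of 100 objective calls, i.e. "100s not 1,000s" of evaluations.
    def fake_objective(params):
        k, ratio = params
        return -(k - 5) ** 2 - (ratio - 0.5) ** 2  # stand-in for a learner's evaluation score
    print(de_optimize(fake_objective, [(1, 20), (0.1, 1.0)]))
```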

This pattern follows Deb’s principle of “ϵ-dominance”, which states that if there exists some ϵ value below which it is useless or impossible to distinguish results, then it is superfluous to explore anything smaller than ϵ. This work explores the existence of ϵ-dominance in SE tasks, specifically defect prediction. We assess about 85,000 different possible choices of preprocessors combined with learners. A simple tabu “looking” search reaches a performance plateau after trying fewer than 20 of those choices. These results explain why some of the prior work was able to achieve optimal performance with fewer and fewer evaluations. In the future, it is worth exploring the existence of “ϵ-dominance” in text mining, effort estimation, and many other SE tasks.
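The idea of stopping once improvements fall below ϵ can be captured in a few lines. Below is a minimal sketch assuming a scored configuration space, a tabu memory of already-tried choices, and random sampling; the 85,000-item space, the scoring function, and the ϵ and patience values are invented for illustration and are not the paper’s actual search.

```python
import random

def epsilon_plateau_search(configs, evaluate, epsilon=0.05, patience=5, budget=20, seed=1):
    """Sample configurations until improvements stay below epsilon (illustrative sketch only).

    configs:  candidate (preprocessor, learner) choices.
    evaluate: maps a config to a performance score.
    epsilon:  smallest improvement treated as meaningful (Deb's epsilon-dominance).
    patience: stop after this many consecutive non-improving evaluations.
    """
    random.seed(seed)
    tried = set()                                  # tabu memory: never revisit a choice
    best_config, best_score = None, float("-inf")
    stale = 0
    for _ in range(budget):
        remaining = [c for c in configs if c not in tried]
        if not remaining:
            break
        config = random.choice(remaining)
        tried.add(config)
        score = evaluate(config)
        if score > best_score + epsilon:           # only count improvements larger than epsilon
            best_config, best_score = config, score
            stale = 0
        else:
            stale += 1
            if stale >= patience:                  # results have plateaued within epsilon
                break
    return best_config, best_score, len(tried)

if __name__ == "__main__":
    # Toy usage: 85,000 hypothetical (preprocessor, learner) pairs whose scores all fall
    # inside a narrow band, so the search plateaus after only a handful of samples.
    space = [("prep%d" % i, "learner%d" % (i % 10)) for i in range(85000)]
    def toy_evaluate(cfg):
        return 0.7 + random.random() * 0.04
    print(epsilon_plateau_search(space, toy_evaluate))
```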