Analyzing Performance of Tree Based Regressors on Defect Prediction Datasets

Mayank Yadav
Dr. Ruchika Malhotra

Abstract

Defect prediction is a field with enormous research potential. During testing, software
often exhibits multiple defects, some of which can cause immediate failures and thereby
reduce the software's capability. Ultimately, this degrades the service the software
provides and leaves customers unsatisfied. Public defect datasets are readily available,
so a researcher with in-depth knowledge of multiple algorithms, or the ability to apply
them, can develop models suited to efficient defect prediction. The biggest challenge for
any researcher is first to analyse the dataset in hand and then to select an appropriate
technique to obtain the desired outcomes. In this paper, the focus is on regression. More
precisely, we use tree regressors to predict the presence of defects. We then show that
they can also be used to predict the number of defects present in the datasets. The tree
regressors used are the decision tree (DT) and extra trees (ET) regressors. They are
analysed in terms of the R-squared coefficient of determination, mean squared error and
root mean squared error. The datasets are taken from the PROMISE repository. They are
ECLIPSE data referring to open-source Java systems. We have chosen the ant dataset and
its different versions. The regressors are trained and tested on this dataset only.
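
The three evaluation measures named in the abstract can be written out directly. The following is a minimal sketch using their standard definitions, not the authors' implementation; the function names and the per-module defect counts are hypothetical and are not drawn from the ant dataset.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error, in the same unit as the target."""
    return math.sqrt(mean_squared_error(actual, predicted))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical defect counts per module (illustrative only).
actual_defects = [0, 1, 0, 3, 2]
predicted_defects = [0.5, 1.0, 0.0, 2.0, 2.5]

mse = mean_squared_error(actual_defects, predicted_defects)
rmse = root_mean_squared_error(actual_defects, predicted_defects)
r2 = r_squared(actual_defects, predicted_defects)
print(f"MSE={mse:.4f} RMSE={rmse:.4f} R2={r2:.4f}")
```

Lower MSE and RMSE indicate predictions closer to the observed defect counts, while an R-squared value closer to 1 indicates that the regressor explains more of the variance in those counts.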
