A New Framework Consisted of Data Preprocessing and Classifier Modelling for Software Defect Prediction

Joint Authors

Huang, Song
Ji, Haijin

Source

Mathematical Problems in Engineering

Issue

Vol. 2018, Issue 2018 (31 Dec. 2018), pp.1-13, 13 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2018-09-06

Country of Publication

Egypt

No. of Pages

13

Main Subjects

Civil Engineering

Abstract EN

Various data preprocessing methods and classifiers have previously been proposed and evaluated for software defect prediction (SDP) across projects.

These novel approaches have provided relatively acceptable prediction results for different software projects.

However, to the best of our knowledge, few researchers have combined data preprocessing with the construction of a robust classifier to improve prediction performance in SDP.

Therefore, this paper presents a new, complete framework for predicting fault-prone software modules.

The proposed framework consists of instance filtering, feature selection, instance reduction, and the construction of a new classifier.

Additionally, a Kolmogorov-Smirnov test shows that the 21 main software metrics commonly follow non-normal distributions.
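As an illustration of the kind of normality check mentioned above, the following is a minimal sketch of a one-sample Kolmogorov-Smirnov test against a normal distribution. It is not the authors' procedure: the data are synthetic, the function names are hypothetical, and the normal parameters are estimated from the sample, which makes the large-sample 5% critical value of 1.36/sqrt(n) conservative.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(sample):
    """KS distance between the empirical CDF and a normal CDF fitted to the sample."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))  # parameters estimated from the data
    d = 0.0
    for i, x in enumerate(sorted(sample)):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def rejects_normality(sample, coeff=1.36):
    """Compare the statistic with the asymptotic 5% critical value coeff / sqrt(n)."""
    return ks_statistic_normal(sample) > coeff / math.sqrt(len(sample))


rng = random.Random(0)
# Hypothetical long-tailed "metric" (assumption: many code metrics are right-skewed).
skewed_metric = [rng.expovariate(1.0) for _ in range(500)]
# Hypothetical metric that really is normally distributed, for contrast.
normal_metric = [rng.gauss(10.0, 2.0) for _ in range(500)]

print("skewed metric rejected:", rejects_normality(skewed_metric))
print("normal metric rejected:", rejects_normality(normal_metric))
```

A metric flagged by such a check violates the Gaussian assumptions behind many standard loss functions, which is the motivation the abstract gives for a correntropy-based classifier.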

Therefore, the newly proposed classifier is built on the maximum correntropy criterion (MCC).

The MCC is well-known for its effectiveness in handling non-Gaussian noise.
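To make the robustness claim concrete, here is a minimal sketch of the standard sample estimator of correntropy with a Gaussian kernel, compared with mean squared error on data containing one gross outlier. It does not reproduce the classifier proposed in the paper; the kernel width `sigma`, the unnormalized kernel, and the toy values are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(error, sigma):
    """Unnormalized Gaussian kernel; the omitted constant does not change the maximizer."""
    return math.exp(-(error * error) / (2.0 * sigma * sigma))


def correntropy(y_true, y_pred, sigma=1.0):
    """Sample correntropy: the mean kernel value of the prediction errors."""
    return sum(gaussian_kernel(t - p, sigma) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error, shown only for contrast."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_clean = [1.1, 1.9, 3.1, 3.9, 5.1]      # small errors everywhere
y_outlier = [1.1, 1.9, 3.1, 3.9, 105.0]  # same, but one gross error

print("correntropy:", correntropy(y_true, y_clean), correntropy(y_true, y_outlier))
print("mse:        ", mse(y_true, y_clean), mse(y_true, y_outlier))
```

Because each kernel value lies between 0 and 1, a single outlier can lower the correntropy by at most 1/N, whereas its squared error dominates the MSE. Maximizing correntropy therefore gives large, non-Gaussian errors little influence on the fitted model.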

To evaluate the new framework, the experimental study is carefully designed around nine open-source software projects and their 32 releases, obtained from the PROMISE data repository.

Prediction accuracy is evaluated using the F-measure.
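For reference, the F-measure is the weighted harmonic mean of precision and recall for the defective class. The sketch below assumes the balanced variant (beta = 1, often called F1) and labels of 1 for defective and 0 for clean modules; the abstract does not state which variant the paper uses, and the label vectors are made up for illustration.

```python
def f_measure(y_true, y_pred, beta=1.0):
    """F-measure for the positive (defective = 1) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defective, predicted defective
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, predicted defective
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defective, predicted clean
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical labels for eight modules.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
score = f_measure(y_true, y_pred)
print(score)
```

Unlike plain accuracy, this score ignores true negatives, which matters in defect data where clean modules usually far outnumber defective ones.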

The state-of-the-art methods for Cross-Project Defect Prediction are also included for comparison.

All of the evidence derived from the experiments verifies the effectiveness and robustness of our new framework.

American Psychological Association (APA)

Ji, Haijin & Huang, Song. 2018. A New Framework Consisted of Data Preprocessing and Classifier Modelling for Software Defect Prediction. Mathematical Problems in Engineering, Vol. 2018, no. 2018, pp. 1-13.
https://search.emarefa.net/detail/BIM-1209727

Modern Language Association (MLA)

Ji, Haijin & Huang, Song. A New Framework Consisted of Data Preprocessing and Classifier Modelling for Software Defect Prediction. Mathematical Problems in Engineering No. 2018 (2018), pp. 1-13.
https://search.emarefa.net/detail/BIM-1209727

American Medical Association (AMA)

Ji, Haijin & Huang, Song. A New Framework Consisted of Data Preprocessing and Classifier Modelling for Software Defect Prediction. Mathematical Problems in Engineering. 2018. Vol. 2018, no. 2018, pp. 1-13.
https://search.emarefa.net/detail/BIM-1209727

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1209727