A novel feature selection method based on maximum likelihood logistic regression for imbalanced learning in software defect prediction
Co-authors
Bashir, Kamal
Li, Tianrui
Yahya, Mahama
Source
The International Arab Journal of Information Technology
Issue
Vol. 17, No. 5 (30 September 2020), pp. 721-730, 10 p.
Publisher
Zarqa University, Deanship of Scientific Research
Publication Date
2020-09-30
Country of Publication
Jordan
Number of Pages
10
Main Subjects
Abstract EN
The most frequently used machine learning feature ranking approaches fail to produce an optimal feature subset for accurate prediction of defective software modules in out-of-sample data.
Machine learning Feature Selection (FS) algorithms such as Chi-Square (CS), Information Gain (IG), Gain Ratio (GR), RelieF (RF) and Symmetric Uncertainty (SU) perform relatively poorly at prediction, even after the class distribution in the training data is balanced.
In this study, we propose a novel FS method based on the Maximum Likelihood Logistic Regression (MLLR).
We apply this method to six software defect datasets, in both their sampled and unsampled forms, to select useful features for classification in the context of Software Defect Prediction (SDP).
The Support Vector Machine (SVM) and Random Forest (RaF) classifiers are then applied to the feature subsets selected from the sampled and unsampled datasets.
Model performance, measured by the Area Under the Receiver Operating Characteristic Curve (AUC), is compared across all FS methods considered.
The Analysis of Variance (ANOVA) F-test results validate the superiority of the proposed method over all the other FS techniques, on both sampled and unsampled data.
The results confirm that MLLR can be useful in selecting an optimal feature subset for more accurate prediction of defective modules in the software development process.
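The abstract does not state how MLLR scores or selects features, so the following is only a minimal sketch of one plausible reading: fit a logistic regression by maximum likelihood, rank standardized features by the absolute value of their coefficients, keep the top-ranked ones, and measure AUC on held-out data. The ranking criterion, the synthetic imbalanced dataset, and all function names (`fit_logistic_mle`, `rank_features`, `make_data`, and so on) are assumptions made for illustration and are not taken from the paper. The sketch also omits the resampling step and scores the held-out data with a second logistic model rather than the SVM and Random Forest classifiers used in the study.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(X, ref):
    # Scale the columns of X with the mean and std of `ref` so coefficients are comparable.
    n, d = len(ref), len(ref[0])
    means = [sum(r[j] for r in ref) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in ref) / n) or 1.0 for j in range(d)]
    return [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]


def fit_logistic_mle(X, y, lr=0.1, epochs=300):
    # Maximize the mean Bernoulli log-likelihood by batch gradient ascent.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = yi - sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b += lr * grad_b / n
        w = [wj + lr * g / n for wj, g in zip(w, grad_w)]
    return w, b


def rank_features(w):
    # Assumed criterion: a larger |coefficient| marks a more useful feature.
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


def auc(scores, labels):
    # Probability that a random positive outscores a random negative (ties count 0.5).
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_data(n=400, seed=0):
    # Synthetic imbalanced data: features 0 and 1 drive the label, features 2-4 are noise.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(5)]
        p = sigmoid(2.0 * row[0] - 1.5 * row[1] - 2.0)
        X.append(row)
        y.append(1 if rng.random() < p else 0)
    return X, y


X, y = make_data()
X_tr, y_tr, X_te, y_te = X[:300], y[:300], X[300:], y[300:]
Xs_tr, Xs_te = standardize(X_tr, X_tr), standardize(X_te, X_tr)

w, _ = fit_logistic_mle(Xs_tr, y_tr)
ranking = rank_features(w)
top2 = ranking[:2]


def select(rows):
    # Keep only the top-ranked feature columns.
    return [[r[j] for j in top2] for r in rows]


w_sel, b_sel = fit_logistic_mle(select(Xs_tr), y_tr)
scores = [b_sel + sum(wj * xj for wj, xj in zip(w_sel, row)) for row in select(Xs_te)]
test_auc = auc(scores, y_te)
print("feature ranking:", ranking, "held-out AUC:", round(test_auc, 3))
```

Standardizing with training-set statistics only keeps the held-out rows out of the fitting step, which matters because the abstract's claim is about out-of-sample prediction.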
American Psychological Association (APA) citation style
Bashir, Kamal & Li, Tianrui & Yahya, Mahama. 2020. A novel feature selection method based on maximum likelihood logistic regression for imbalanced learning in software defect prediction. The International Arab Journal of Information Technology, Vol. 17, no. 5, pp. 721-730.
https://search.emarefa.net/detail/BIM-1439746
Modern Language Association (MLA) citation style
Bashir, Kamal…[et al.]. A novel feature selection method based on maximum likelihood logistic regression for imbalanced learning in software defect prediction. The International Arab Journal of Information Technology Vol. 17, no. 5 (Sep. 2020), pp. 721-730.
https://search.emarefa.net/detail/BIM-1439746
American Medical Association (AMA) citation style
Bashir, Kamal & Li, Tianrui & Yahya, Mahama. A novel feature selection method based on maximum likelihood logistic regression for imbalanced learning in software defect prediction. The International Arab Journal of Information Technology. 2020. Vol. 17, no. 5, pp. 721-730.
https://search.emarefa.net/detail/BIM-1439746
Data Type
Articles
Text Language
English
Notes
Includes bibliographical references: p. 729-730
Record ID
BIM-1439746