An Improved Method for Cross-Project Defect Prediction by Simplifying Training Data
Co-authors
He, Peng
Li, Bing
He, Yao
Yu, Lvjun
Source
Mathematical Problems in Engineering
Issue
Vol. 2018, no. 2018 (31 December 2018), pp. 1-18, 18 p.
Publisher
Hindawi Publishing Corporation
Publication date
2018-06-07
Country of publication
Egypt
Number of pages
18
Main subjects
Abstract (EN)
Cross-project defect prediction (CPDP) on projects with limited historical data has attracted much attention.
To the best of our knowledge, however, the performance of existing approaches is usually poor because of low-quality cross-project training data.
The objective of this study is to propose TDSelector, an improved method for CPDP that simplifies the training data by considering both the similarity of each training instance and the number of defects it has (denoted by defects), and to demonstrate the effectiveness of the proposed method.
Our work consists of three main steps.
First, we constructed TDSelector in terms of a linear weighted function of instances’ similarity and defects.
Second, the basic defect predictor used in our experiments was built by using the Logistic Regression classification algorithm.
Third, we analyzed the impacts of different combinations of similarity and the normalization of defects on prediction performance and then compared our method with two existing methods.
We evaluated our method on 14 projects collected from two public repositories.
The results suggest that the proposed TDSelector method performs, on average, better than both baseline methods, increasing the AUC values by up to 10.6% and 4.3%, respectively.
That is, the inclusion of defects is indeed helpful for selecting high-quality training instances for CPDP.
Moreover, the combination of Euclidean distance and linear normalization is the preferred configuration for TDSelector.
An additional experiment also shows that selecting those instances with more bugs directly as training data can further improve the performance of the bug predictor trained by our method.
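The selection idea described in the abstract (a linear weighted function of similarity and defects, using Euclidean distance and linear normalization) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the weight `alpha`, the cutoff `top_k`, and the choice to measure similarity against the target project's centroid are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def linear_normalize(values):
    """Min-max (linear) normalization to [0, 1]; constant input maps to zeros."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span

def euclidean_similarity(train_X, target_X):
    """Assumed similarity: 1 / (1 + Euclidean distance to the target centroid)."""
    centroid = np.asarray(target_X, dtype=float).mean(axis=0)
    dist = np.linalg.norm(np.asarray(train_X, dtype=float) - centroid, axis=1)
    return 1.0 / (1.0 + dist)

def select_training_instances(train_X, defects, target_X, alpha=0.5, top_k=2):
    """Rank candidates by alpha * similarity + (1 - alpha) * normalized defects.

    alpha and top_k are hypothetical parameters chosen for illustration.
    Returns the indices of the top_k instances and the full score vector.
    """
    similarity = euclidean_similarity(train_X, target_X)
    score = alpha * similarity + (1 - alpha) * linear_normalize(defects)
    order = np.argsort(-score, kind="stable")  # highest score first
    return order[:top_k], score

# Toy data: three candidate training instances with two metrics each.
train_X = [[0, 0], [3, 4], [0, 0]]
defects = [0, 5, 10]
target_X = [[0, 0]]
selected, score = select_training_instances(train_X, defects, target_X)
```

With `alpha` closer to 1 the ranking is driven mostly by similarity, and with `alpha` closer to 0 it is driven mostly by defect counts. The selected instances would then be used to train a classifier such as Logistic Regression, as the abstract describes.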
American Psychological Association (APA) citation style
He, Peng & He, Yao & Yu, Lvjun & Li, Bing. 2018. An Improved Method for Cross-Project Defect Prediction by Simplifying Training Data. Mathematical Problems in Engineering, Vol. 2018, no. 2018, pp. 1-18.
https://search.emarefa.net/detail/BIM-1206378
Modern Language Association (MLA) citation style
He, Peng…[et al.]. An Improved Method for Cross-Project Defect Prediction by Simplifying Training Data. Mathematical Problems in Engineering No. 2018 (2018), pp.1-18.
https://search.emarefa.net/detail/BIM-1206378
American Medical Association (AMA) citation style
He, Peng & He, Yao & Yu, Lvjun & Li, Bing. An Improved Method for Cross-Project Defect Prediction by Simplifying Training Data. Mathematical Problems in Engineering. 2018. Vol. 2018, no. 2018, pp. 1-18.
https://search.emarefa.net/detail/BIM-1206378
Data type
Articles
Text language
English
Notes
Includes bibliographical references
Record number
BIM-1206378