Detecting bug severity level using machine learning techniques

Other Title(s)

Assigning the severity level of bugs using machine learning techniques

Dissertant

al-Jundi, Hamzah

Thesis advisor

Murad, Sharifah

University

Middle East University

Faculty

Faculty of Information Technology

Department

Computer Science Department

University Country

Jordan

Degree

Master

Degree Date

2021

Arabic Abstract (English translation)

A software bug is defined as the set of problems that occur during the phases of building a project and that lead to an incorrect or unexpected result.

In the software testing process, the key stage is predicting the severity of bug reports.

However, classifying bug reports manually requires time and experienced staff, which delays the fixing of high-priority bugs.

In this thesis, a framework is proposed for assigning the appropriate severity level to bug reports by assigning a severity value to each report. The aim of the framework is to avoid the time consumed by assigning bug severity manually and to improve the accuracy and effectiveness of predicting the severity of bug reports.

The effectiveness and validity of the framework were verified by testing it on datasets extracted from JIRA using the dashboard of TETCO Company, a closed-source project that has not been used in previous research and that contains 2355 bug reports, in order to obtain better performance and higher accuracy. The experiments were conducted on this real dataset through deep learning using two algorithms: long short-term memory (LSTM) and the recurrent neural network (RNN).

The results of our experiment indicate that the deep-learning-based framework for assigning the appropriate severity level to bug reports predicts the severity of bug reports with high accuracy: the severity prediction rate based on LSTM reaches 0.858, while the severity prediction rate based on RNN reaches 0.58, which means that the LSTM algorithm was more accurate than the RNN algorithm in predicting the appropriate severity level for bug reports.

English Abstract

Software maintenance is the process of modifying a component or system after delivery, in order to correct defects, improve quality characteristics, or adapt to a changing environment (ISTQB, 2019).

To reduce maintenance cost, quality assurance engineers ensure that the software meets the requirements of the software owner and of the user by applying testing techniques such as usability testing and performance testing.

When the testing team finds a bug, the bug is reported to the development team, and after the bug is resolved, the testing team re-tests it.

This process repeats each time the quality assurance team finds a bug.

A bug report should contain all the information the developers need, such as the steps to reproduce the bug, its priority and severity, and a brief description of it.

The most common factor that makes life harder for software quality testers and developers is the limitation of time and human resources, which may lead them to set aside some of the reported bugs in order to deal with more critical ones.

This study aims to overcome these problems by automating the assignment of a severity level to newly reported bugs, replacing manual severity assignment.

This thesis focuses on detecting bug severity (severe or non-severe) using a machine learning approach. The features of the bug reports are cleaned using text mining techniques such as tokenization and stemming, and LSTM and RNN are then compared to evaluate which technique gives the best result in assigning bug severity.
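As an illustration of the cleaning step, the sketch below tokenizes a bug description and applies a deliberately crude suffix-stripping stemmer. The function names, the stop-word list, and the suffix rules are assumptions made for this example; the abstract does not say which tokenizer or stemmer was used, and a real pipeline would more likely rely on an established stemmer such as Porter's.

```python
import re

# Hypothetical stop-word list, kept tiny for illustration.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of", "on", "in", "when"}

# Crude suffix rules standing in for a real stemmer (an assumption,
# not the method used in the thesis).
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Tokenize, drop stop words, and stem what remains."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]
```

Calling `preprocess` on a description yields a list of stemmed tokens that later steps can count or encode. The stems are not always dictionary words ("saving" becomes "sav"), which is normal for rule-based stemming.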

The implementation is divided into four main phases: in the first phase the dataset is extracted, in the second the dataset is pre-processed, the third is feature selection, and in the last, called the prediction phase, the framework produces a prediction.
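The abstract does not say how features are selected. One common choice for text fed to a recurrent network, shown here purely as an assumption, is to keep the most frequent tokens as a vocabulary and encode each report as a fixed-length sequence of integer ids. The names `build_vocab`, `encode`, `PAD_ID`, and `UNK_ID` are hypothetical.

```python
from collections import Counter

PAD_ID = 0  # reserved id used to pad short sequences
UNK_ID = 1  # reserved id for tokens outside the vocabulary


def build_vocab(token_lists: list[list[str]], max_size: int) -> dict[str, int]:
    """Keep the max_size most frequent tokens and give each an integer id."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    kept = [tok for tok, _ in counts.most_common(max_size)]
    # Ids start at 2 because 0 and 1 are reserved above.
    return {tok: idx + 2 for idx, tok in enumerate(kept)}


def encode(tokens: list[str], vocab: dict[str, int], length: int) -> list[int]:
    """Map tokens to ids, then truncate or pad to exactly `length` entries."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokens][:length]
    return ids + [PAD_ID] * (length - len(ids))
```

Limiting the vocabulary acts as a simple frequency-based feature filter: rare tokens collapse into `UNK_ID` rather than each getting its own input dimension.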

The bug report dataset was extracted from the JIRA repository of closed-source projects developed by TETCO Company, located in Riyadh, Saudi Arabia; the dataset contains four main features: project name, bug ID, bug description, and the severity level of the bug.
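A record with these four features could be represented as follows. This is a minimal sketch: the CSV layout and the column names (`project_name`, `bug_id`, `description`, `severity`) are assumptions, since the abstract does not describe the export format.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class BugReport:
    project_name: str
    bug_id: str
    description: str
    severity: str  # assumed to be "severe" or "non-severe"


def load_reports(csv_text: str) -> list[BugReport]:
    """Parse CSV text with the four assumed columns into BugReport objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        BugReport(
            project_name=row["project_name"].strip(),
            bug_id=row["bug_id"].strip(),
            description=row["description"].strip(),
            severity=row["severity"].strip().lower(),
        )
        for row in reader
    ]
```

Normalizing the severity label to lowercase at load time keeps later comparisons against the two class names simple.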

After model training, different evaluation measures were used to evaluate model performance.
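The abstract does not name the measures. For a two-class problem such as severe versus non-severe, accuracy, precision, recall, and F1 are the usual candidates, so the sketch below computes those four under that assumption. The function name `binary_metrics` and the label strings are hypothetical.

```python
def binary_metrics(y_true: list[str], y_pred: list[str],
                   positive: str = "severe") -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Reporting precision and recall alongside accuracy matters when one class is rare, because a model that always predicts the majority class can still score a high accuracy.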

According to the experimental results, we achieved a better result using the LSTM neural network than using the RNN.
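To make the difference between the two architectures concrete, the sketch below implements one time step of a simple recurrent cell and one of an LSTM cell, using scalar states and weights for readability. It assumes that "RNN" in the abstract means a simple (Elman-style) recurrent network; the parameter names are hypothetical and this is not the model configuration used in the thesis.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def rnn_step(x: float, h: float, w_x: float, w_h: float, b: float) -> float:
    """Simple recurrent cell: the new state is a squashed mix of input and old state."""
    return math.tanh(w_x * x + w_h * h + b)


def lstm_step(x: float, h: float, c: float,
              p: dict[str, float]) -> tuple[float, float]:
    """LSTM cell: gates decide what to forget, what to write, and what to expose."""
    f = sigmoid(p["wf"] * x + p["uf"] * h + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h + p["bg"])  # candidate value
    c_new = f * c + i * g           # cell state: kept part plus newly written part
    h_new = o * math.tanh(c_new)    # hidden state exposed to the next layer
    return h_new, c_new
```

When the forget gate is close to 1 and the input gate close to 0, the cell state passes through a step almost unchanged, whereas the simple cell rewrites its whole state at every step. That separate, gated memory is the usual explanation for LSTM coping better with long text sequences, though the abstract itself does not analyze why LSTM scored higher.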

Main Subjects

Information Technology and Computer Science

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : Introduction.

Chapter Two : Background and related work.

Chapter Three : Methodology.

Chapter Four : Experimental results.

Chapter Five : Conclusions and recommendations.

References.

American Psychological Association (APA)

al-Jundi, Hamzah. (2021). Detecting bug severity level using machine learning techniques (Master's thesis). Middle East University, Jordan.
https://search.emarefa.net/detail/BIM-1403043

Modern Language Association (MLA)

al-Jundi, Hamzah. Detecting bug severity level using machine learning techniques. Master's thesis. Middle East University, 2021.
https://search.emarefa.net/detail/BIM-1403043

Language

English

Data Type

Arab Theses

Record ID

BIM-1403043
