Evaluation of different data mining algorithms with KDD CUP 99 data set

Joint Authors

al-Maymuri, Safa O.
Jasim, Firas S.

Source

Journal of Babylon University : Journal of Applied and Pure Sciences

Issue

Vol. 21, Issue 8 (31 Dec. 2013), pp.2663-2681, 19 p.

Publisher

University of Babylon

Publication Date

2013-12-31

Country of Publication

Iraq

No. of Pages

19

Main Subjects

Information Technology and Computer Science

Abstract AR (translated)

Data mining is one of the modern techniques for analyzing large data sets such as the KDD CUP 99 data set, which is specialized for the field of intrusion detection.

The aim of this research is to review and evaluate the data mining algorithms that have been applied to the KDD CUP 99 data to classify attacks, measuring the results in terms of accuracy and speed on the one hand, and identifying the best classification algorithm for this data on the other.

The results showed that data mining algorithms vary in how well they detect attacks and determine their class.

The Random Forest algorithm had the highest detection rate for DOS attacks, while the Fuzzy Logic algorithm classified Probe attacks at a high rate.

R2U and R2L attacks were classified well by the MARS algorithm, Fuzzy Logic, and the random trees classifier, respectively.

The MARS algorithm had the highest classification accuracy, while the PART algorithm performed very poorly.

The ONER algorithm was trained in the least time, while the Fuzzy Logic and MLP algorithms trained slowly.

Abstract EN

Data mining is a modern technique for analyzing large volumes of data, such as the KDD CUP 99 data set used in network intrusion detection.

Data mining technology can handle large amounts of data.

The field is still developing, and because it is growing rapidly it can become more effective.

This paper surveys most of the data mining algorithms that have been applied to the KDD CUP 99 data set to classify attacks, and compares their reported results. The comparison uses four performance measures: True Positive Rate (TP), False Alarm Rate (FP), Percentage of Successful Prediction (PSP), and training time (TT). The purpose of the survey is to compare the results and select the best system for intrusion detection (classification).

The results showed that the data mining algorithms differ in their detection rates depending on the attack type.

The Random Forest classifier achieved the highest detection rate for DOS attacks, while the Fuzzy Logic algorithm achieved the highest detection rate for Probe attacks.

The R2U and R2L attack categories were identified well by the MARS, Fuzzy Logic, and Random Forest classifiers, respectively.

MARS achieved the highest classification accuracy, while the PART classification algorithm was the least accurate.

OneR had the shortest training time, whereas the Fuzzy Logic and MLP algorithms took longer to train.
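
The performance measures named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the class and function names are hypothetical, the counts are made-up illustration values, and PSP is assumed here to mean the share of all test instances that were classified correctly, expressed as a percentage.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """Confusion counts for one attack class versus everything else (hypothetical helper)."""
    tp: int  # attacks correctly flagged as attacks
    fp: int  # normal records wrongly flagged as attacks
    tn: int  # normal records correctly left alone
    fn: int  # attacks that were missed


def true_positive_rate(c: BinaryCounts) -> float:
    """Fraction of actual attacks that were detected."""
    total = c.tp + c.fn
    return c.tp / total if total else 0.0


def false_alarm_rate(c: BinaryCounts) -> float:
    """Fraction of normal records that raised a false alarm."""
    total = c.fp + c.tn
    return c.fp / total if total else 0.0


def percentage_successful_prediction(c: BinaryCounts) -> float:
    """Assumed definition: correctly classified instances / all instances, in percent."""
    total = c.tp + c.fp + c.tn + c.fn
    return 100.0 * (c.tp + c.tn) / total if total else 0.0


# Made-up counts, for illustration only (not results from the paper).
counts = BinaryCounts(tp=90, fp=5, tn=95, fn=10)
tpr = true_positive_rate(counts)
far = false_alarm_rate(counts)
psp = percentage_successful_prediction(counts)
print(f"TP rate={tpr:.3f}  false alarm rate={far:.3f}  PSP={psp:.1f}%")
```

Training time (TT) is not computed here; under the same assumptions it could be measured by wrapping a classifier's training call between two readings of `time.perf_counter()`.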

American Psychological Association (APA)

al-Maymuri, Safa O. & Jasim, Firas S. 2013. Evaluation of different data mining algorithms with KDD CUP 99 data set. Journal of Babylon University : Journal of Applied and Pure Sciences, Vol. 21, no. 8, pp. 2663-2681.
https://search.emarefa.net/detail/BIM-345528

Modern Language Association (MLA)

al-Maymuri, Safa O. & Jasim, Firas S. Evaluation of different data mining algorithms with KDD CUP 99 data set. Journal of Babylon University : Journal of Applied and Pure Sciences Vol. 21, no. 8 (2013), pp. 2663-2681.
https://search.emarefa.net/detail/BIM-345528

American Medical Association (AMA)

al-Maymuri, Safa O. & Jasim, Firas S. Evaluation of different data mining algorithms with KDD CUP 99 data set. Journal of Babylon University : Journal of Applied and Pure Sciences. 2013. Vol. 21, no. 8, pp. 2663-2681.
https://search.emarefa.net/detail/BIM-345528

Data Type

Journal Articles

Language

English

Notes

Text in English ; abstracts in English and Arabic.

Record ID

BIM-345528