Valid Probabilistic Anomaly Detection Models for System Logs

Co-authors

Liu, Chunbo
Pan, Lanlan
Gu, Zhaojun
Wang, Jialiang
Ren, Yitong
Wang, Zhi

Source

Wireless Communications and Mobile Computing

Issue

Vol. 2020, no. 2020 (31 December 2020), pp. 1-12, 12 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2020-11-16

Country of Publication

Egypt

Number of Pages

12

Main Subjects

Information Technology and Computer Science

Abstract EN

System logs record the system status and important events during operation in detail.

Detecting anomalies in system logs is a common practice in modern large-scale distributed systems.

Yet the threshold-based classification models used for anomaly detection output only one of two labels, normal or abnormal, and give no probability estimate of whether a prediction is correct.

In this paper, the Venn-Abers predictor, a statistical learning algorithm, is adopted to evaluate the confidence of prediction results in system log anomaly detection.

It can calculate the probability distribution over labels for a set of samples and, to some extent, assess the quality of the predicted labels.
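
The abstract does not spell out how the probabilities are computed. As an illustration only, the sketch below follows one common formulation, the inductive Venn-Abers predictor: a held-out calibration set of (score, label) pairs is fitted with isotonic regression twice, once with the test score tentatively labeled 0 and once labeled 1, giving a probability pair (p0, p1). The function names `pava` and `venn_abers` and the toy scores are hypothetical, and this is not claimed to be the authors' implementation.

```python
def pava(points):
    """Isotonic (non-decreasing) least-squares fit by pool-adjacent-violators.

    points: iterable of (score, label) pairs with label in {0, 1}.
    Returns (sorted distinct scores, fitted value for each distinct score).
    """
    # Pool tied scores first: distinct score -> (label sum, count).
    grouped = {}
    for score, label in points:
        total, count = grouped.get(score, (0.0, 0))
        grouped[score] = (total + label, count + 1)
    scores = sorted(grouped)

    blocks = []  # each block: [label sum, weight, number of distinct scores]
    for score in scores:
        total, count = grouped[score]
        blocks.append([total, count, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total2, count2, n2 = blocks.pop()
            blocks[-1][0] += total2
            blocks[-1][1] += count2
            blocks[-1][2] += n2

    fitted = []
    for total, count, n in blocks:
        fitted.extend([total / count] * n)
    return scores, fitted


def venn_abers(calibration, test_score):
    """Return (p0, p1): calibrated probabilities of label 1 for one test score."""
    pair = []
    for tentative_label in (0, 1):
        augmented = list(calibration) + [(test_score, tentative_label)]
        scores, fitted = pava(augmented)
        pair.append(fitted[scores.index(test_score)])
    return pair[0], pair[1]


# Toy calibration data (made up): classifier scores with true labels.
calibration = [(0.1, 0), (0.2, 0), (0.3, 1), (0.4, 0),
               (0.6, 1), (0.7, 1), (0.9, 1)]
p0, p1 = venn_abers(calibration, 0.5)
print(p0, p1)
```

The width of the interval between p0 and p1 indicates how much the calibration data constrain the estimate; with only seven calibration points, as here, the interval is wide.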

Two Venn-Abers predictors, LR-VA and SVM-VA, have been implemented based on Logistic Regression and Support Vector Machine, respectively.

The differences among the algorithms are then taken into account to build a multimodel fusion algorithm by Stacking.

A Venn-Abers predictor based on this Stacking algorithm, called Stacking-VA, is then implemented.

The performances of four types of algorithms (unimodel, Venn-Abers predictor based on unimodel, multimodel, and Venn-Abers predictor based on multimodel) are compared in terms of validity and accuracy.
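
The abstract does not state which measures are used for validity and accuracy. As a hedged illustration, the sketch below computes accuracy and recall from thresholded labels, and uses log loss as an assumed stand-in for scoring probability quality. The rule that merges a Venn-Abers pair into a single probability, p = p1 / (1 - p0 + p1), is the one commonly used with log loss; the function names, the 0.5 threshold, and the toy data are hypothetical.

```python
import math


def merge_interval(p0, p1):
    """Collapse a Venn-Abers pair (p0, p1) into one probability of label 1."""
    return p1 / (1.0 - p0 + p1)


def accuracy_and_recall(y_true, y_pred):
    """Accuracy over all samples and recall on the positive (anomalous) class."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    positives = true_pos + false_neg
    recall = true_pos / positives if positives else 0.0
    return correct / len(pairs), recall


def log_loss(y_true, probs, eps=1e-12):
    """Mean negative log-likelihood; lower means better probabilities."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= math.log(p) if t == 1 else math.log(1.0 - p)
    return total / len(y_true)


# Toy data (made up): 1 = anomalous, 0 = normal.
y_true = [1, 0, 1, 1, 0]
va_pairs = [(0.7, 0.9), (0.1, 0.2), (0.3, 0.5), (0.8, 1.0), (0.4, 0.7)]

probs = [merge_interval(p0, p1) for p0, p1 in va_pairs]
y_pred = [1 if p >= 0.5 else 0 for p in probs]  # assumed 0.5 threshold

accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(accuracy, recall, log_loss(y_true, probs))
```

Accuracy and recall depend only on the thresholded labels, whereas log loss penalizes confident wrong probabilities, which is why a probabilistic predictor is usually judged on both kinds of measure.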

Experiments are carried out on a log dataset of the Hadoop Distributed File System (HDFS).

For the comparative experiments on unimodels, the results show that the validities of LR-VA and SVM-VA are better than those of the two corresponding underlying models.

Compared with the underlying model, the accuracy of the SVM-VA predictor is better than that of the LR-VA predictor, and, more significantly, the recall rate increases from 81% to 94%.

In the case of experiments on multiple models, the algorithm based on Stacking multimodel fusion is significantly superior to the underlying classifier.

The average accuracy of Stacking-VA exceeds 0.95, and its prediction results are more stable than those of LR-VA and SVM-VA.

Experimental results show that the Venn-Abers predictor is a flexible tool that can make accurate and valid probability predictions in the field of system log anomaly detection.

American Psychological Association (APA) Citation Style

Liu, Chunbo & Pan, Lanlan & Gu, Zhaojun & Wang, Jialiang & Ren, Yitong & Wang, Zhi. 2020. Valid Probabilistic Anomaly Detection Models for System Logs. Wireless Communications and Mobile Computing, Vol. 2020, no. 2020, pp. 1-12.
https://search.emarefa.net/detail/BIM-1214626

Modern Language Association (MLA) Citation Style

Liu, Chunbo…[et al.]. Valid Probabilistic Anomaly Detection Models for System Logs. Wireless Communications and Mobile Computing No. 2020 (2020), pp. 1-12.
https://search.emarefa.net/detail/BIM-1214626

American Medical Association (AMA) Citation Style

Liu, Chunbo & Pan, Lanlan & Gu, Zhaojun & Wang, Jialiang & Ren, Yitong & Wang, Zhi. Valid Probabilistic Anomaly Detection Models for System Logs. Wireless Communications and Mobile Computing. 2020. Vol. 2020, no. 2020, pp. 1-12.
https://search.emarefa.net/detail/BIM-1214626

Data Type

Articles

Text Language

English

Notes

Includes bibliographical references

Record ID

BIM-1214626