Practical Web Spam Lifelong Machine Learning System with Automatic Adjustment to Current Lifecycle Phase

Author

Luckner, Marcin

Source

Security and Communication Networks

Issue

Vol. 2019, no. 2019 (31 December 2019), pp. 1-16, 16 p.

Publisher

Hindawi Publishing Corporation

Publication date

2019-02-20

Country of publication

Egypt

Number of pages

16

Main subjects

Information Technology and Computer Science

Abstract (EN)

Machine learning techniques are a standard approach in spam detection.

Their quality depends on the quality of the learning set, and when the set is out of date, the quality of classification falls rapidly.

The most popular public web spam dataset that can be used to train a spam detector—WEBSPAM-UK2007—is over ten years old.

Therefore, there is a place for a lifelong machine learning system that can replace the detectors based on a static learning set.

In this paper, we propose a novel web spam recognition system.

The system automatically rebuilds the learning set to avoid classification based on outdated data.

Using built-in automatic selection of the active classifier, the system very quickly attains productive accuracy despite a limited learning set.

Moreover, the system automatically rebuilds the learning set using external data from spam traps and popular web services.

A test on real data from Quora, Reddit, and Stack Overflow proved the high recognition quality.

The obtained average accuracy and F-measure were both 0.98 in semiautomatic mode and both 0.96 in fully automatic mode.
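
The abstract does not say how the active classifier is chosen. As a rough illustration only, the sketch below assumes one plausible scheme: several candidate classifiers are scored by F-measure on a recent labelled window, and the best one becomes active. The names `f_measure`, `select_active`, and the toy candidate classifiers are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, bool]          # (text, is_spam)
Classifier = Callable[[str], bool]  # returns True when text is judged spam


def f_measure(predictions: List[bool], labels: List[bool]) -> float:
    """F1 score of the spam class; 0.0 when there are no true positives."""
    tp = sum(1 for p, y in zip(predictions, labels) if p and y)
    fp = sum(1 for p, y in zip(predictions, labels) if p and not y)
    fn = sum(1 for p, y in zip(predictions, labels) if not p and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_active(candidates: Dict[str, Classifier],
                  recent: List[Sample]) -> str:
    """Return the name of the candidate with the highest F-measure
    on the recent labelled window (assumed selection rule)."""
    labels = [y for _, y in recent]
    scores = {
        name: f_measure([clf(text) for text, _ in recent], labels)
        for name, clf in candidates.items()
    }
    return max(scores, key=scores.get)


# Hypothetical candidates: a keyword rule and a trivial baseline.
candidates: Dict[str, Classifier] = {
    "keyword": lambda text: "buy" in text.lower(),
    "always_spam": lambda text: True,
}

# Hypothetical recent window of labelled comments.
recent: List[Sample] = [
    ("Buy cheap pills", True),
    ("How do I sort a list?", False),
    ("buy now", True),
    ("Python question", False),
]

active = select_active(candidates, recent)
```

In a lifelong setting this selection would presumably be repeated each time the learning set is rebuilt, so that the active classifier tracks the current lifecycle phase; the actual criterion used by the system is described in the full paper, not in this record.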

American Psychological Association (APA) citation style

Luckner, Marcin. 2019. Practical Web Spam Lifelong Machine Learning System with Automatic Adjustment to Current Lifecycle Phase. Security and Communication Networks, Vol. 2019, no. 2019, pp. 1-16.
https://search.emarefa.net/detail/BIM-1210522

Modern Language Association (MLA) citation style

Luckner, Marcin. Practical Web Spam Lifelong Machine Learning System with Automatic Adjustment to Current Lifecycle Phase. Security and Communication Networks No. 2019 (2019), pp. 1-16.
https://search.emarefa.net/detail/BIM-1210522

American Medical Association (AMA) citation style

Luckner, Marcin. Practical Web Spam Lifelong Machine Learning System with Automatic Adjustment to Current Lifecycle Phase. Security and Communication Networks. 2019. Vol. 2019, no. 2019, pp. 1-16.
https://search.emarefa.net/detail/BIM-1210522

Data type

Articles

Text language

English

Notes

Includes bibliographical references

Record ID

BIM-1210522