WSF2: A Novel Framework for Filtering Web Spam

Joint Authors

Laza, Rosalía
Fdez-Riverola, Florentino
Ruano-Ordás, D.
Méndez, J. R.
Fdez-Glez, J.
Pavón, R.

Source

Scientific Programming

Issue

Vol. 2016, Issue 2016 (31 Dec. 2016), pp. 1-18, 18 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2016-01-19

Country of Publication

Egypt

No. of Pages

18

Main Subjects

Mathematics

Abstract EN

In recent years, research on web spam filtering has gained interest from both academia and industry.

In this context, although many successful antispam techniques are available (i.e., content-based, link-based, and hiding), an adequate combination of different algorithms, supported by an advanced web spam filtering platform, would offer more promising results.

To this end, we propose the WSF2 framework, a new platform particularly suitable for filtering spam content on web pages.

Currently, our framework allows the easy combination of different filtering techniques including, but not limited to, regular expressions and well-known classifiers (i.e., Naïve Bayes, Support Vector Machines, and C5.0).
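The idea of combining several filtering techniques can be illustrated with a minimal Python sketch. It is not the WSF2 API: the abstract does not describe how WSF2 combines its filters, so the weighted score-sum scheme, the function names (`regex_filter`, `keyword_density_filter`, `is_spam`), the weights, and the threshold are all assumptions made for illustration. The keyword-density check is only a hypothetical stand-in for a trained classifier such as Naïve Bayes.

```python
import re
from typing import Callable, List, Tuple

# A filter maps page text to True (looks like spam) or False.
Filter = Callable[[str], bool]


def regex_filter(pattern: str) -> Filter:
    """Build a filter that fires when the pattern occurs in the text."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return lambda text: compiled.search(text) is not None


def keyword_density_filter(keyword: str, max_ratio: float) -> Filter:
    """Hypothetical stand-in for a trained classifier: fires when one
    keyword makes up more than max_ratio of all words on the page."""
    def check(text: str) -> bool:
        words = text.lower().split()
        if not words:
            return False
        return words.count(keyword) / len(words) > max_ratio
    return check


def is_spam(text: str, rules: List[Tuple[Filter, float]], threshold: float) -> bool:
    """Sum the weights of every filter that fires and compare to a threshold.
    The score-sum scheme is an assumption, not WSF2's documented behaviour."""
    score = sum(weight for rule, weight in rules if rule(text))
    return score >= threshold


# Illustrative rules and weights (all hypothetical).
RULES = [
    (regex_filter(r"\bcheap\s+pills\b"), 2.0),
    (regex_filter(r"click\s+here"), 1.0),
    (keyword_density_filter("casino", 0.2), 2.5),
]
THRESHOLD = 3.0
```

With these illustrative weights, no single filter reaches the threshold on its own, so a page is flagged only when at least two independent signals agree.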

Applying our WSF2 framework to the publicly available WEBSPAM-UK2007 corpus, we demonstrate that a simple combination of different techniques improves on the accuracy of single classifiers in web spam detection.

As a result, we conclude that the proposed filtering platform is a powerful tool for boosting applied research in this area.

American Psychological Association (APA)

Fdez-Glez, J. & Ruano-Ordás, D. & Laza, Rosalía & Méndez, J. R. & Pavón, R. & Fdez-Riverola, Florentino. 2016. WSF2: A Novel Framework for Filtering Web Spam. Scientific Programming, Vol. 2016, no. 2016, pp. 1-18.
https://search.emarefa.net/detail/BIM-1118306

Modern Language Association (MLA)

Fdez-Glez, J. … [et al.]. WSF2: A Novel Framework for Filtering Web Spam. Scientific Programming No. 2016 (2016), pp. 1-18.
https://search.emarefa.net/detail/BIM-1118306

American Medical Association (AMA)

Fdez-Glez, J. & Ruano-Ordás, D. & Laza, Rosalía & Méndez, J. R. & Pavón, R. & Fdez-Riverola, Florentino. WSF2: A Novel Framework for Filtering Web Spam. Scientific Programming. 2016. Vol. 2016, no. 2016, pp. 1-18.
https://search.emarefa.net/detail/BIM-1118306

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1118306