Evolutionary Feature Selection for Big Data Classification: A MapReduce Approach

Co-authors

del Río, Sara
Peralta, Daniel
Ramírez-Gallego, Sergio
Triguero, Isaac
Benitez, Jose M.
Herrera, F.

Source

Mathematical Problems in Engineering

Issue

Vol. 2015, no. 2015 (31 December 2015), pp. 1-11, 11 p.

Publisher

Hindawi Publishing Corporation

Publication date

2015-10-12

Country of publication

Egypt

Number of pages

11

Main subjects

Civil Engineering

Abstract (EN)

Nowadays, many disciplines have to deal with big datasets that additionally involve a high number of features.

Feature selection methods aim at eliminating noisy, redundant, or irrelevant features that may deteriorate the classification performance.

However, traditional methods do not scale well enough to cope with datasets of millions of instances and still produce useful results within a limited time.

This paper presents a feature selection algorithm based on evolutionary computation that uses the MapReduce paradigm to obtain subsets of features from big datasets.

The algorithm decomposes the original dataset into blocks of instances and learns from them in the map phase; the reduce phase then merges the partial results into a final vector of feature weights. Applying a threshold to this vector determines the selected subset of features, which makes the feature selection procedure flexible to apply.
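The map and reduce steps described above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the (1+1) evolutionary search, the nearest-centroid fitness, the averaging of per-block masks, and the `>=` threshold rule are all assumptions chosen for brevity, every function name is hypothetical, and the blocks are processed sequentially rather than on a MapReduce cluster.

```python
import random
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], int]  # (feature values, class label)


def split_into_blocks(data: List[Row], n_blocks: int) -> List[List[Row]]:
    """Partition the instances into disjoint blocks (round-robin).
    Assumes n_blocks <= len(data), so no block is empty."""
    return [data[i::n_blocks] for i in range(n_blocks)]


def fitness(mask: List[int], block: List[Row]) -> float:
    """Assumed fitness: nearest-centroid training accuracy on the masked
    features, minus a small penalty per selected feature."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    sums, counts = {}, {}
    for x, y in block:
        acc = sums.setdefault(y, [0.0] * len(idx))
        for k, j in enumerate(idx):
            acc[k] += x[j]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(x: Sequence[float]) -> int:
        return min(centroids, key=lambda y: sum(
            (x[j] - c) ** 2 for j, c in zip(idx, centroids[y])))

    accuracy = sum(predict(x) == y for x, y in block) / len(block)
    return accuracy - 0.01 * len(idx) / len(mask)


def map_phase(block: List[Row], rng: random.Random,
              generations: int = 30) -> List[int]:
    """Map step: a simple (1+1) evolutionary search on one block.
    Returns a binary mask (1 = feature selected)."""
    d = len(block[0][0])
    best = [rng.randint(0, 1) for _ in range(d)]
    best_fit = fitness(best, block)
    for _ in range(generations):
        # Flip each bit independently with probability 1/d.
        child = [1 - bit if rng.random() < 1.0 / d else bit for bit in best]
        child_fit = fitness(child, block)
        if child_fit >= best_fit:
            best, best_fit = child, child_fit
    return best


def reduce_phase(masks: List[List[int]]) -> List[float]:
    """Reduce step: average the per-block masks into one weight per feature."""
    n = len(masks)
    return [sum(column) / n for column in zip(*masks)]


def select_features(weights: List[float], threshold: float) -> List[int]:
    """Keep the features whose weight reaches the threshold (assumed rule)."""
    return [j for j, w in enumerate(weights) if w >= threshold]


# Synthetic data: feature 0 tracks the class, features 1-4 are noise.
rng = random.Random(0)
data: List[Row] = []
for _ in range(200):
    label = rng.randint(0, 1)
    noise = [rng.random() for _ in range(4)]
    data.append(([10.0 * label + rng.random()] + noise, label))

blocks = split_into_blocks(data, n_blocks=4)
weights = reduce_phase([map_phase(block, rng) for block in blocks])
print(weights, select_features(weights, threshold=0.5))
```

Because the reduce step yields weights rather than a fixed subset, the threshold can be changed afterwards without rerunning the search on any block, which is the flexibility the abstract refers to.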

The feature selection method is evaluated by using three well-known classifiers (SVM, Logistic Regression, and Naive Bayes) implemented within the Spark framework to address big data problems.

In the experiments, datasets with up to 67 million instances and up to 2000 attributes were processed, showing that this is a suitable framework for evolutionary feature selection, improving both classification accuracy and runtime when dealing with big data problems.

American Psychological Association (APA) citation style

Peralta, Daniel & del Río, Sara & Ramírez-Gallego, Sergio & Triguero, Isaac & Benitez, Jose M. & Herrera, F. 2015. Evolutionary Feature Selection for Big Data Classification: A MapReduce Approach. Mathematical Problems in Engineering, Vol. 2015, no. 2015, pp. 1-11.
https://search.emarefa.net/detail/BIM-1073312

Modern Language Association (MLA) citation style

Peralta, Daniel…[et al.]. Evolutionary Feature Selection for Big Data Classification: A MapReduce Approach. Mathematical Problems in Engineering No. 2015 (2015), pp. 1-11.
https://search.emarefa.net/detail/BIM-1073312

American Medical Association (AMA) citation style

Peralta, Daniel & del Río, Sara & Ramírez-Gallego, Sergio & Triguero, Isaac & Benitez, Jose M. & Herrera, F. Evolutionary Feature Selection for Big Data Classification: A MapReduce Approach. Mathematical Problems in Engineering. 2015. Vol. 2015, no. 2015, pp. 1-11.
https://search.emarefa.net/detail/BIM-1073312

Data type

Articles

Text language

English

Notes

Includes bibliographical references

Record number

BIM-1073312