High-Performance Machine Learning for Large-Scale Data Classification considering Class Imbalance
Co-authors
Wang, Xi
Chen, Xianbang
Li, Huaqiang
Li, Xiang
Liu, Yang
Source
Issue
Vol. 2020, Issue 2020 (31 December 2020), pp. 1-16, 16 p.
Publisher
Hindawi Publishing Corporation
Publication date
2020-05-18
Country of publication
Egypt
Number of pages
16
Main subjects
Abstract (EN)
Data classification is currently one of the most important ways to analyze data.
However, as data collection, transmission, and storage technologies have developed, the scale of data has increased sharply.
Additionally, because datasets often contain multiple classes with imbalanced distributions, the class imbalance issue has become increasingly prominent.
Traditional machine learning algorithms lack the ability to handle these issues, so classification efficiency and precision may be significantly affected.
Therefore, this paper presents an improved artificial neural network that enables high-performance classification of imbalanced, large-volume data.
First, the Borderline-SMOTE (synthetic minority oversampling technique) algorithm is employed to balance the training dataset, with the aim of improving the training of the back propagation neural network (BPNN). Then, zero-mean normalization, batch normalization, and the rectified linear unit (ReLU) are employed to optimize the input layer and hidden layers of the BPNN.
Finally, the ensemble learning-based parallelization of the improved BPNN is implemented using the Hadoop framework.
The experimental results support several positive conclusions.
Borderline-SMOTE balances the imbalanced training dataset, which improves training performance and classification accuracy.
The improvements to the input layer and hidden layers also enhance training performance in terms of convergence.
The parallelization and ensemble learning techniques enable the BPNN to perform high-performance, large-scale data classification.
The experimental results show the effectiveness of the presented classification algorithm.
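The oversampling step named in the abstract can be illustrated with a minimal sketch. It follows the commonly described Borderline-SMOTE1 procedure: a minority sample counts as a "danger" sample when at least half, but not all, of its m nearest neighbours belong to the majority class, and synthetic points are interpolated between a danger sample and one of its nearest minority neighbours. The abstract does not state which variant or parameter values the authors used, so the function names and the parameters `m`, `k`, `n_new`, and `seed` below are assumptions made for illustration only.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k candidates closest to point (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def borderline_smote(minority, majority, m=5, k=3, n_new=1, seed=0):
    """Return synthetic minority samples generated near the class border.

    minority, majority: lists of equal-length numeric tuples.
    m: neighbours (from both classes) used to detect borderline samples.
    k: minority neighbours a synthetic sample may be interpolated towards.
    n_new: synthetic samples generated per borderline sample.
    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    labelled = [(p, 0) for p in majority] + [(p, 1) for p in minority]
    synthetic = []
    for p in minority:
        # m nearest neighbours of p in the whole training set, excluding p
        others = [(q, y) for q, y in labelled if q is not p]
        neighbours = sorted(others, key=lambda qy: math.dist(p, qy[0]))[:m]
        n_major = sum(1 for _, y in neighbours if y == 0)
        # Keep only "danger" samples: half or more, but not all,
        # of the neighbours are majority (all-majority is treated as noise)
        if not (m / 2 <= n_major < m):
            continue
        minority_others = [q for q in minority if q is not p]
        if not minority_others:
            continue  # nothing to interpolate towards
        for _ in range(n_new):
            q = rng.choice(k_nearest(p, minority_others, k))
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return synthetic


if __name__ == "__main__":
    minority = [(0.0, 0.0), (1.0, 0.0)]
    majority = [(0.0, 0.5), (1.0, 0.5), (0.5, 0.5)]
    for sample in borderline_smote(minority, majority, m=3, k=1, n_new=2):
        print(sample)
```

The synthetic samples would be appended to the minority class before training. The brute-force neighbour search here is quadratic in the dataset size, so a large-scale setting like the one the abstract targets would need an indexed or distributed neighbour search instead.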
American Psychological Association (APA) citation style
Liu, Yang & Li, Xiang & Chen, Xianbang & Wang, Xi & Li, Huaqiang. 2020. High-Performance Machine Learning for Large-Scale Data Classification considering Class Imbalance. Scientific Programming, Vol. 2020, no. 2020, pp. 1-16.
https://search.emarefa.net/detail/BIM-1208986
Modern Language Association (MLA) citation style
Liu, Yang…[et al.]. High-Performance Machine Learning for Large-Scale Data Classification considering Class Imbalance. Scientific Programming No. 2020 (2020), pp. 1-16.
https://search.emarefa.net/detail/BIM-1208986
American Medical Association (AMA) citation style
Liu, Yang & Li, Xiang & Chen, Xianbang & Wang, Xi & Li, Huaqiang. High-Performance Machine Learning for Large-Scale Data Classification considering Class Imbalance. Scientific Programming. 2020. Vol. 2020, no. 2020, pp. 1-16.
https://search.emarefa.net/detail/BIM-1208986
Data type
Articles
Text language
English
Notes
Includes bibliographical references
Record number
BIM-1208986