Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System
Co-authors
Sahal, Radhya
Zhang, Xiongwei
Saleh, Hager
Younis, Eman M. G.
Ali, Abdelmgeid A.
Source
Issue
Vol. 2020, no. 2020 (31 December 2020), pp. 1-10, 10 p.
Publisher
Hindawi Publishing Corporation
Publication date
2020-12-22
Country of publication
Egypt
Number of pages
10
Main subjects
Abstract (EN)
Twitter is a virtual social network where people share their posts and opinions about the current situation, such as the coronavirus pandemic.
It is considered the most significant streaming data source for machine learning research in terms of analysis, prediction, knowledge extraction, and opinions.
Sentiment analysis is a text analysis method that has gained further significance with the emergence of social networks.
Therefore, this paper introduces a real-time system for sentiment prediction on Twitter streaming data for tweets about the coronavirus pandemic.
The proposed system aims to find the machine learning model that achieves the best performance for coronavirus sentiment prediction and then uses that model in real time.
The proposed system consists of two components: an offline component that develops the sentiment analysis models and an online component that implements the prediction pipeline.
For the offline component, the historical tweet dataset was collected between 23/01/2020 and 01/06/2020 and filtered by the #COVID-19 and #Coronavirus hashtags.
Two feature extraction methods for textual data, n-gram and TF-IDF, were used to extract the essential features of the dataset.
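As an illustration of the two feature extraction methods named in the abstract, the following is a minimal pure-Python sketch, not the authors' implementation. The function names `ngrams` and `tfidf`, the example tweets, and the specific weighting variant (term frequency as a share of the document, inverse document frequency as ln(N / df)) are all assumptions made for the example; the abstract does not state which variant or library was used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-token sequences of a document, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Compute TF-IDF weights over n-gram terms for a list of token lists.

    Assumed variant: tf = count / terms in document, idf = ln(N / df).
    """
    term_lists = [ngrams(doc, n) for doc in docs]
    # Document frequency: number of documents that contain each term.
    df = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up example tweets, already lowercased and tokenized.
tweets = [
    "stay safe during the pandemic".split(),
    "the pandemic is scary".split(),
    "stay home stay safe".split(),
]
unigram_weights = tfidf(tweets, n=1)  # one dict of term weights per tweet
bigram_weights = tfidf(tweets, n=2)
```

Under this variant, a term that occurs in every document receives a weight of zero, while a term confined to one document (such as "scary" above) receives the largest inverse document frequency.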
Then, five standard machine learning algorithms (decision tree, logistic regression, k-nearest neighbors, random forest, and support vector machine) were trained and compared to select the best model for the online prediction component.
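The selection step described above can be sketched as a loop that fits each candidate and keeps the one with the best held-out score. This is a hedged, pure-Python illustration only: of the five algorithms in the abstract, just k-nearest neighbors is implemented here, and `MajorityBaseline` is a stand-in that is not part of the paper. The helper names, the toy feature vectors, and the use of plain accuracy as the metric are assumptions; the abstract does not specify the evaluation protocol.

```python
from collections import Counter


class MajorityBaseline:
    """Stand-in model (not from the paper): predicts the most common training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label for _ in X]


class KNearestNeighbors:
    """Minimal k-nearest-neighbors classifier using squared Euclidean distance."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        predictions = []
        for query in X:
            # Rank training points by distance to the query, then vote among the k nearest.
            nearest = sorted(
                range(len(self.X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self.X[i], query)),
            )[: self.k]
            votes = Counter(self.y[i] for i in nearest)
            predictions.append(votes.most_common(1)[0][0])
        return predictions


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def select_best(models, X_train, y_train, X_test, y_test):
    """Fit every candidate and return (best_name, scores) by held-out accuracy."""
    scores = {
        name: accuracy(y_test, model.fit(X_train, y_train).predict(X_test))
        for name, model in models.items()
    }
    return max(scores, key=scores.get), scores


# Made-up 2-D feature vectors with sentiment labels.
X_train = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.9, 1.1], [0.3, 0.2]]
y_train = ["negative", "negative", "negative", "positive", "positive", "negative"]
X_test = [[0.1, 0.1], [0.95, 1.0], [1.1, 1.0]]
y_test = ["negative", "positive", "positive"]

best_name, scores = select_best(
    {"baseline": MajorityBaseline(), "knn": KNearestNeighbors(k=1)},
    X_train, y_train, X_test, y_test,
)
```

Any object exposing `fit` and `predict` could be added to the `models` dictionary, so the same loop would extend to the other four algorithms if implementations were available.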
The online prediction pipeline was developed using Twitter Streaming API, Apache Kafka, and Apache Spark.
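The shape of such a pipeline (a producer feeding a message topic, and a consumer scoring small batches) can be sketched with the standard library alone. This sketch does not call the Twitter Streaming API, Apache Kafka, or Apache Spark: a `queue.Queue` stands in for the Kafka topic, a finite list stands in for the tweet stream, and `predict_sentiment` is a hypothetical keyword placeholder for the trained model. All names and the batch size are assumptions for illustration.

```python
import queue


def produce(tweets, topic):
    """Producer side: push each incoming tweet onto the topic (a Kafka stand-in)."""
    for text in tweets:
        topic.put(text)
    topic.put(None)  # Sentinel marking the end of this finite demo stream.


def micro_batches(topic, batch_size):
    """Consumer side: group messages into small batches, as a streaming engine might."""
    batch = []
    while True:
        message = topic.get()
        if message is None:
            break
        batch.append(message)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # Flush the final, possibly smaller, batch.


def predict_sentiment(text):
    """Hypothetical placeholder for the trained offline model: keyword lookup only."""
    negative_words = {"scary", "afraid", "worried"}
    return "negative" if negative_words & set(text.lower().split()) else "positive"


topic = queue.Queue()
produce(["Stay safe everyone", "This is scary", "So worried today"], topic)

# Score each micro-batch as it is consumed from the topic.
results = [
    [(text, predict_sentiment(text)) for text in batch]
    for batch in micro_batches(topic, batch_size=2)
]
```

In a real deployment the producer and consumer would run as separate processes and the stream would be unbounded; the sentinel value exists only so that this single-process demonstration terminates.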
The experimental results indicate that the random forest (RF) model with the unigram feature extraction method achieved the best performance, and it is used for sentiment prediction on Twitter streaming data about the coronavirus.
American Psychological Association (APA) citation style
Zhang, Xiongwei & Saleh, Hager & Younis, Eman M. G. & Sahal, Radhya & Ali, Abdelmgeid A. 2020. Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System. Complexity, Vol. 2020, no. 2020, pp. 1-10.
https://search.emarefa.net/detail/BIM-1143250
Modern Language Association (MLA) citation style
Zhang, Xiongwei…[et al.]. Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System. Complexity No. 2020 (2020), pp. 1-10.
https://search.emarefa.net/detail/BIM-1143250
American Medical Association (AMA) citation style
Zhang, Xiongwei & Saleh, Hager & Younis, Eman M. G. & Sahal, Radhya & Ali, Abdelmgeid A. Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System. Complexity. 2020. Vol. 2020, no. 2020, pp. 1-10.
https://search.emarefa.net/detail/BIM-1143250
Data type
Articles
Text language
English
Notes
Includes bibliographical references
Record number
BIM-1143250