Sentiment analysis of Arabic tweets-implicit semantic-based approach

Other titles

Sentiment and opinion analysis of tweets based on the implicit (figurative) meaning of opinions

Thesis author

Ibrahim, Rasha Mahmud Ahmad

Thesis advisor

Ujan, Arafat
al-Kuz, Akram

University

Princess Sumaya University for Technology

Faculty

King Hussein Faculty of Computing Sciences

Academic department

Department of Computer Science

University country

Jordan

Degree

Master's

Degree date

2018

English abstract

Sentiment analysis is a natural language processing task that aims to automatically extract the writer's opinion and detect its polarity from written text.

Most studies have focused on sentiment analysis in English as a primary language on the Internet, while very few studies have addressed sentiment analysis in Arabic.

With the rapid growth of web content and the increasing number of Arabic-speaking users on the Internet, especially on social media sites, Arabic sentiment analysis has gained the attention of researchers and motivated them to study and develop tools that can identify and classify opinions expressed on the Internet.

Several approaches to extracting sentiment from text are available. Most use lexical analysis, also called the dictionary-based approach, which begins with a manually built, predefined dictionary of positive and negative words and then uses word counts, frequencies, or other measures to score the opinions in the data.
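The dictionary-based approach described above can be sketched in a few lines. This is only an illustration under stated assumptions: the word lists `POSITIVE` and `NEGATIVE`, the whitespace tokenizer, and the function names are hypothetical and are not taken from the thesis, and English placeholder words stand in for an Arabic lexicon.

```python
# Toy lexicon: hypothetical entries for illustration, not from the thesis.
POSITIVE = {"good", "great", "happy"}
NEGATIVE = {"bad", "awful", "sad"}


def lexicon_score(text):
    """Count positive words minus negative words in a whitespace-tokenized text."""
    tokens = text.lower().split()
    positive_hits = sum(1 for token in tokens if token in POSITIVE)
    negative_hits = sum(1 for token in tokens if token in NEGATIVE)
    return positive_hits - negative_hits


def lexicon_polarity(text):
    """Map the raw score to a polarity label."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A scorer like this treats every occurrence of a word the same way regardless of context, which is the limitation the thesis objectives address.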

Lately, machine learning-based sentiment analysis has been used to extract polarity from text using unsupervised learning, supervised learning, or a hybrid approach.

This thesis has two objectives: to use a large corpus to build larger dictionaries and thereby increase the accuracy of the classifier, and to extract polarity based on the meanings of words, taking into account that the same word can have different meanings in different contexts.

This thesis describes a new approach for sentiment analysis in Arabic.

The approach uses Word2Vec, a group of related models that produce word embeddings.

These models are shallow, two-layer neural networks that are trained to reconstruct the linguistic contexts of words.
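To make "reconstructing linguistic contexts" concrete: Word2Vec comes in two variants, CBOW and skip-gram, and the abstract does not say which one the thesis used. The sketch below assumes the skip-gram variant and shows only how (center, context) training pairs are drawn from a token sequence; the function name and the `window` default are illustrative choices.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs
```

In skip-gram training, the network learns to predict the context word of each pair from the center word, and the learned hidden-layer weights become the word vectors.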

Word2Vec takes a large text corpus as input (tweets, in this study) and produces a vector space, typically of several hundred dimensions, in which each unique word in the corpus is assigned a corresponding vector.

Word vectors are positioned in the vector space such that words sharing common contexts in the corpus are located in close proximity to one another in the space.
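Proximity in the vector space is commonly measured with cosine similarity. The following sketch assumes that measure; the three-dimensional vectors in `toy_vectors` are made-up values for illustration, not embeddings from the thesis.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word` by cosine similarity."""
    candidates = (other for other in vectors if other != word)
    return max(candidates, key=lambda other: cosine_similarity(vectors[word], vectors[other]))


# Hypothetical low-dimensional vectors; real embeddings have hundreds of dimensions.
toy_vectors = {
    "happy": [0.9, 0.1, 0.0],
    "glad": [0.8, 0.2, 0.1],
    "angry": [-0.7, 0.6, 0.1],
}
```

With these toy values, `most_similar("happy", toy_vectors)` picks "glad", because its vector points in nearly the same direction.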

The results of the proposed architecture, which combines the word embeddings technique with a neural network classifier, were promising and indicated that such hybrid approaches affect the accuracy of extracting sentiment from Arabic text with implicit meaning.

Finally, the experiments and evaluations conducted in this thesis encourage researchers to continue in this direction.

Compared with other hybrid machine learning methods, the proposed approach achieved 93.1% accuracy.
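The abstract does not say how word vectors are combined into the classifier's input. One common option, assumed here purely for illustration, is to average the vectors of the words in a tweet into a single fixed-length feature vector; the function name and the handling of unknown words are hypothetical choices.

```python
def tweet_vector(tokens, vectors, dim):
    """Average the embeddings of known tokens; return a zero vector if none are known."""
    known = [vectors[token] for token in tokens if token in vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[k] for vec in known) / len(known) for k in range(dim)]
```

The resulting vector has the same length for every tweet, so it can be passed to a neural network classifier as a feature row.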

Main subjects

Information Technology and Computer Science

Number of pages

79

Table of contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One: Introduction.

Chapter Two: Background and related work.

Chapter Three: Methodology.

Chapter Four: Experiments and results.

Chapter Five: Evaluations.

Chapter Six: Conclusion and future work.

References.

American Psychological Association (APA) citation style

Ibrahim, Rasha Mahmud Ahmad. (2018). Sentiment analysis of Arabic tweets-implicit semantic-based approach. (Master's thesis). Princess Sumaya University for Technology, Jordan
https://search.emarefa.net/detail/BIM-833198

Modern Language Association (MLA) citation style

Ibrahim, Rasha Mahmud Ahmad. Sentiment analysis of Arabic tweets-implicit semantic-based approach. (Master's thesis). Princess Sumaya University for Technology. (2018).
https://search.emarefa.net/detail/BIM-833198

American Medical Association (AMA) citation style

Ibrahim, Rasha Mahmud Ahmad. (2018). Sentiment analysis of Arabic tweets-implicit semantic-based approach. (Master's thesis). Princess Sumaya University for Technology, Jordan
https://search.emarefa.net/detail/BIM-833198

Text language

English

Data type

University theses

Record number

BIM-833198