Sentiment analysis of Arabic tweets-implicit semantic-based approach

Other Title(s)

تحليل المشاعر و الآراء لتغريدات (Tweets) باعتماد أساس المعنى الضمني (المجازي) للآراء [Sentiment and opinion analysis of tweets based on the implicit (figurative) meaning of opinions]

Dissertant

Ibrahim, Rasha Mahmud Ahmad

Thesis advisor

Ujan, Arafat
al-Kuz, Akram

University

Princess Sumaya University for Technology

Faculty

King Hussein Faculty for Computing Sciences

Department

Department of Computer Sciences

University Country

Jordan

Degree

Master

Degree Date

2018

English Abstract

Sentiment analysis is a natural language processing task that aims to automatically extract the writer's opinion and detect its polarity from written text.

Most studies have focused on sentiment analysis in English, the primary language of the Internet, while very few have addressed sentiment analysis in Arabic.

With the rapid growth of web content and the increasing number of Arabic-speaking users on the Internet, especially on social media sites, Arabic sentiment analysis has gained the attention of researchers and motivated them to study and develop tools that can identify and classify opinions expressed online.

Several approaches to extracting sentiment from text are available. Most use lexicon-based analysis, also called a dictionary-based approach, which begins with a manually built, predefined dictionary of positive and negative words and then uses word counts or other frequency-based measures to score the opinions in the data.
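
As an illustration of the dictionary-based approach described above, the following is a minimal sketch rather than the method used in the thesis. The word lists, the function names `lexicon_score` and `lexicon_label`, and the whitespace tokenization are all assumptions made for the example; a real Arabic system would need a far larger lexicon and proper tokenization and normalization.

```python
# Hypothetical toy lexicons, chosen only for illustration.
# A real dictionary-based system would use a much larger, curated word list.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def lexicon_score(text: str) -> int:
    """Return (#positive words - #negative words) for a whitespace-tokenized text."""
    tokens = text.lower().split()
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    return positive_hits - negative_hits


def lexicon_label(text: str) -> str:
    """Map the raw score to a polarity label."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Because the score depends only on which dictionary words appear, a sketch like this cannot tell apart different senses of the same word, which is the limitation the thesis sets out to address.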

More recently, machine learning-based sentiment analysis has been used to extract polarity from text using unsupervised learning, supervised learning, or a hybrid of the two.

The objectives of this thesis are to use a large corpus to build larger dictionaries in order to increase the accuracy of the classifier, and to extract polarity based on word meanings, taking into account that the same word can have different meanings in different contexts.

This thesis describes a new approach to sentiment analysis in Arabic.

The approach uses Word2Vec, a group of related models used to produce word embeddings.

These models are shallow, two-layer neural networks that are trained to reconstruct the linguistic contexts of words.
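
To make "reconstructing the linguistic contexts of words" more concrete, the sketch below generates the (target, context) pairs that a skip-gram style Word2Vec model would be trained to predict. The abstract does not say whether the thesis uses the skip-gram or the CBOW variant, so the choice of skip-gram, the function name `context_pairs`, and the window size are assumptions made for illustration.

```python
def context_pairs(tokens, window=2):
    """Return (target, context) pairs for every word and its neighbours.

    Each word is paired with the words at most `window` positions to its
    left and right. A skip-gram model learns word vectors by trying to
    predict the context word from the target word in each pair.
    """
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs
```

For example, `context_pairs(["a", "b", "c"], window=1)` yields the pairs `("a", "b")`, `("b", "a")`, `("b", "c")`, and `("c", "b")`.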

Word2Vec takes as its input a large corpus of text (tweets, in this study) and produces a vector space, typically of several hundred dimensions, in which each unique word in the corpus is assigned a corresponding vector.

Word vectors are positioned in the vector space such that words sharing common contexts in the corpus are located in close proximity to one another in the space.
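
Proximity in the vector space is commonly measured with cosine similarity. The sketch below uses three hand-written, three-dimensional vectors that are purely hypothetical (real Word2Vec vectors are learned from the corpus and have far more dimensions) to show how words with similar contexts end up with a higher similarity score.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, not taken from any trained model.
toy_vectors = {
    "happy": [0.9, 0.1, 0.0],
    "glad": [0.8, 0.2, 0.1],
    "traffic": [0.0, 0.1, 0.9],
}

similar = cosine_similarity(toy_vectors["happy"], toy_vectors["glad"])
dissimilar = cosine_similarity(toy_vectors["happy"], toy_vectors["traffic"])
```

With these toy values, `similar` is close to 1 while `dissimilar` is close to 0, mirroring the idea that words sharing contexts lie near one another.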

The results of the proposed architecture, which combines word embeddings with a neural network classifier, were promising and indicated that such hybrid approaches can improve the accuracy of extracting sentiment from Arabic text with implicit meaning.

Finally, the experiments and evaluations conducted in this thesis encourage researchers to continue in this direction.

Compared with other hybrid machine learning methods, the proposed approach achieved 93.1% accuracy.
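
The abstract does not describe how word vectors are turned into classifier input. One common option, shown here only as an assumption and not as the thesis's actual pipeline, is to average the vectors of the words in a tweet into a single fixed-length feature vector. The function name `tweet_vector` and the toy embeddings are hypothetical.

```python
def tweet_vector(tokens, embeddings, dim):
    """Average the embeddings of the known tokens into one feature vector.

    Tokens missing from `embeddings` are skipped. If no token is known,
    a zero vector of length `dim` is returned.
    """
    known = [embeddings[token] for token in tokens if token in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[k] for vec in known) / len(known) for k in range(dim)]


# Hypothetical two-dimensional embeddings for illustration only.
toy_embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
features = tweet_vector(["a", "b", "unknown"], toy_embeddings, dim=2)
```

The resulting fixed-length vector could then be passed to any classifier, such as a neural network, regardless of how many words the tweet contains.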

Main Subjects

Information Technology and Computer Science

No. of Pages

79

Table of Contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One: Introduction.

Chapter Two: Background and related work.

Chapter Three: Methodology.

Chapter Four: Experiments and results.

Chapter Five: Evaluations.

Chapter Six: Conclusion and future work.

References.

American Psychological Association (APA)

Ibrahim, Rasha Mahmud Ahmad. (2018). Sentiment analysis of Arabic tweets-implicit semantic-based approach. (Master's thesis). Princess Sumaya University for Technology, Jordan.
https://search.emarefa.net/detail/BIM-833198

Modern Language Association (MLA)

Ibrahim, Rasha Mahmud Ahmad. Sentiment analysis of Arabic tweets-implicit semantic-based approach. (Master's thesis). Princess Sumaya University for Technology. (2018).
https://search.emarefa.net/detail/BIM-833198

American Medical Association (AMA)

Ibrahim, Rasha Mahmud Ahmad. (2018). Sentiment analysis of Arabic tweets-implicit semantic-based approach. (Master's thesis). Princess Sumaya University for Technology, Jordan.
https://search.emarefa.net/detail/BIM-833198

Language

English

Data Type

Arab Theses

Record ID

BIM-833198