Study of different statistical machine learning techniques for text sentiment classification

Joint Authors

Taha, Abd al-Rahman N.
Abu al-Suud, Raniya Ahmad Abd al-Azim

Source

Fayoum University Journal of Engineering

Issue

Vol. 5, Issue 1 (31 Jan. 2022), pp.66-73, 8 p.

Publisher

Fayoum University Faculty of Engineering

Publication Date

2022-01-31

Country of Publication

Egypt

No. of Pages

8

Main Subjects

Electronic engineering

Abstract EN

Text classification is an important task in NLP for various applications from movie review classification to market analysis.

As a tool, NLP makes it possible to process huge amounts of text and draw conclusions from it.

In this paper, we investigate statistical machine learning for document classification in NLP.

The target problem is sentiment analysis. We explore various techniques for text pre-processing, feature selection, and model selection to find a well-fitting model.

This paper acts as both a system proposal and a primer for those who want to start practicing NLP. We try to provide insight and intuition about modelling choices for text classification that extend beyond the scope of this task to NLP in general.

We propose a feature-based approach to text sentiment analysis that relies heavily on the BoN (Bag of N-grams) model and uses these features with a statistical ML classifier.

We use the IMDB movie review dataset (Maas et al., 2011) for benchmarking.
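
The abstract's pipeline (bag-of-n-grams features fed to a statistical classifier) can be illustrated with a minimal sketch. The abstract does not name the classifier, so multinomial Naive Bayes with add-one smoothing is an assumption made here for illustration. The names `ngrams` and `NaiveBayesBoN`, the tokenizer, and the four toy training sentences are all hypothetical; none of them come from the paper or from the IMDB dataset.

```python
import math
import re
from collections import Counter, defaultdict


def ngrams(text, max_n=2):
    """Return all word n-grams of length 1..max_n as a list of strings."""
    tokens = re.findall(r"[a-z']+", text.lower())
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class NaiveBayesBoN:
    """Multinomial Naive Bayes over bag-of-n-grams counts (add-one smoothing).

    Assumption: the paper's classifier is not specified in the abstract;
    Naive Bayes is used here only as a simple stand-in.
    """

    def fit(self, texts, labels, max_n=2):
        self.max_n = max_n
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(ngrams(text, max_n))
        # Vocabulary = every n-gram seen in any training document.
        self.vocab = set()
        for counts in self.feature_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for gram in ngrams(text, self.max_n):
                if gram in self.vocab:  # skip n-grams never seen in training
                    score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only (not from IMDB).
train_texts = [
    "a wonderful film with great acting",
    "great story and a wonderful cast",
    "a boring film with terrible acting",
    "terrible plot and a boring script",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayesBoN().fit(train_texts, train_labels)
print(model.predict("wonderful acting and a great story"))
print(model.predict("boring and terrible"))
```

Including bigrams alongside unigrams lets the model see short phrases such as "not good" as single features, which a plain bag-of-words representation would split apart.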

American Psychological Association (APA)

Taha, Abd al-Rahman N. & Abu al-Suud, Raniya Ahmad Abd al-Azim. 2022. Study of different statistical machine learning techniques for text sentiment classification. Fayoum University Journal of Engineering, Vol. 5, no. 1, pp.66-73.
https://search.emarefa.net/detail/BIM-1397864

Modern Language Association (MLA)

Taha, Abd al-Rahman N. & Abu al-Suud, Raniya Ahmad Abd al-Azim. Study of different statistical machine learning techniques for text sentiment classification. Fayoum University Journal of Engineering Vol. 5, no. 1 (2022), pp.66-73.
https://search.emarefa.net/detail/BIM-1397864

American Medical Association (AMA)

Taha, Abd al-Rahman N. & Abu al-Suud, Raniya Ahmad Abd al-Azim. Study of different statistical machine learning techniques for text sentiment classification. Fayoum University Journal of Engineering. 2022. Vol. 5, no. 1, pp.66-73.
https://search.emarefa.net/detail/BIM-1397864

Data Type

Journal Articles

Language

English

Notes

-

Record ID

BIM-1397864