Comparative analysis of Arabic text compression techniques

Other Title(s)

تحليل مقارن بين تقنيات الضغط للنص العربي

Dissertant

al-Ajami, Talal M.

Thesis advisor

Kanaan, Ghassan

University

Amman Arab University

Faculty

College of Computer Sciences and Informatics

Department

Department of Computer Science

University Country

Jordan

Degree

Master

Degree Date

2016

English Abstract

Natural-language text compression techniques have been discussed thoroughly in the literature over the past decades.

Several methods have been introduced and implemented; however, most have focused on English and other European languages.

Relatively few studies have focused on Arabic text compression: some used statistical approaches, others used dictionary-based approaches, and still others used features of the Arabic language and its derivation rules to increase the compression ratio.

In this thesis, we discussed the characteristics of the Arabic language, introduced several statistical methods for text compression, and adapted them to Arabic text stored in Unicode files.

We provided an implementation of each method and compared the methods in terms of performance, compression ratio, the resources required to run the algorithms, and areas of application.

The algorithms under investigation are Golomb coding, Elias gamma coding, Huffman coding, and LZW.
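For readers unfamiliar with the integer codes named above, the following is a minimal Python sketch of Elias gamma coding and of the power-of-two special case of Golomb coding (often called Rice coding). It is not taken from the thesis; the function names `elias_gamma` and `golomb_rice` and the use of bit strings instead of packed bytes are assumptions made for illustration only.

```python
def elias_gamma(n: int) -> str:
    """Return the Elias gamma code of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma coding is defined for n >= 1")
    binary = bin(n)[2:]  # binary form of n; always starts with '1'
    # Prefix with (length - 1) zeros so a decoder knows how many bits follow.
    return "0" * (len(binary) - 1) + binary


def golomb_rice(n: int, k: int) -> str:
    """Return the Golomb code of n >= 0 with divisor m = 2**k (Rice case)."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    quotient = n >> k                 # n // 2**k
    remainder = n & ((1 << k) - 1)    # n % 2**k
    unary = "1" * quotient + "0"      # quotient in unary, terminated by '0'
    fixed = format(remainder, f"0{k}b") if k else ""  # remainder in k bits
    return unary + fixed


print(elias_gamma(5))     # 00101
print(golomb_rice(9, 2))  # 11001
```

Both codes give short codewords to small integers, which is why the order in which symbols are mapped to integers affects the resulting compression ratio.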

We also proposed an Arabic character mapping technique for use with these four algorithms, which yielded a major improvement in compression ratio, and we compared the original algorithms with the enhanced versions that use the mapping.
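The abstract does not describe how the character mapping works. As one possible reading, the Python sketch below assumes that each character is mapped to a small integer rank by frequency and that the ranks are then coded with Elias gamma. The names `build_mapping`, `gamma_length`, and `mapped_bits` are hypothetical, and the sample string is simply the Arabic title of this record.

```python
from collections import Counter


def build_mapping(text: str) -> dict[str, int]:
    """Assumed mapping: rank 1 for the most frequent character, then 2, ..."""
    counts = Counter(text)
    # Sort by descending frequency; break ties by code point for determinism.
    ordered = sorted(counts, key=lambda ch: (-counts[ch], ord(ch)))
    return {ch: rank for rank, ch in enumerate(ordered, start=1)}


def gamma_length(n: int) -> int:
    """Bit length of the Elias gamma code of a positive integer n."""
    return 2 * n.bit_length() - 1


def mapped_bits(text: str, mapping: dict[str, int]) -> int:
    """Total bits needed to gamma-code every character's rank."""
    return sum(gamma_length(mapping[ch]) for ch in text)


sample = "تحليل مقارن بين تقنيات الضغط للنص العربي"
mapping = build_mapping(sample)
utf8_bits = 8 * len(sample.encode("utf-8"))  # Arabic letters take 2 bytes each
coded_bits = mapped_bits(sample, mapping)
print(utf8_bits, coded_bits)
```

The comparison ignores the cost of storing the mapping table itself, which a real compressor would have to transmit or fix in advance.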

Main Subjects

Information Technology and Computer Science

No. of Pages

110

Table of Contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : References.

Chapter Two : Literature review.

Chapter Three : Arabic language characteristics and features.

Chapter Four : Research methodology.

Chapter Five : Implementation.

Chapter Six : Evaluation and results.

Chapter Seven : Conclusion and future work.

References.

American Psychological Association (APA)

al-Ajami, Talal M. (2016). Comparative analysis of Arabic text compression techniques (Master's thesis). Amman Arab University, Jordan.
https://search.emarefa.net/detail/BIM-932926

Modern Language Association (MLA)

al-Ajami, Talal M. Comparative analysis of Arabic text compression techniques. Master's thesis, Amman Arab University, 2016.
https://search.emarefa.net/detail/BIM-932926

American Medical Association (AMA)

al-Ajami, Talal M. (2016). Comparative analysis of Arabic text compression techniques (Master's thesis). Amman Arab University, Jordan.
https://search.emarefa.net/detail/BIM-932926

Language

English

Data Type

Arab Theses

Record ID

BIM-932926