Dynamic Arabic text compression technique

Other Title(s)

تقنية مرنة لضغط النصوص العربية

Dissertant

Abu Jaffal, Dirgham Abd al-Jabbar A.

Thesis advisor

Ujan, Arafat

Committee Members

Umar, Khamis
al-Atum, Jalal
Khatatinah, Khalaf

University

Princess Sumaya University for Technology

Faculty

King Hussein Faculty for Computing Sciences

Department

Department of Computer Sciences

University Country

Jordan

Degree

Master

Degree Date

2014

English Abstract

Compression techniques are important in several areas, especially communications and information technology, where they reduce the size of data and decrease transmission time between computers and over networks without any distortion.

In addition, text compression techniques aim to make the compressed text usable in text-oriented applications such as searching, summarizing, and information retrieval. The aim of this thesis is to build a new technique with more efficient tools that reduces the volume of data to be transmitted, the transmission time and cost, and the storage requirements, while achieving a high compression ratio. In this work, the morphological features of the Arabic language were used to build a new dynamic Arabic text compression technique that satisfies two objectives: achieving a good compression ratio, and using the structure of the compressed text to extract information from the compressed file instead of the original file.

This is in addition to reducing the text size and increasing transmission speed. Two compression techniques, Lempel-Ziv-Welch (LZW) and adaptive Huffman coding, were tested on Arabic texts, and their results were compared in terms of compression ratio and compression time.

LZW performed best for all categories of Arabic text (vowelized, partially vowelized, and non-vowelized), followed by adaptive Huffman coding.
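As a rough illustration of the kind of comparison described above, the following is a minimal textbook LZW sketch applied to the UTF-8 bytes of an Arabic string. It is not the thesis's implementation. The function names, the sample text, the fixed 16-bit code width, and the definition of compression ratio as compressed size divided by original size are all assumptions made for this example.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of LZW dictionary codes."""
    # Start with one table entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            # Emit the longest known phrase and learn the new one.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the original bytes from a list of LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the phrase being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


def compression_ratio(data: bytes, codes: list[int], code_bits: int = 16) -> float:
    """Compressed size / original size, assuming fixed-width codes (an assumption)."""
    return (len(codes) * code_bits / 8) / len(data)


# Hypothetical sample: a non-vowelized Arabic phrase repeated several times.
sample = ("بسم الله الرحمن الرحيم " * 50).encode("utf-8")
sample_codes = lzw_compress(sample)
print(len(sample), len(sample_codes), compression_ratio(sample, sample_codes))
```

The fixed 16-bit width only holds while the table stays below 65,536 entries; a practical encoder would use variable-width codes or reset the table.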

Features of the Arabic language were studied and then exploited as a pre-processing step to improve the performance of these techniques. As a result of this study, a new dynamic technique is proposed that gives better results in terms of compression ratio and the searchability of the compressed files.

This technique works in phases.

In the first phase, the text file is split into four different files using a Multilayer analyzer.

In the second phase, the merge character model is used for encoding in the third layer, and each of the four files is then compressed using Huffman and LZW.

The LZW technique was found to be suitable for all text files in terms of compression ratio.

Integrating the Multilayer model and the merge character model with LZW gives a better compression ratio for all files and reduces the compression time.
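This record does not say what the four files produced by the Multilayer analyzer contain, so the following is not the thesis's analyzer. It is only a hypothetical two-stream sketch of the general idea of splitting Arabic text into layers before compression: base letters go into one stream and the diacritics (vowel marks) attached to each letter go into another, so that each stream can be compressed separately and the text can be rebuilt exactly. The function names `split_layers` and `merge_layers` are invented for this example.

```python
import unicodedata


def split_layers(text: str) -> tuple[str, list[str]]:
    """Split text into base characters and the combining marks attached to each."""
    base = []
    marks = []  # marks[i] holds the diacritics that follow base[i]
    for ch in text:
        # Arabic vowel marks have a non-zero canonical combining class.
        if unicodedata.combining(ch) and base:
            marks[-1] += ch
        else:
            # A mark with no preceding base character is kept as a base character.
            base.append(ch)
            marks.append("")
    return "".join(base), marks


def merge_layers(base: str, marks: list[str]) -> str:
    """Rebuild the original text from the two layers."""
    return "".join(b + m for b, m in zip(base, marks))


# Hypothetical vowelized sample.
vowelized = "بِسْمِ اللَّهِ"
letters, diacritics = split_layers(vowelized)
assert merge_layers(letters, diacritics) == vowelized
```

The letter stream of a vowelized text then looks like ordinary non-vowelized text, which is one plausible reason a pre-processing split of this kind could help a dictionary coder such as LZW. Whether the thesis's layers work this way is not stated here.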

Main Subjects

Mathematics

No. of Pages

71

Table of Contents

Table of contents.

Abstract.

Chapter One : Introduction.

Chapter Two : Related literature and theoretical focus.

Chapter Three : Framework design and implementation.

Chapter Four : The proposed compression technique.

Chapter Five : Experimental evaluation.

Chapter Six : Conclusion and future work.

References.

American Psychological Association (APA)

Abu Jaffal, Dirgham Abd al-Jabbar A. (2014). Dynamic Arabic text compression technique. (Master's thesis). Princess Sumaya University for Technology, Jordan
https://search.emarefa.net/detail/BIM-413809

Modern Language Association (MLA)

Abu Jaffal, Dirgham Abd al-Jabbar A. Dynamic Arabic text compression technique. (Master's thesis). Princess Sumaya University for Technology. (2014).
https://search.emarefa.net/detail/BIM-413809

American Medical Association (AMA)

Abu Jaffal, Dirgham Abd al-Jabbar A. (2014). Dynamic Arabic text compression technique. (Master's thesis). Princess Sumaya University for Technology, Jordan
https://search.emarefa.net/detail/BIM-413809

Language

English

Data Type

Arab Theses

Record ID

BIM-413809