Dynamic Arabic text compression technique
Other Title(s)
تقنية مرنة لضغط النصوص العربية
Dissertant
Abu Jaffal, Dirgham Abd al-Jabbar A.
Thesis advisor
Committee Members
Umar, Khamis
al-Atum, Jalal
Khatatinah, Khalaf
University
Princess Sumaya University for Technology
Faculty
King Hussein Faculty for Computing Sciences
Department
Department of Computer Sciences
University Country
Jordan
Degree
Master
Degree Date
2014
English Abstract
Compression techniques are important in several areas, especially communications and information technology, because they reduce the size of data and decrease transmission time between computers and over networks without any distortion.
In addition, text compression techniques aim to make the compressed text usable in text-oriented applications such as searching, summarizing, and information retrieval. The importance of this thesis lies in building a new technique with more efficient tools to reduce the volume of data to be transmitted, the transmission time and cost, and the storage requirements, while achieving a high compression ratio. In this work, the morphological features of the Arabic language were used to build a new dynamic Arabic text compression technique that, in addition to reducing text size and increasing transmission speed, satisfies two objectives: a good compression ratio, and a compressed text structure from which information can be extracted directly instead of from the original file.
Two compression techniques, Lempel-Ziv-Welch (LZW) and Adaptive Huffman (Huffman), were tested on Arabic texts, and their results were compared in terms of compression ratio and compression time ratio.
LZW performed best for all categories of Arabic text (vowelized, partially vowelized, and non-vowelized), followed by Huffman.
Features of the Arabic language were studied and then exploited, as a pre-processing step for data compression, to improve the performance of these techniques. As a result of this study, a new dynamic technique is proposed that gives better results in terms of compression rate and produces searchable text files.
This technique works in phases.
In the first phase, the text file is split into four different files using a Multilayer analyzer.
In the second phase, the merge character model is used for encoding. In the third layer, each of these four files is compressed using Huffman and LZW.
LZW technique was found to be suitable for all text files in terms of compression ratio.
Integrating the Multilayer model and the merge character model with LZW gives a better compression ratio for all files and reduces the compression time.
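To make the LZW baseline and the compression ratio measure mentioned in the abstract concrete, the following is a minimal sketch of textbook LZW over UTF-8 bytes. It is not the thesis's implementation: the function names, the unbounded dictionary, the fixed code width, and the definition of compression ratio as compressed size divided by original size are all assumptions made for illustration, and the Multilayer analyzer and merge character model are not modeled here.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of LZW dictionary codes (textbook variant)."""
    # Start with one dictionary entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            # Emit the longest known prefix and learn the new sequence.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the original bytes from LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


def compression_ratio(data: bytes, codes: list[int]) -> float:
    """Compressed size / original size (assumed definition).

    Assumes every code is stored with the same bit width, at least 9 bits.
    """
    if not data:
        raise ValueError("cannot compute a ratio for empty input")
    width = max(9, max(codes).bit_length())
    return (len(codes) * width) / (len(data) * 8)


# Hypothetical sample: a short Arabic phrase repeated to give LZW something to learn.
sample = ("ضغط النصوص العربية " * 50).encode("utf-8")
codes = lzw_compress(sample)
print(f"compression ratio: {compression_ratio(sample, codes):.3f}")
```

Under this assumed definition a smaller ratio means better compression. Working on UTF-8 bytes rather than characters is a simplification: each Arabic letter occupies two bytes, so a character-level or morphology-aware pre-processing step, as the abstract describes, could give LZW more useful units to work with.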
Main Subjects
Topics
No. of Pages
71
Table of Contents
Table of contents.
Abstract.
Chapter One : Introduction.
Chapter Two : Related literature and theoretical focus.
Chapter Three : Framework design and implementation.
Chapter Four : The proposed compression technique.
Chapter Five : Experimental evaluation.
Chapter Six : Conclusion and future work.
References.
American Psychological Association (APA)
Abu Jaffal, Dirgham Abd al-Jabbar A. (2014). Dynamic Arabic text compression technique (Master's thesis). Princess Sumaya University for Technology, Jordan.
https://search.emarefa.net/detail/BIM-413809
Modern Language Association (MLA)
Abu Jaffal, Dirgham Abd al-Jabbar A. Dynamic Arabic text compression technique. Master's thesis. Princess Sumaya University for Technology, 2014.
https://search.emarefa.net/detail/BIM-413809
American Medical Association (AMA)
Abu Jaffal, Dirgham Abd al-Jabbar A. (2014). Dynamic Arabic text compression technique (Master's thesis). Princess Sumaya University for Technology, Jordan.
https://search.emarefa.net/detail/BIM-413809
Language
English
Data Type
Arab Theses
Record ID
BIM-413809