Hybrid technique for Arabic text compression
Other titles
تقنية هجينة لضغط النصوص العربية
Thesis author
Thesis supervisor
Committee members
University
Middle East University
Faculty
Faculty of Information Technology
Academic department
Department of Computer Science
University country
Jordan
Degree
Master's
Degree date
2013
English abstract
Compression techniques have gained great importance in communications and information technology as a way to reduce the growing size of data and to increase data transmission speed between computers and over networks.
Beyond these aims, text compression techniques also seek to make the compressed text usable in text-oriented applications such as searching, summarizing, and information retrieval.
In this work, the morphological features of the Arabic language were used to build a new Arabic text compression technique that satisfies two objectives: obtaining a good compression rate, and producing a compressed text structure from which information can be extracted directly, without the original file.
In addition to reducing text size and increasing transmission speed, such techniques speed up information extraction in terms of search processes.
Two common compression techniques, Lempel-Ziv-Welch (LZW) and the Burrows-Wheeler Transform (BWT), were tested on Arabic texts, and their results were compared in terms of compression ratio.
LZW performed best for all categories of Arabic text, followed by BWT.
Features of the Arabic language were studied and then exploited to improve the performance of these techniques.
The fact that Arabic letters have a single case was used to improve the performance of LZW.
By exploiting the unused locations of the dictionary, the proposed method achieved a better compression ratio than all the other techniques tested.
The morphological features of the Arabic language were used as a pre-processing step for data compression.
As a result of this study, a new hybrid technique is proposed that gives better results in terms of compression rate and searchable compressed files.
This technique works in phases.
In the first phase, the text file is split into four different files using a Multilayer analyzer.
In the second phase, each one of these four files is compressed using BWT.
Different compression techniques were investigated and tested at the level of each one of the four files.
The BWT technique was found to be suitable for all text files in terms of compression ratio.
The integration of the Multilayer model with LZW to compress all the files reduced the compression time.
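As a rough illustration of the kind of baseline comparison the abstract describes, the following Python sketch runs a textbook byte-level LZW and Python's standard `bz2` module (bzip2, which is built on the Burrows-Wheeler Transform) over a short Arabic sample. It is not the thesis's implementation: the sample string, the function names, the fixed-width estimate of the LZW output size, and the definition of compression ratio as compressed size divided by original size are all assumptions made for this example. The Multilayer analyzer and the single-case dictionary optimization are not reproduced here, since the abstract does not specify them.

```python
import bz2


def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW over bytes; returns a list of integer codes."""
    # Start with one dictionary entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate  # keep extending the current match
        else:
            codes.append(table[current])
            table[candidate] = next_code  # learn the new phrase
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Inverse of lzw_compress; rebuilds the dictionary on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the phrase that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


def lzw_size_bytes(codes: list[int]) -> int:
    """Assumed size estimate: every code stored at the width of the largest."""
    if not codes:
        return 0
    width = max(codes).bit_length()
    return (len(codes) * width + 7) // 8


def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Assumed definition: compressed size / original size (lower is better)."""
    return compressed_size / original_size


if __name__ == "__main__":
    # Hypothetical sample text, repeated so both methods have something to learn.
    sample = ("ضغط النصوص العربية " * 200).encode("utf-8")
    lzw_codes = lzw_compress(sample)
    print("LZW ratio:", compression_ratio(len(sample), lzw_size_bytes(lzw_codes)))
    print("bz2 ratio:", compression_ratio(len(sample), len(bz2.compress(sample))))
```

Results from a toy input like this say nothing about the rankings reported in the thesis, which were measured on categorized Arabic corpora; the sketch only shows how such a ratio comparison can be set up.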
Main subjects
Information Technology and Computer Science
Number of pages
69
Table of contents
Table of contents.
Abstract.
Abstract in Arabic.
Chapter One: Introduction.
Chapter Two: Literature survey.
Chapter Three: Framework design and implementation.
Chapter Four: Experimental evaluation.
Chapter Five: Conclusion and future work.
References.
American Psychological Association (APA) citation style
Abu Jrai, Inas Mahmud. (2013). Hybrid technique for Arabic text compression. (Master's thesis). Middle East University, Jordan
https://search.emarefa.net/detail/BIM-699373
Modern Language Association (MLA) citation style
Abu Jrai, Inas Mahmud. Hybrid technique for Arabic text compression. (Master's thesis). Middle East University. (2013).
https://search.emarefa.net/detail/BIM-699373
American Medical Association (AMA) citation style
Abu Jrai, Inas Mahmud. (2013). Hybrid technique for Arabic text compression. (Master's thesis). Middle East University, Jordan
https://search.emarefa.net/detail/BIM-699373
Text language
English
Data type
University theses
Record number
BIM-699373