A rule-based approach for tagging non-vocalized Arabic words
Joint Authors
al-Taani, Ahmad T.
Abu al-Rubb, Salah
Source
The International Arab Journal of Information Technology
Issue
Vol. 6, Issue 3 (31 Jul. 2009), pp. 320-328, 9 p.
Publisher
Publication Date
2009-07-31
Country of Publication
Jordan
No. of Pages
9
Main Subjects
Information Technology and Computer Science
Topics
Abstract EN
In this work, we present a tagging system that assigns a tag to each word in a non-vocalized Arabic text.
The proposed system analyzes the text at three levels.
The first level is a lexical analyzer built on a lexicon containing all fixed words and particles, such as prepositions and pronouns.
The second level is a morphological analyzer which relies on word structure using patterns and affixes to determine word class.
The third level is a syntax analyzer, or grammatical tagger, which assigns grammatical tags to words based on their context or their position in the sentence.
The syntax analyzer consists of two stages: the first stage relies on specific keywords that indicate the tag of the following word; the second stage is a reversed parsing technique that scans the available grammars of the Arabic language to determine the class of a single ambiguous word in the sentence.
We tested the proposed system on a corpus of 2355 words.
Experimental results showed that the proposed system achieved a success rate approaching 94% of the words in the sample used in the study.
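The three levels described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tag names, the tiny lexicon, the two affix rules, the keyword table, and the grammar templates are all hypothetical toy data, and treating "reversed parsing" as matching a sentence with exactly one untagged word against tag-sequence templates is an assumption about what the abstract means.

```python
# Toy three-level tagger. All data below is illustrative, not taken from the paper.

# Level 1: lexicon of fixed words and particles (hypothetical entries and tag names).
LEXICON = {"في": "PREP", "من": "PREP", "إلى": "PREP", "هو": "PRON", "لم": "NEG"}

# Level 3, stage 1: keywords that suggest the tag of the following word.
KEYWORD_NEXT = {"في": "NOUN", "من": "NOUN", "إلى": "NOUN", "لم": "VERB"}

# Level 3, stage 2: assumed grammar templates, written as tag sequences.
GRAMMARS = [
    ("VERB", "NOUN", "PREP", "NOUN"),
    ("NOUN", "PREP", "NOUN"),
]


def lexical(word):
    """Level 1: look the word up in the fixed-word lexicon."""
    return LEXICON.get(word)


def morphological(word):
    """Level 2: guess the class from affixes (two toy rules only)."""
    if word.startswith("ال") and len(word) > 3:  # definite article prefix
        return "NOUN"
    if word.endswith("ة"):  # ta marbuta suffix
        return "NOUN"
    return None


def tag(words):
    """Run the three levels in order; unresolved words keep the tag None."""
    tags = [lexical(w) or morphological(w) for w in words]

    # Stage 1: a keyword determines the tag of the word that follows it.
    for i in range(1, len(words)):
        if tags[i] is None and words[i - 1] in KEYWORD_NEXT:
            tags[i] = KEYWORD_NEXT[words[i - 1]]

    # Stage 2: with exactly one untagged word left, find the templates that
    # agree with every known tag and read the missing tag off them.
    if tags.count(None) == 1:
        gap = tags.index(None)
        candidates = {
            g[gap]
            for g in GRAMMARS
            if len(g) == len(tags)
            and all(t is None or t == s for t, s in zip(tags, g))
        }
        if len(candidates) == 1:
            tags[gap] = candidates.pop()
    return tags


print(tag(["ذهب", "الولد", "إلى", "المدرسة"]))
# ['VERB', 'NOUN', 'PREP', 'NOUN']
```

In this example the non-vocalized word ذهب can be read as either "went" or "gold", and neither the lexicon nor the affix rules resolve it; the template match in the second stage is what tags it as a verb.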
American Psychological Association (APA)
al-Taani, Ahmad T. & Abu al-Rubb, Salah. 2009. A rule-based approach for tagging non-vocalized Arabic words. The International Arab Journal of Information Technology, Vol. 6, no. 3, pp. 320-328.
https://search.emarefa.net/detail/BIM-10441
Modern Language Association (MLA)
al-Taani, Ahmad T. & Abu al-Rubb, Salah. A rule-based approach for tagging non-vocalized Arabic words. The International Arab Journal of Information Technology Vol. 6, no. 3 (Jul. 2009), pp. 320-328.
https://search.emarefa.net/detail/BIM-10441
American Medical Association (AMA)
al-Taani, Ahmad T. & Abu al-Rubb, Salah. A rule-based approach for tagging non-vocalized Arabic words. The International Arab Journal of Information Technology. 2009. Vol. 6, no. 3, pp. 320-328.
https://search.emarefa.net/detail/BIM-10441
Data Type
Journal Articles
Language
English
Notes
Includes bibliographical references: p. 326-327
Record ID
BIM-10441