Template based affix stemmer for a morphologically rich language
Joint Authors
Anwar, Muhammad Nadim
Bajwa, Usamah
Xuan, Wang
Khan, Sajjad
Source
The International Arab Journal of Information Technology
Issue
Vol. 12, Issue 2 (31 Mar. 2015), 9 p.
Publisher
Publication Date
2015-03-31
Country of Publication
Jordan
No. of Pages
9
Main Subjects
Information Technology and Computer Science
Topics
Abstract EN
Word stemming is one of the most significant factors affecting the performance of Natural Language Processing (NLP) applications such as information retrieval systems, part-of-speech tagging, machine translation systems and syntactic parsing.
The Urdu language poses several challenges for NLP, largely because of its rich morphology.
Stemming in Urdu differs from stemming in other languages because it depends on removing not only prefixes and suffixes but also infixes.
In this paper we introduce a template-based stemmer that eliminates all kinds of affixes, i.e. prefixes, infixes and suffixes, depending on the morphological pattern of the word.
The presented results are excellent, and this stemmer can prove to be very effective for a morphologically rich language.
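As an illustration only, the idea of stripping prefixes, infixes and suffixes according to a word's morphological pattern can be sketched as follows. This is a minimal sketch, not the authors' implementation: the template notation, the two templates and the function name `stem` are hypothetical, and the example words are simplified romanizations of Arabic-pattern loanwords rather than Urdu script.

```python
# Hypothetical template notation (not taken from the paper):
#   'C'  marks a slot that holds a letter of the stem and is kept;
#   any other character is an affix letter that must match literally
#   and is dropped from the output.
TEMPLATES = [
    "maCCuC",  # prefix "ma" plus infix "u", e.g. "maktub"
    "CaCiC",   # infixes "a" and "i", e.g. "katib"
]


def stem(word, templates=TEMPLATES):
    """Return the stem of `word` using the first template that matches.

    A template matches when it has the same length as the word and every
    literal (non-'C') character equals the word's character at that position.
    Words that match no template are returned unchanged.
    """
    for template in templates:
        if len(template) != len(word):
            continue
        kept = []
        for slot, ch in zip(template, word):
            if slot == "C":
                kept.append(ch)      # stem letter: keep it
            elif slot != ch:
                break                # affix letter mismatch: try next template
        else:
            # Loop finished without a mismatch, so the template matched.
            return "".join(kept)
    return word


print(stem("maktub"))  # prefix and infix removed
print(stem("katib"))   # infixes removed
print(stem("kitab"))   # no template matches, word returned unchanged
```

A real stemmer of this kind would need a much larger template inventory and some way to resolve words that match several templates; the sketch simply takes the first match.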
American Psychological Association (APA)
Khan, Sajjad & Anwar, Muhammad Nadim & Bajwa, Usamah & Xuan, Wang. 2015. Template based affix stemmer for a morphologically rich language. The International Arab Journal of Information Technology, Vol. 12, no. 2.
https://search.emarefa.net/detail/BIM-368816
Modern Language Association (MLA)
Khan, Sajjad…[et al.]. Template based affix stemmer for a morphologically rich language. The International Arab Journal of Information Technology, Vol. 12, no. 2 (Mar. 2015).
https://search.emarefa.net/detail/BIM-368816
American Medical Association (AMA)
Khan, Sajjad & Anwar, Muhammad Nadim & Bajwa, Usamah & Xuan, Wang. Template based affix stemmer for a morphologically rich language. The International Arab Journal of Information Technology. 2015. Vol. 12, no. 2.
https://search.emarefa.net/detail/BIM-368816
Data Type
Journal Articles
Language
English
Notes
Includes bibliographical references.
Record ID
BIM-368816