Comprehensive processing for Arabic texts to extract their roots

Other Title(s)

معالجة شاملة للنصوص العربية لاستخلاص الجذور الصحيحة (Comprehensive processing of Arabic texts to extract the correct roots)

Author

Abu Salih, Bilal

Source

Iraqi Journal of Science

Issue

Vol. 60, Issue 6 (30 Jun. 2019), pp.1404-1411, 8 p.

Publisher

University of Baghdad College of Science

Publication Date

2019-06-30

Country of Publication

Iraq

No. of Pages

8

Main Subjects

Information Technology and Computer Science
Arabic language and Literature

Abstract EN

Arabic is a highly inflectional language in which a single root can yield many word forms with different interpretations.

There is no standard way to find Arabic roots, because the language's inflection relies on prefixes, suffixes, and infixed vowels that are combined through complex processes.

For this reason, words require careful processing in information retrieval solutions, and until now there has been no standard approach to obtaining the fully correct root.

Applications on Arabic words show that around 99% of them are derived from biliteral, triliteral, and quadriliteral roots.

Stemming a word in order to extract its root is the process of removing all additional affixes.

Where matching a word against proper names is available, the affixes are then removed according to patterns and rules, with reference to root dictionaries.

This research presents a new series of steps that uses a new way of examining affixes, vowels, and patterns through three stages of stemming.

If a match is not found, vowels are replaced and patterns are readjusted for a further check; if there is still no match, the word is kept unmodified.

Search engines, indexing, file classification, clustering, and similar applications need improved root extraction, and the researcher introduces recommendations and solutions that contribute to improving Arabic root extraction.

The research applies comprehensive processing, carried out gradually, to a general collection of documents in order to improve root extraction by 96%.
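
The staged procedure outlined in the abstract (affix removal, pattern adjustment, vowel replacement, and falling back to the unmodified word) can be illustrated with a minimal Python sketch. This is not the author's implementation: the root dictionary, proper-name list, affix lists, and all function names below are hypothetical and far smaller than any real resource, and treating proper names as words to leave untouched is an assumption about how the matching step is used.

```python
# Illustrative sketch only. Every list and name here is a hypothetical
# stand-in, not data or code from the paper.
ROOTS = {"كتب", "درس", "قول", "علم"}        # toy root dictionary
PROPER_NAMES = {"بغداد"}                     # toy proper-name list
PREFIXES = ["وال", "بال", "ال", "و", "ب", "ل", "م", "ي", "ت"]
SUFFIXES = ["ات", "ون", "ين", "ها", "ة", "ه"]
WEAK = "اوي"                                 # long vowels / weak letters


def affix_candidates(word):
    """Return the word plus every form left after removing at most
    one listed prefix and at most one listed suffix."""
    out = []
    for prefix in [""] + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in [""] + SUFFIXES:
            if rest.endswith(suffix):
                core = rest[:len(rest) - len(suffix)]
                if len(core) >= 2 and core not in out:
                    out.append(core)
    return out


def drop_infixed_vowels(core):
    """Remove weak letters from interior positions only."""
    inner = "".join(ch for ch in core[1:-1] if ch not in WEAK)
    return core[0] + inner + core[-1]


def swap_vowels(core):
    """Return variants with one weak letter replaced by another."""
    variants = []
    for i, ch in enumerate(core):
        if ch in WEAK:
            for v in WEAK:
                if v != ch:
                    variants.append(core[:i] + v + core[i + 1:])
    return variants


def extract_root(word):
    # Assumption: proper names are left untouched.
    if word in PROPER_NAMES:
        return word
    candidates = affix_candidates(word)
    # Stage 1: affix removal, then direct dictionary lookup.
    for c in candidates:
        if c in ROOTS:
            return c
    # Stage 2: drop infixed vowels and look up again.
    for c in candidates:
        reduced = drop_infixed_vowels(c)
        if reduced in ROOTS:
            return reduced
    # Stage 3: vowel replacement, then a final lookup.
    for c in candidates:
        for variant in swap_vowels(c):
            if variant in ROOTS:
                return variant
    # No match at any stage: keep the word unmodified.
    return word


if __name__ == "__main__":
    for w in ["مدرسة", "الكتاب", "قال", "بغداد", "تلفزيون"]:
        print(w, "->", extract_root(w))
```

With these toy lists, a word such as "مدرسة" resolves at the first stage, "الكتاب" at the second, and "قال" at the third, while a word with no dictionary match is returned as-is. A real system would need much larger dictionaries and proper pattern templates rather than the simple vowel dropping used here.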

American Psychological Association (APA)

Abu Salih, Bilal. 2019. Comprehensive processing for Arabic texts to extract their roots. Iraqi Journal of Science, Vol. 60, no. 6, pp.1404-1411.
https://search.emarefa.net/detail/BIM-969413

Modern Language Association (MLA)

Abu Salih, Bilal. Comprehensive processing for Arabic texts to extract their roots. Iraqi Journal of Science Vol. 60, no. 6 (2019), pp.1404-1411.
https://search.emarefa.net/detail/BIM-969413

American Medical Association (AMA)

Abu Salih, Bilal. Comprehensive processing for Arabic texts to extract their roots. Iraqi Journal of Science. 2019. Vol. 60, no. 6, pp.1404-1411.
https://search.emarefa.net/detail/BIM-969413

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references : p. 1411

Record ID

BIM-969413