Modern standard Arabic grammar automatic extraction from Penn Arabic Treebank using natural language toolkit
Other Title(s)
استخلاص قواعد النحو لتراكيب جمل اللغة العربية المعاصرة آليا باستخدام عينة لغوية من Penn Arabic Treebank باستخدام NLTK
Joint Authors
Abd al-Halim, Amirah
al-Ansari, Samih
Source
The Egyptian Journal of Language Engineering
Issue
Vol. 5, Issue 1 (30 Apr. 2018), pp.1-10, 10 p.
Publisher
Egyptian Society of Language Engineering
Publication Date
2018-04-30
Country of Publication
Egypt
No. of Pages
10
Main Subjects
Information Technology and Computer Science
Abstract EN
This paper presents a methodology for a rule-based, bottom-up parsing technique for Modern Standard Arabic (MSA) in the Context Free Grammar (CFG) formalism with a Phrase Structure Grammar (PSG) representation, where the grammar is automatically extracted from a syntactically annotated corpus. The extracted grammar is used to build an automatic lexicon and grammar rules module.
The extracted CFG is then transformed, also automatically, into a Probabilistic Context Free Grammar (PCFG) that could be used in a hybrid approach.
The corpus used is the Penn Arabic Treebank (PATB), and the algorithm is implemented with the Natural Language Toolkit (NLTK). The parser showed that automatic extraction of grammar improved the grammar building phase in both coverage of structures and time needed, but it still needs further manual addition of constraints.
Automatic extraction of grammar is able to enhance rule-based grammar parsers, and it will enable a new paradigm of statistically directed symbolic parsing.
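The pipeline the abstract describes (read bracketed parses, collect CFG productions, estimate a PCFG, parse bottom-up) can be sketched with NLTK as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two bracketed trees, their tag set, and the transliterated words are hypothetical toy data standing in for PATB annotations, and the helper name extract_productions is invented for this sketch.

```python
from nltk.grammar import CFG, Nonterminal, induce_pcfg
from nltk.parse import BottomUpChartParser
from nltk.tree import Tree

# Hypothetical toy parses in bracketed treebank style (verb-initial order).
# These are NOT taken from the PATB; they only mimic its general shape.
TOY_TREES = [
    "(S (VP (VBD kataba) (NP (DT al) (NN walad)) (NP (DT al) (NN dars))))",
    "(S (VP (VBD qaraa) (NP (DT al) (NN walad))))",
]


def extract_productions(bracketed_trees):
    """Collect every production (grammar rule and lexical entry) from the trees."""
    productions = []
    for text in bracketed_trees:
        tree = Tree.fromstring(text)
        productions.extend(tree.productions())
    return productions


start = Nonterminal("S")
productions = extract_productions(TOY_TREES)

# CFG: keep each distinct rule once (sorted only to make the order stable).
cfg = CFG(start, sorted(set(productions), key=str))

# PCFG: relative-frequency estimate, P(A -> rhs) = count(A -> rhs) / count(A).
pcfg = induce_pcfg(start, productions)

# Bottom-up chart parsing with the extracted CFG.
parser = BottomUpChartParser(cfg)
trees = list(parser.parse("kataba al walad al dars".split()))

if __name__ == "__main__":
    for production in pcfg.productions():
        print(production)
    for tree in trees:
        print(tree)
```

Lexical productions such as NN -> 'walad' play the role of the automatically built lexicon, while the remaining productions form the grammar rules. A grammar extracted this way accepts only the structures seen in the trees, which is consistent with the abstract's remark that manual constraints are still needed.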
American Psychological Association (APA)
Abd al-Halim, Amirah & al-Ansari, Samih. 2018. Modern standard Arabic grammar automatic extraction from Penn Arabic Treebank using natural language toolkit. The Egyptian Journal of Language Engineering, Vol. 5, no. 1, pp. 1-10.
https://search.emarefa.net/detail/BIM-941786
Modern Language Association (MLA)
Abd al-Halim, Amirah & al-Ansari, Samih. Modern standard Arabic grammar automatic extraction from Penn Arabic Treebank using natural language toolkit. The Egyptian Journal of Language Engineering Vol. 5, no. 1 (Apr. 2018), pp. 1-10.
https://search.emarefa.net/detail/BIM-941786
American Medical Association (AMA)
Abd al-Halim, Amirah & al-Ansari, Samih. Modern standard Arabic grammar automatic extraction from Penn Arabic Treebank using natural language toolkit. The Egyptian Journal of Language Engineering. 2018. Vol. 5, no. 1, pp. 1-10.
https://search.emarefa.net/detail/BIM-941786
Data Type
Journal Articles
Language
English
Notes
Record ID
BIM-941786