ANMorph-Amazigh nouns morphological analyzer: morphological analyzer for nouns in the Amazigh language

Dissertant

Raiss, Hanai

Thesis advisor

Cavalli Sforza, Violetta

University

Al Akhawayn University

Faculty

School of Science and Engineering

Department

Software Engineering

University Country

Morocco

Degree

Master

Degree Date

2012

English Abstract

Natural Language Processing (NLP) is an important field that aims to develop a representation of natural language that can be manipulated by a computer to perform specific tasks.

Lately, natural language processing has attracted considerable interest worldwide, but the Amazigh language, one of the official languages of Morocco since July 2011 and spoken by approximately 50% of the Moroccan population (Ameur et al., 2010), has not benefited much from computational research, so only a few tools are available for processing it.

This gap motivates the focus on NLP with special reference to the Amazigh language.

This paper presents ANMorph, a two-level morphological analyzer for Amazigh language nouns.

ANMorph is still under development; further development and evaluation are needed to make it a complete morphological analyzer for Amazigh nouns.

ANMorph will be of great importance to other NLP tools that will be developed for Amazigh language processing.

The ANMorph project achieves the following objectives:

• Building a corpus of Amazigh nouns
• Writing rules for Amazigh noun morphology
• Developing a morphological analyzer to analyze Amazigh nouns

The main aim of ANMorph is to automate morphological analysis of the Amazigh language and to provide a basic system for input to other NLP tools.

ANMorph is implemented using the Stuttgart Finite State Transducer (SFST), an open-source platform that provides finite-state transducer tools.

Building ANMorph started with some data processing, followed by a well-defined process in SFST to build the morphological analyzer for nouns in the Amazigh language.

In this project, we built a lexicon of 1541 nouns divided into sub-lexicons: free nouns, proper nouns, borrowed words, kinship nouns, and non-inflected nouns.

The results of analyzing the lexicon with SFST are very promising: only 9.32% of the nouns are analyzed incorrectly by ANMorph.

With regard to future work, the main points to be addressed are embedding other parts of speech (verbs, adjectives, adverbs) in the morphological analyzer and including words from other dialects (Tamazight and Tashelhit).

Main Subjects

Languages & Comparative Literature
Information Technology and Computer Science

No. of Pages

100

Table of Contents

Table of contents.

Abstract.

Abstract in Arabic.

Introduction.

Chapter One : Natural language processing-NLP.

Chapter Two : The Amazigh language.

Chapter Three : The Stuttgart finite state transducer-SFST.

Chapter Four : Data description.

Chapter Five : Building ANMorph using SFST.

Chapter Six : Analysis of result, conclusions and future work.

References.

American Psychological Association (APA)

Raiss, Hanai. (2012). ANMorph-Amazigh nouns morphological analyzer: morphological analyzer for nouns in the Amazigh language (Master's thesis). Al Akhawayn University, Morocco.
https://search.emarefa.net/detail/BIM-647373

Modern Language Association (MLA)

Raiss, Hanai. ANMorph-Amazigh nouns morphological analyzer: morphological analyzer for nouns in the Amazigh language. Master's thesis. Al Akhawayn University, 2012.
https://search.emarefa.net/detail/BIM-647373

American Medical Association (AMA)

Raiss, Hanai. (2012). ANMorph-Amazigh nouns morphological analyzer: morphological analyzer for nouns in the Amazigh language (Master's thesis). Al Akhawayn University, Morocco.
https://search.emarefa.net/detail/BIM-647373

Language

English

Data Type

Arab Theses

Record ID

BIM-647373