Middle Eastern and North African English Speech Corpus (MENAESC): Automatic Identification of MENA English Accents

Joint Authors

Ahfir, Maamar
Chellali, Sara
Kenai, Ouassila
Hidouci, Walid
al-Maadeed, Somaya

Source

The International Arab Journal of Information Technology

Issue

Vol. 18, Issue 1 (31 Jan. 2021), pp. 67-76, 10 p.

Publisher

Zarqa University Deanship of Scientific Research

Publication Date

2021-01-31

Country of Publication

Jordan

No. of Pages

10

Main Subjects

Information Technology and Computer Science

Abstract EN

This study aims to explore the English accents in the Arab world.

Although limited speech resources exist for automatically identifying the degree of accent of Arabic speakers of English, no speech corpus is dedicated specifically to Arabic speakers of English in the Middle East and North Africa (MENA).

To that end, various samples were collected to create a linguistic resource that we call the Middle Eastern and North African English Speech Corpus (MENAESC).

In addition to the “accent approach” applied in the field of automatic language/dialect recognition, we also applied the “macro-accent approach” (employing Mel-Frequency Cepstral Coefficients (MFCC), energy, and Shifted Delta Cepstra (SDC) features with a Gaussian Mixture Model-Universal Background Model (GMM-UBM) classifier) on four accents (Egyptian, Qatari, Syrian, and Tunisian) among the eleven accents that were selected based on their high population density in the location where the experiments were carried out.
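As an illustration of the SDC features mentioned above, the following is a minimal sketch of how shifted delta cepstra can be stacked from per-frame cepstral vectors. The function name `shifted_delta_cepstra` and the parameter defaults (d=1, P=3, k=7, a configuration commonly used in language recognition) are assumptions made for this example; the abstract does not state the settings used in the study.

```python
def shifted_delta_cepstra(frames, d=1, p=3, k=7):
    """Stack k delta blocks per frame (illustrative sketch, not the authors' code).

    frames: list of per-frame cepstral vectors (lists of floats), e.g. MFCCs.
    d: delta spread, p: shift between blocks, k: number of blocks stacked.
    Returns one vector of length len(frames[0]) * k per input frame.
    """
    n = len(frames)

    def frame_at(t):
        # Assumption: indices outside the utterance reuse the nearest edge frame.
        return frames[min(max(t, 0), n - 1)]

    sdc = []
    for t in range(n):
        vector = []
        for i in range(k):
            ahead = frame_at(t + i * p + d)
            behind = frame_at(t + i * p - d)
            # Delta for block i: c(t + iP + d) - c(t + iP - d)
            vector.extend(a - b for a, b in zip(ahead, behind))
        sdc.append(vector)
    return sdc
```

With 7 cepstral coefficients per frame and k=7, each frame yields a 49-dimensional SDC vector, which would then be combined with the static MFCC and energy features before classification.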

Using the Equal Error Rate percentage (EER%) to assess how effectively our system identifies MENA English accents on the MENAESC with the two approaches mentioned above, we obtained 1.5 to 2% for the “accent approach” and 2 to 3.5% for the “macro-accent approach”.

The results also showed that, of the four accents included, the Qatari accent scored the lowest EER% in all tests performed.

Taken together, the system effectiveness is affected not only by the approaches used, but also by the size and characteristics of the MENAESC database.
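For readers unfamiliar with the metric, the sketch below shows one simple way an EER% can be estimated from trial scores: sweep a decision threshold and take the point where the false acceptance and false rejection rates are closest. The function name `equal_error_rate` and the toy scores are hypothetical; this is not the evaluation code used in the study.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate EER as a percentage (illustrative sketch).

    target_scores: scores of trials where the claimed accent is correct.
    nontarget_scores: scores of trials where the claimed accent is wrong.
    Higher scores are assumed to mean stronger support for the claim.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: a correct trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: an incorrect trial scored at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return 100.0 * eer


# Toy scores (made up for illustration only)
print(equal_error_rate([2.0, 1.5, 1.0, 0.2], [0.5, 0.1, -0.3, -1.0]))
```

A lower EER% means the two error rates balance at a smaller value, so the system separates correct and incorrect accent claims more cleanly.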

Moreover, it is impacted by the proficiency of the Arabic speakers of English and the influence of their mother tongue.

American Psychological Association (APA)

Chellali, Sara & al-Maadeed, Somaya & Kenai, Ouassila & Ahfir, Maamar & Hidouci, Walid. 2021. Middle Eastern and North African English Speech Corpus (MENAESC): Automatic Identification of MENA English Accents. The International Arab Journal of Information Technology, Vol. 18, no. 1, pp. 67-76.
https://search.emarefa.net/detail/BIM-1430999

Modern Language Association (MLA)

Chellali, Sara…[et al.]. Middle Eastern and North African English Speech Corpus (MENAESC): Automatic Identification of MENA English Accents. The International Arab Journal of Information Technology Vol. 18, no. 1 (Jan. 2021), pp. 67-76.
https://search.emarefa.net/detail/BIM-1430999

American Medical Association (AMA)

Chellali, Sara & al-Maadeed, Somaya & Kenai, Ouassila & Ahfir, Maamar & Hidouci, Walid. Middle Eastern and North African English Speech Corpus (MENAESC): Automatic Identification of MENA English Accents. The International Arab Journal of Information Technology. 2021. Vol. 18, no. 1, pp. 67-76.
https://search.emarefa.net/detail/BIM-1430999

Data Type

Journal Articles

Language

English

Notes

Text in English ; abstracts in .

Record ID

BIM-1430999