Topic modeling of phonetic Latin-spelled Arabic for the relative analysis of genre-dependent and dialect-dependent variation

Joint Authors

Saqr, Ali
Hasegawa Johnson, Mark

Source

RIST

Issue

Vol. 20, Issue 2 (31 Dec. 2013), pp.69-77, 9 p.

Publisher

Research Centre for Scientific and Technical Information

Publication Date

2013-12-31

Country of Publication

Algeria

No. of Pages

9

Main Subjects

Information Technology and Computer Science

Abstract EN

We demonstrate a data collection and analysis system that can be used to analyze the relative contributions of dialect-dependent variation in the lexicon of speech-like Arabic text.

We use Latent Dirichlet Allocation (LDA), a generative probabilistic modeling method, to analyze an online chat corpus of phonetic Latin-spelled Arabic.

The corpus exhibits different word choices and word relations depending on dialect, which can aid in producing written forms of Arabic dialects despite the large differences between standard written Arabic and the many Arabic dialects.
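
The LDA step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which this record does not describe: the collapsed Gibbs sampler, the function names `lda_gibbs` and `top_words`, the hyperparameters `alpha` and `beta`, and the toy Latin-spelled tokens are all assumptions made for illustration, and the toy documents are invented rather than taken from the paper's corpus.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=100, seed=0):
    """Fit LDA to tokenized documents with a collapsed Gibbs sampler."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    # Count tables: per-document topic counts, per-topic word counts, topic totals.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # p(z = k | everything else) is proportional to
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word, vocab


def top_words(topic_word, n=3):
    """Return the n most frequent words assigned to each topic."""
    return [[w for w, _ in counts.most_common(n)] for counts in topic_word]


# Toy chat-like documents (invented for illustration only).
docs = [
    "kifak shu akhbarak kifak mnih".split(),
    "shu akhbarak mnih kifak".split(),
    "izzayak 3amel eh izzayak kwayes".split(),
    "3amel eh kwayes izzayak".split(),
]
doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

In a sketch like this, each row of `doc_topic` summarizes how strongly a document draws on each topic, which is the kind of per-document signal one could compare across dialect labels. The sampler is stochastic, so the topics found depend on the seed and on the number of iterations.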

Abstract FRE (translated from French)

We present a data collection and analysis system that can be used to analyze the relative contributions of dialect-dependent variation in the lexical domain of text resembling written Arabic.

To this end, we use Latent Dirichlet Allocation (LDA), a generative probabilistic modeling method, to analyze the phonetics of Arabic terms written in Latin characters and drawn from an online chat corpus.

This corpus yields different word choices and different conceptual relations depending on dialect, and therefore contributes to the written rendering of dialectal Arabic terms despite the wide gap between standard written Arabic and the many Arabic dialects.

American Psychological Association (APA)

Saqr, Ali & Hasegawa Johnson, Mark. 2013. Topic modeling of phonetic Latin-spelled Arabic for the relative analysis of genre-dependent and dialect-dependent variation. RIST, Vol. 20, no. 2, pp. 69-77.
https://search.emarefa.net/detail/BIM-427131

Modern Language Association (MLA)

Saqr, Ali & Hasegawa Johnson, Mark. Topic modeling of phonetic Latin-spelled Arabic for the relative analysis of genre-dependent and dialect-dependent variation. RIST Vol. 20, no. 2 (2013), pp. 69-77.
https://search.emarefa.net/detail/BIM-427131

American Medical Association (AMA)

Saqr, Ali & Hasegawa Johnson, Mark. Topic modeling of phonetic Latin-spelled Arabic for the relative analysis of genre-dependent and dialect-dependent variation. RIST. 2013. Vol. 20, no. 2, pp. 69-77.
https://search.emarefa.net/detail/BIM-427131

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references: p. 77

Record ID

BIM-427131