Morpheme based language model for Tamil speech recognition system

Co-authors

Saraswathi, Selvarajan
Geetha, Thekkumpurath

Source

The International Arab Journal of Information Technology

Issue

Vol. 4, no. 3 (31 July 2007), pp. 214-219, 6 p.

Publisher

Zarqa University

Publication Date

2007-07-31

Country of Publication

Jordan

Number of Pages

6

Main Subjects

Information Technology and Computer Science

Topics

Abstract (EN)

This paper describes the design of a morpheme-based language model for the Tamil language.

It aims to alleviate the main problems encountered in processing the Tamil language, such as the enormous vocabulary growth caused by the large number of different forms derived from a single word.

The size of the vocabulary is reduced by decomposing the words into stems and endings and storing these sub-word units (morphemes) for training the language model. The modified morpheme-based language model was applied to avoid ambiguities in the recognized Tamil words.
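The vocabulary reduction described here can be illustrated with a minimal sketch. It assumes a greedy longest-suffix stripping rule and a toy suffix list in Latin transliteration; `TOY_SUFFIXES`, `split_word`, and the sample word forms are hypothetical and are not taken from the paper, which does not specify its decomposition procedure in this abstract. The word forms are built by plain concatenation and ignore the sound changes of real Tamil.

```python
# Hypothetical toy suffix inventory (Latin transliteration), for illustration only.
# A real system would rely on a proper Tamil morphological analyzer.
TOY_SUFFIXES = ["kal", "ukku", "il", "ai"]


def split_word(word, suffixes=TOY_SUFFIXES, min_stem=2):
    """Split a word into [stem, '+ending', ...] by greedy longest-suffix stripping."""
    ordered = sorted(suffixes, key=len, reverse=True)  # try longer suffixes first
    endings = []
    stem = word
    stripped = True
    while stripped:
        stripped = False
        for suf in ordered:
            # Only strip if a stem of at least min_stem characters remains.
            if stem.endswith(suf) and len(stem) - len(suf) >= min_stem:
                endings.insert(0, "+" + suf)
                stem = stem[: -len(suf)]
                stripped = True
                break
    return [stem] + endings


# Made-up word forms: two stems combined with the toy endings.
words = ["maram", "maramkal", "maramil", "maramkalil",
         "malar", "malarkal", "malarai"]

word_vocab = set(words)
morph_vocab = {unit for w in words for unit in split_word(w)}

print(len(word_vocab), "word types ->", len(morph_vocab), "morpheme types")
```

In this toy setting the stems and endings are shared across word forms, so the set of units the language model has to store grows more slowly than the set of full word forms.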

Perplexity, out-of-vocabulary (OOV) rate, and word error rate (WER) were measured to assess the efficiency of the model in a Tamil speech recognition system.

The results were compared with the traditional word-based statistical bigram and trigram language models.
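For readers unfamiliar with these three measures, the following sketch shows their standard textbook definitions. The function names are illustrative assumptions; the abstract does not say which toolkit or exact scoring procedure the authors used.

```python
import math


def oov_rate(test_tokens, vocab):
    """Fraction of test tokens that do not appear in the training vocabulary."""
    if not test_tokens:
        return 0.0
    return sum(1 for t in test_tokens if t not in vocab) / len(test_tokens)


def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / reference length,
    computed with word-level Levenshtein distance."""
    if not reference:
        raise ValueError("reference must not be empty")
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, 1):
        cur = [i]
        for j, hyp_word in enumerate(hypothesis, 1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(reference)


def perplexity(token_probs):
    """exp of the average negative log-probability the model assigns per token."""
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(-log_sum / len(token_probs))
```

Lower values are better for all three: a lower OOV rate means fewer unseen units, a lower perplexity means the model predicts the test text more confidently, and a lower WER means fewer recognition errors.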

The results showed that the modified morpheme-based trigram model with Katz back-off smoothing improved the performance of the Tamil speech recognition system compared with the word-based N-gram language models.
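As background, the general form of a Katz back-off trigram estimate is sketched below. This is the standard formulation, not a formula quoted from the paper. In a morpheme-based model the units $w_i$ would be morphemes instead of full words. The symbols $C$, $d$, and $\alpha$ are notation assumed here.

```latex
\[
P_{\mathrm{katz}}(w_i \mid w_{i-2}, w_{i-1}) =
\begin{cases}
d \,\dfrac{C(w_{i-2}\, w_{i-1}\, w_i)}{C(w_{i-2}\, w_{i-1})}
  & \text{if } C(w_{i-2}\, w_{i-1}\, w_i) > 0, \\[2ex]
\alpha(w_{i-2}, w_{i-1}) \, P_{\mathrm{katz}}(w_i \mid w_{i-1})
  & \text{otherwise.}
\end{cases}
\]
```

Here $C(\cdot)$ is a training-set count, $d$ is a discount ratio that depends on the trigram count and is at most 1 (derived from Good-Turing estimates in Katz's original formulation), and $\alpha(w_{i-2}, w_{i-1})$ is the back-off weight chosen so that the probabilities for each history sum to one. The bigram estimate backs off to the unigram in the same way.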

American Psychological Association (APA) Citation Style

Saraswathi, Selvarajan & Geetha, Thekkumpurath. 2007. Morpheme based language model for Tamil speech recognition system. The International Arab Journal of Information Technology, Vol. 4, no. 3, pp. 214-219.
https://search.emarefa.net/detail/BIM-11683

Modern Language Association (MLA) Citation Style

Saraswathi, Selvarajan & Geetha, Thekkumpurath. Morpheme based language model for Tamil speech recognition system. The International Arab Journal of Information Technology Vol. 4, no. 3 (Jul. 2007), pp. 214-219.
https://search.emarefa.net/detail/BIM-11683

American Medical Association (AMA) Citation Style

Saraswathi, Selvarajan & Geetha, Thekkumpurath. Morpheme based language model for Tamil speech recognition system. The International Arab Journal of Information Technology. 2007. Vol. 4, no. 3, pp. 214-219.
https://search.emarefa.net/detail/BIM-11683

Data Type

Articles

Text Language

English

Notes

f

Record ID

BIM-11683