Enhancement quality and accuracy of speech recognition system using multimodal audio-visual speech signal
Other titles
Improving the quality and accuracy of speech recognition systems using the audio and visual speech signal
Co-authors
al-Maghribi, Islam Id Ali Muhammad
Judi, Amr Muhammad Rifat
Faruq, Hisham Muhammad
Source
The Egyptian Journal of Language Engineering
Issue
Vol. 4, no. 2 (30 September 2017), pp. 27-40, 14 p.
Publisher
Publication date
2017-09-30
Country of publication
Egypt
Number of pages
14
Main subjects
Abstract (EN)
Most developments in speech-based automatic recognition have relied on acoustic speech as the sole input signal, disregarding its visual counterpart.
However, recognition based on acoustic speech alone can be afflicted with deficiencies that prevent its use in many real-world applications, particularly under adverse conditions.
The combination of auditory and visual modalities promises higher recognition accuracy and robustness than can be obtained with a single modality.
Multimodal recognition is therefore acknowledged as a vital component of the next generation of spoken language systems.
This paper aims to build a connected-word audio-visual speech recognition (AV-ASR) system for the English language that uses both acoustic and visual speech information to improve recognition performance.
First, Mel-frequency cepstral coefficients (MFCCs) are extracted from the speech files as the audio features.
For the visual counterpart, Discrete Cosine Transform (DCT) coefficients are extracted from the speaker's mouth region as the visual features, and Principal Component Analysis (PCA) is used for dimensionality reduction.
These visual features are then concatenated with the conventional audio features, and the resulting feature vectors are used to train the parameters of word-level hidden Markov models (HMMs).
The system has been developed using the Hidden Markov Model Toolkit (HTK), which uses HMMs for recognition.
The potential of the suggested approach is demonstrated by a preliminary experiment on the GRID sentence database, one of the largest databases available for audio-visual recognition, which contains continuous English voice commands for a small-vocabulary task.
The experimental results show that the proposed audio-visual speech recognizer (AV-ASR) achieves a higher recognition rate than an audio-only recognizer and performs robustly.
In the speaker-independent test, the grammar-based word recognition system achieves an increase in success rate of 4% across all speakers.
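Illustrative sketch (not part of the record): the abstract describes DCT features from the mouth region, PCA for dimensionality reduction, and concatenation with the audio features. The Python code below is a minimal sketch of that kind of pipeline using NumPy only. Everything specific in it is an assumption made for illustration and is not taken from the paper: the function names, the 32x48 mouth region, keeping a 6x6 block of low-frequency DCT coefficients, 10 PCA components, 39-dimensional audio vectors, the frame counts, and the nearest-frame alignment of video to audio. The audio features are random placeholders, not real MFCCs, and HMM training with HTK is not shown.

```python
import numpy as np

def dct_matrix(n):
    """Return the orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m

def mouth_dct_features(roi, keep=6):
    """2-D DCT of a grayscale mouth region; keep the low-frequency keep x keep block."""
    rows, cols = roi.shape
    coeffs = dct_matrix(rows) @ roi @ dct_matrix(cols).T
    return coeffs[:keep, :keep].ravel()

def pca_fit(x, n_components):
    """Estimate the mean and the leading principal directions of the rows of x."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(x, mean, components):
    """Project the rows of x onto the principal directions."""
    return (x - mean) @ components.T

def fuse(audio, visual):
    """Repeat each visual frame to match the audio frame count, then concatenate."""
    n_a, n_v = len(audio), len(visual)
    idx = np.minimum(np.arange(n_a) * n_v // n_a, n_v - 1)
    return np.hstack([audio, visual[idx]])

# Hypothetical input sizes, chosen only to make the sketch runnable.
rng = np.random.default_rng(0)
video = rng.random((25, 32, 48))   # 25 mouth-region frames of 32x48 pixels
audio = rng.random((100, 39))      # placeholder for 100 frames of audio features

visual_raw = np.stack([mouth_dct_features(f) for f in video])   # shape (25, 36)
mean, comps = pca_fit(visual_raw, n_components=10)
visual = pca_transform(visual_raw, mean, comps)                 # shape (25, 10)
fused = fuse(audio, visual)                                     # shape (100, 49)
```

In practice the PCA projection would be estimated on training data only and reused at test time, and the fused vectors would be written out in whatever format the recognizer expects.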
American Psychological Association (APA) citation style
al-Maghribi, Islam Id Ali Muhammad & Judi, Amr Muhammad Rifat & Faruq, Hisham Muhammad. 2017. Enhancement quality and accuracy of speech recognition system using multimodal audio-visual speech signal. The Egyptian Journal of Language Engineering, Vol. 4, no. 2, pp. 27-40.
https://search.emarefa.net/detail/BIM-942185
Modern Language Association (MLA) citation style
al-Maghribi, Islam Id Ali Muhammad…[et al.]. Enhancement quality and accuracy of speech recognition system using multimodal audio-visual speech signal. The Egyptian Journal of Language Engineering Vol. 4, no. 2 (Sep. 2017), pp. 27-40.
https://search.emarefa.net/detail/BIM-942185
American Medical Association (AMA) citation style
al-Maghribi, Islam Id Ali Muhammad & Judi, Amr Muhammad Rifat & Faruq, Hisham Muhammad. Enhancement quality and accuracy of speech recognition system using multimodal audio-visual speech signal. The Egyptian Journal of Language Engineering. 2017. Vol. 4, no. 2, pp. 27-40.
https://search.emarefa.net/detail/BIM-942185
Data type
Articles
Text language
English
Notes
Record number
BIM-942185