Speaker identification using wavelet transform and probabilistic neural network

Other Title(s)

معرفة الشخص المتكلم باستخدام تقنية التحول المويجي و الشبكات العصبية

Publication Date

2011-03-31

Country of Publication

Iraq

Main Subjects

Information Technology and Computer Science

Abstract AR

معرفة الشخص المتكلم هو آلية للتعرف ذاتيا على الشخص بالاعتماد على معلومات فريدة متضمنة في الموجات الكلامية أو الصوتية الصادرة من الشخص المتحدث.

هذه التقنية تجعل من الممكن استخدام صوت الشخص للتحقق من هويته، للسيطرة على الوصول لخدمات أخرى مثل التعامل مع البنوك، أو التسوق من خلال جهاز التليفون أو الوصول إلى بيانات ضرورية، أو السيطرة الأمنية للتعامل مع المعلومات.

آما الهدف من البحث هو بناء نظام معرفة المتكلم الذاتي لمجموعة محددة من الأشخاص في حالتي النص المعتمد و النص غير المعتمد.

إن العمل بصورة عامة يتكون من طورين و هما : طور التدريب، الذي يتضمن بناء قاعدة بيانات تشمل كل المتكلمين، و الطور الثاني هو طور الاختبار أو التعرف على المتكلم، الذي يتضمن عملية مقارنة ما بين الأنموذج غير المعرف مع التقدير قاعدة البيانات، لتحديد المتكلم.

إن تقنية التحول المويجي كانت قد انتشرت في معظم تطبيقات معالجة الإشارة الرقمية و قد أدى دورا مهما في معالجة الإشارة الصوتية و تحليلاتها، و خاصة في تقنية معرفة المتكلم و ذلك بسبب أدائه الأقوى فيما يتعلق بالتحليلات المتعددة الانتشار النظام المقترح يتشكل من ثلاث مراحل و هي : المرحلة الأولى و هي مرحلة التهيؤ للمعالجة و فيها يتم تقطيع الإشارة إلى اطر عديدة و كل واحد من هذه الأطر سوف يضرب ب Hamming window آما المرحلة الثانية فهي مرحلة استخلاص الخواص و فيها يتم استخلاص الصفات المميزة لكل كلمة مدخلة باستخدام تقنية (التحول المويجي) و أن النتيجة من هذه المرحلة هو متجه خواص.

و في المرحلة الأخيرة و هي مرحلة التصنيف و فيها يتم استعمال (متجه الخواص) المنتج من المرحلة السابقة بوصفه مدخلا للشبكة العصبية.

القيم الناتجة تبين أن الخواص الصوتية تكون فعالة جدا في معرفة المتكلم، و أن الخوارزمية المقترحة هي فعالة فيما يتعلق ب (تقليل العمليات الحسابية، و تقليل الزمن المستغرق في التنفيذ).

Abstract EN

Speaker identification is the process of automatically identifying who is speaking on the basis of individual information contained in speech waves.

This technique makes it possible to use speakers' voices to verify their identity and to control access to services such as voice dialing, banking by telephone, telephone shopping, database access services, information services, voice mail, security control for confidential information areas, and remote access to computers.

The goal of this project is to build an automatic speaker identification system for a closed set of speakers, in both text-dependent and text-independent modes.

The work generally includes two main phases: the training phase, which is used to build the speaker database, and the identification (testing) phase, which is used to compare the unknown model with the models stored in the speaker database.

The wavelet transform has spread into most digital signal processing applications.

It plays a very important role in speech signal processing and analysis, and especially in speaker identification, because of its superior performance in multi-resolution analysis.

The proposed system consists of three stages. The first is the preprocessing stage, in which the speech signal is divided into many frames and each frame is multiplied by a Hamming window.
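The preprocessing step described here can be illustrated with a minimal sketch. The frame length, hop size, sample rate, and test tone below are assumptions chosen for illustration; the abstract does not state the values used by the authors.

```python
import math

def hamming_window(n):
    """Return the n-point Hamming window w[k] = 0.54 - 0.46*cos(2*pi*k/(n-1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]

def frame_signal(signal, frame_len, hop):
    """Split a signal into (possibly overlapping) frames and window each one."""
    window = hamming_window(frame_len)
    frames = []
    # Only full frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames

# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
frames = frame_signal(signal, frame_len=256, hop=128)
print(len(frames), len(frames[0]))
```

The window tapers each frame toward its edges, which reduces spectral leakage before the features are extracted.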

In the second stage, feature extraction, the discriminative features of each spoken word are extracted using the discrete wavelet transform (DWT); the result of this stage is a feature vector for each speaker.
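As a rough illustration of DWT-based feature extraction, the sketch below uses the Haar wavelet and sub-band energies. Both choices are assumptions: the abstract names neither the wavelet family, the number of decomposition levels, nor how coefficients are turned into a feature vector.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def dwt_features(frame, levels=3):
    """Assumed feature vector: mean energy of each detail sub-band,
    followed by the mean energy of the final approximation."""
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.append(sum(d * d for d in detail) / len(detail))
    features.append(sum(a * a for a in approx) / len(approx))
    return features

# Hypothetical frame: 64 samples of a sine wave.
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
print(dwt_features(frame, levels=3))
```

Each decomposition level halves the number of approximation coefficients, so the feature vector stays short (here `levels + 1` values) regardless of frame length.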

In the third stage, classification, the feature vector of each speaker is used as input to the neural network.
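The title names a probabilistic neural network (PNN) as the classifier. The following is a minimal sketch of a generic PNN decision rule with Gaussian kernels, not the authors' implementation; the smoothing parameter `sigma`, the speaker labels, and the two-dimensional feature vectors are hypothetical.

```python
import math

def pnn_classify(x, training_set, sigma=0.5):
    """Return the label whose Gaussian-kernel (Parzen) density estimate at x is largest.

    training_set maps a speaker label to a list of stored feature vectors.
    sigma is an assumed smoothing parameter.
    """
    best_label, best_score = None, -1.0
    for label, patterns in training_set.items():
        # Pattern layer: one Gaussian unit per stored training vector.
        total = 0.0
        for p in patterns:
            dist2 = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-dist2 / (2.0 * sigma ** 2))
        # Summation layer: average activation over the class.
        score = total / len(patterns)
        # Decision layer: keep the class with the highest score.
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical speaker database with two enrolled speakers.
training = {
    "speaker_a": [[0.0, 0.1], [0.1, 0.0]],
    "speaker_b": [[1.0, 0.9], [0.9, 1.1]],
}
print(pnn_classify([0.05, 0.05], training))
print(pnn_classify([1.0, 1.0], training))
```

A PNN needs no iterative weight training: enrolling a speaker amounts to storing that speaker's feature vectors, which fits the training and identification phases described in the abstract.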

The results show that phonetic features are powerful for speaker identification and that the proposed algorithm is efficient, reducing both the number of computations and the execution time.

American Psychological Association (APA)

Abd al-Latif, Sali Ali & Yusuf, Intisar Abd. 2011. Speaker identification using wavelet transform and probabilistic neural network. Ibn al-Haitham Journal for Pure and Applied Science, Vol. 24, no. 1.
https://search.emarefa.net/detail/BIM-286788

Modern Language Association (MLA)

Abd al-Latif, Sali Ali & Yusuf, Intisar Abd. Speaker identification using wavelet transform and probabilistic neural network. Ibn al-Haitham Journal for Pure and Applied Science Vol. 24, no. 1 (2011).
https://search.emarefa.net/detail/BIM-286788

American Medical Association (AMA)

Abd al-Latif, Sali Ali & Yusuf, Intisar Abd. Speaker identification using wavelet transform and probabilistic neural network. Ibn al-Haitham Journal for Pure and Applied Science. 2011. Vol. 24, no. 1.
https://search.emarefa.net/detail/BIM-286788

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-286788
