Age estimation in short speech utterances based on bidirectional gated-recurrent neural networks
Joint Authors
Abd Hasan, Alya K.
Badr, Amir Abd al-Baqi
Source
Engineering and Technology Journal
Issue
Vol. 39, Issue 1B (31 Jan. 2021), pp.129-140, 12 p.
Publication Date
2021-01-31
Country of Publication
Iraq
No. of Pages
12
Main Subjects
Information Technology and Computer Science
Abstract EN
Recently, age estimation from speech has received growing interest because it is important for many applications, such as custom call routing, targeted marketing, and user profiling.
In this work, an automatic system to estimate age in short speech utterances without depending on the text is proposed.
From each utterance frame, four groups of features are extracted, and then 10 statistical functionals are measured for each extracted feature dimension, followed by dimensionality reduction using Linear Discriminant Analysis (LDA).
Finally, bidirectional Gated Recurrent Neural Networks (G-RNNs) are used to predict speaker age.
Experiments are conducted on the VoxCeleb1 dataset to show the performance of the proposed system, which is the first attempt to use this dataset for such a system.
In the gender-dependent system, the Mean Absolute Error (MAE) of the proposed system is 9.25 years for female speakers and 10.33 years for male speakers, and the Root Mean Square Error (RMSE) is 13.17 and 13.26, respectively.
In the gender-independent system, the MAE of the proposed system is 10.96 years, and the RMSE is 15.47.
The results show that the proposed system performs well on short-duration utterances, taking into consideration the high noise ratio in the VoxCeleb1 dataset.
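The two error measures reported in the abstract can be illustrated with a minimal sketch. The function names and the sample ages below are hypothetical and are not taken from the article; only the standard definitions of MAE and RMSE are assumed.

```python
import math


def mae(true_ages, predicted_ages):
    """Mean Absolute Error: average of |true - predicted|, in years."""
    errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)


def rmse(true_ages, predicted_ages):
    """Root Mean Square Error: square root of the mean squared difference."""
    squared = [(t - p) ** 2 for t, p in zip(true_ages, predicted_ages)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up ages for illustration only; these are not data from the article.
true_ages = [30, 40, 50, 60]
predicted_ages = [32, 38, 50, 70]

print("MAE:", mae(true_ages, predicted_ages))
print("RMSE:", rmse(true_ages, predicted_ages))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions and grows faster when a few predictions are far off, which is consistent with the RMSE values in the abstract exceeding the corresponding MAE values.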
American Psychological Association (APA)
Badr, Amir Abd al-Baqi & Abd Hasan, Alya K. 2021. Age estimation in short speech utterances based on bidirectional gated-recurrent neural networks. Engineering and Technology Journal, Vol. 39, no. 1B, pp. 129-140.
https://search.emarefa.net/detail/BIM-1282621
Modern Language Association (MLA)
Badr, Amir Abd al-Baqi & Abd Hasan, Alya K. Age estimation in short speech utterances based on bidirectional gated-recurrent neural networks. Engineering and Technology Journal Vol. 39, no. 1B (2021), pp. 129-140.
https://search.emarefa.net/detail/BIM-1282621
American Medical Association (AMA)
Badr, Amir Abd al-Baqi & Abd Hasan, Alya K. Age estimation in short speech utterances based on bidirectional gated-recurrent neural networks. Engineering and Technology Journal. 2021. Vol. 39, no. 1B, pp. 129-140.
https://search.emarefa.net/detail/BIM-1282621
Data Type
Journal Articles
Language
English
Notes
-
Record ID
BIM-1282621