Age estimation in short speech utterances based on bidirectional gated-recurrent neural networks

Joint Authors

Abd Hasan, Alya K.
Badr, Amir Abd al-Baqi

Source

Engineering and Technology Journal

Issue

Vol. 39, Issue 1B (31 Jan. 2021), pp.129-140, 12 p.

Publisher

University of Technology

Publication Date

2021-01-31

Country of Publication

Iraq

No. of Pages

12

Main Subjects

Information Technology and Computer Science

Topics

Abstract EN

Recently, age estimation from speech has received growing interest because it is important for many applications, such as custom call routing, targeted marketing, and user profiling.

In this work, an automatic system to estimate age in short speech utterances without depending on the text is proposed.

From each utterance frame, four groups of features are extracted, and then 10 statistical functionals are measured for each extracted feature dimension, followed by dimensionality reduction using Linear Discriminant Analysis (LDA).

Finally, bidirectional Gated Recurrent Neural Networks (G-RNNs) are used to predict speaker age.

Experiments are conducted on the VoxCeleb1 dataset to show the performance of the proposed system, which is the first attempt to evaluate such a system on this dataset.

In the gender-dependent system, the Mean Absolute Error (MAE) of the proposed system is 9.25 years for female speakers and 10.33 years for male speakers, and the Root Mean Square Error (RMSE) is 13.17 and 13.26, respectively.

In the gender-independent system, the MAE of the proposed system is 10.96 years, and the RMSE is 15.47.

The results show that the proposed system performs well on short-duration utterances, taking into consideration the high noise ratio in the VoxCeleb1 dataset.
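
As a rough illustration of two steps named in the abstract, the sketch below collapses frame-level features into a fixed-length utterance vector using ten statistical functionals per dimension, and computes the MAE and RMSE used to report results. The abstract does not list which ten functionals the authors used, so the set chosen here (mean, standard deviation, minimum, maximum, range, median, quartiles, skewness, kurtosis) is an assumption, as are all function names. The LDA and bidirectional gated-recurrent stages are not shown.

```python
import math
import statistics


def functionals(values):
    """Ten summary statistics for one feature dimension across frames.

    The choice of statistics is an assumption for illustration only.
    Requires at least two values (statistics.quantiles needs two points).
    """
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    q25, median, q75 = statistics.quantiles(values, n=4, method="inclusive")
    if std > 0:
        skew = sum((v - mean) ** 3 for v in values) / n / std ** 3
        kurt = sum((v - mean) ** 4 for v in values) / n / std ** 4
    else:
        # Constant dimension: higher moments are undefined, report zero.
        skew = kurt = 0.0
    lo, hi = min(values), max(values)
    return [mean, std, lo, hi, hi - lo, median, q25, q75, skew, kurt]


def utterance_functionals(frames):
    """Map a list of frame vectors to one vector of length 10 * num_dims."""
    vector = []
    for dim_values in zip(*frames):  # iterate over feature dimensions
        vector.extend(functionals(list(dim_values)))
    return vector


def mae(true_ages, predicted_ages):
    """Mean Absolute Error, in the same unit as the ages (years)."""
    errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)


def rmse(true_ages, predicted_ages):
    """Root Mean Square Error; penalizes large errors more than MAE."""
    squared = [(t - p) ** 2 for t, p in zip(true_ages, predicted_ages)]
    return math.sqrt(sum(squared) / len(squared))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which is consistent with the reported RMSE values exceeding the MAE values.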

American Psychological Association (APA)

Badr, Amir Abd al-Baqi & Abd Hasan, Alya K. 2021. Age estimation in short speech utterances based on bidirectional gated-recurrent neural networks. Engineering and Technology Journal, Vol. 39, no. 1B, pp. 129-140.
https://search.emarefa.net/detail/BIM-1282621

Modern Language Association (MLA)

Badr, Amir Abd al-Baqi & Abd Hasan, Alya K. Age estimation in short speech utterances based on bidirectional gated-recurrent neural networks. Engineering and Technology Journal Vol. 39, no. 1B (2021), pp. 129-140.
https://search.emarefa.net/detail/BIM-1282621

American Medical Association (AMA)

Badr, Amir Abd al-Baqi & Abd Hasan, Alya K. Age estimation in short speech utterances based on bidirectional gated-recurrent neural networks. Engineering and Technology Journal. 2021. Vol. 39, no. 1B, pp. 129-140.
https://search.emarefa.net/detail/BIM-1282621

Data Type

Journal Articles

Language

English

Notes

-

Record ID

BIM-1282621