Multitask Learning with Local Attention for Tibetan Speech Recognition

Joint Authors

Wang, Hui
Gao, Fei
Zhao, Yue
Yang, Li
Yue, Jianjian
Ma, Huilin

Source

Complexity

Issue

Vol. 2020, Issue 2020 (31 Dec. 2020), pp.1-10, 10 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2020-12-18

Country of Publication

Egypt

No. of Pages

10

Main Subjects

Philosophy

Abstract EN

In this paper, we propose incorporating local attention into WaveNet-CTC to improve the performance of Tibetan speech recognition in multitask learning.

As the number of tasks increases, for example when Tibetan speech content recognition, dialect identification, and speaker recognition are performed simultaneously, the speech recognition accuracy of a single WaveNet-CTC model decreases.

Inspired by the attention mechanism, we introduce local attention to automatically tune the weights of feature frames within a window and to pay different amounts of attention to context information for multitask learning.

The experimental results show that our method improves the accuracies of speech recognition for all Tibetan dialects in three-task learning, compared with the baseline model.

Furthermore, our method significantly improves the accuracy for the low-resource dialect by 5.11% compared with the dialect-specific model.
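
The windowed re-weighting of feature frames described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, the window size, or how the attention output feeds WaveNet-CTC, so everything below is an assumption made for illustration: the dot-product score, the symmetric window, and the names `softmax`, `local_attention`, and `half_width` are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_attention(frames, half_width):
    """Replace each frame by a weighted sum of the frames in a window around it.

    frames: list of equal-length feature vectors (lists of floats).
    half_width: number of neighbouring frames considered on each side.
    The dot-product score used here is an assumed stand-in, not the
    scoring function of the paper.
    """
    out = []
    for t, query in enumerate(frames):
        # Clip the window at the start and end of the utterance.
        lo = max(0, t - half_width)
        hi = min(len(frames), t + half_width + 1)
        window = frames[lo:hi]
        # Score every frame in the window against the centre frame.
        scores = [sum(q * k for q, k in zip(query, key)) for key in window]
        weights = softmax(scores)
        # Context vector: weighted sum of the window frames, per dimension.
        context = [
            sum(w * key[d] for w, key in zip(weights, window))
            for d in range(len(query))
        ]
        out.append(context)
    return out


if __name__ == "__main__":
    toy_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for vec in local_attention(toy_frames, half_width=1):
        print(vec)
```

With `half_width=0` each window contains only the centre frame, so the input is returned unchanged; larger windows let each frame draw on more of its neighbouring context.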

American Psychological Association (APA)

Wang, Hui & Gao, Fei & Zhao, Yue & Yang, Li & Yue, Jianjian & Ma, Huilin. 2020. Multitask Learning with Local Attention for Tibetan Speech Recognition. Complexity, Vol. 2020, no. 2020, pp. 1-10.
https://search.emarefa.net/detail/BIM-1145206

Modern Language Association (MLA)

Wang, Hui…[et al.]. Multitask Learning with Local Attention for Tibetan Speech Recognition. Complexity No. 2020 (2020), pp. 1-10.
https://search.emarefa.net/detail/BIM-1145206

American Medical Association (AMA)

Wang, Hui & Gao, Fei & Zhao, Yue & Yang, Li & Yue, Jianjian & Ma, Huilin. Multitask Learning with Local Attention for Tibetan Speech Recognition. Complexity. 2020. Vol. 2020, no. 2020, pp. 1-10.
https://search.emarefa.net/detail/BIM-1145206

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1145206