Deep Metric Learning-Assisted 3D Audio-Visual Speaker Tracking via Two-Layer Particle Filter

Co-authors

Chen, Yang
Yang, Bing
Li, Yidi
Ding, Runwei
Liu, Hong

Source

Complexity

Issue

Vol. 2020, Issue 2020 (31 Dec. 2020), pp. 1-8, 8 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2020-08-31

Country of Publication

Egypt

No. of Pages

8

Main Subjects

Philosophy

Abstract EN

For speaker tracking, integrating multimodal information from audio and video provides an effective and promising solution.

The main challenge at present is constructing a stable observation model.

To this end, we propose a 3D audio-visual speaker tracker assisted by deep metric learning within a two-layer particle filter framework.

First, an audio-guided motion model generates candidate samples in a hierarchical structure consisting of an audio layer and a visual layer.

Then, a stable observation model is built on a purpose-designed Siamese network, which provides a similarity-based likelihood for computing particle weights.

The speaker position is estimated using an optimal particle set, which integrates the decisions from audio particles and visual particles.

Finally, a template update strategy based on a long short-term mechanism is adopted to prevent drift during tracking.

Experimental results demonstrate that the proposed method outperforms single-modal trackers and the other methods compared.

Efficient and robust tracking is achieved both in 3D space and on the image plane.
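
The weighting step described in the abstract (a similarity-based likelihood that sets particle weights, followed by a position estimate from the weighted particles) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: cosine similarity stands in for the output of the Siamese network, the exponential mapping and its `sharpness` parameter are assumed, and the function names are hypothetical. The record does not specify how the audio and visual layers are fused, so the sketch covers a single set of particles only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_likelihood(similarity, sharpness=10.0):
    """Map a similarity in [-1, 1] to a positive likelihood.

    The exponential form is an assumption made for this sketch.
    """
    return math.exp(sharpness * (similarity - 1.0))


def estimate_position(particles, embeddings, template, sharpness=10.0):
    """Weight each 3D particle by its similarity to the template.

    particles:  list of (x, y, z) candidate positions
    embeddings: one feature vector per particle (hypothetical stand-in
                for features produced by a learned network)
    template:   feature vector of the tracked speaker
    Returns (weighted-mean position, normalized weights).
    """
    if not particles or len(particles) != len(embeddings):
        raise ValueError("need one embedding per particle")
    raw = [
        similarity_likelihood(cosine_similarity(e, template), sharpness)
        for e in embeddings
    ]
    total = sum(raw)
    weights = [w / total for w in raw]
    position = tuple(
        sum(w * p[i] for w, p in zip(weights, particles)) for i in range(3)
    )
    return position, weights
```

Particles whose features resemble the template more closely receive larger weights, so the estimate is pulled toward them; a larger `sharpness` concentrates the weight on the best-matching particles.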

American Psychological Association (APA) Citation Style

Li, Yidi & Liu, Hong & Yang, Bing & Ding, Runwei & Chen, Yang. 2020. Deep Metric Learning-Assisted 3D Audio-Visual Speaker Tracking via Two-Layer Particle Filter. Complexity, Vol. 2020, no. 2020, pp. 1-8.
https://search.emarefa.net/detail/BIM-1141667

Modern Language Association (MLA) Citation Style

Li, Yidi… [et al.]. Deep Metric Learning-Assisted 3D Audio-Visual Speaker Tracking via Two-Layer Particle Filter. Complexity No. 2020 (2020), pp. 1-8.
https://search.emarefa.net/detail/BIM-1141667

American Medical Association (AMA) Citation Style

Li, Yidi & Liu, Hong & Yang, Bing & Ding, Runwei & Chen, Yang. Deep Metric Learning-Assisted 3D Audio-Visual Speaker Tracking via Two-Layer Particle Filter. Complexity. 2020. Vol. 2020, no. 2020, pp. 1-8.
https://search.emarefa.net/detail/BIM-1141667

Data Type

Articles

Text Language

English

Notes

Includes bibliographical references

Record ID

BIM-1141667