Selection of In-Domain Bilingual Sentence Pairs Based on Topic Information

Co-authors

Li, Bin
Yao, Jianmin

Source

Scientific Programming

Issue

Vol. 2020, Issue 2020 (31 Dec. 2020), pp. 1-7, 7 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2020-12-15

Country of Publication

Egypt

Number of Pages

7

Main Subjects

Mathematics

Abstract EN

The performance of a machine translation system (MTS) depends on the quality and size of the training data.

Effective methods for extending the training data of an MTS in specific domains, and thereby improving translation performance, still need to be explored.

A method for selecting in-domain bilingual sentence pairs based on the topic information is proposed.

Using the topic relevance of bilingual sentence pairs to the target domain, subsets of sentence pairs related to the texts to be translated are selected from a large-scale bilingual corpus. These subsets are used to train a domain-specific translation system and improve translation quality for in-domain texts.

In experiments, bilingual sentence pairs are selected with the proposed method and then used to train the MTS.

In this way, the translation performance is greatly enhanced.
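
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which topic model or relevance measure the authors use, so the sketch below substitutes a simple stand-in: bag-of-words cosine similarity between the source side of each sentence pair and a profile built from in-domain text. The function names (`bag_of_words`, `cosine`, `select_in_domain`), the `top_k` parameter, and the toy English-German corpus are all assumptions made for illustration, not part of the paper.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lowercased whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_in_domain(pairs, domain_texts, top_k):
    """Return the top_k (source, target) pairs most relevant to the domain.

    Relevance is the cosine similarity between the source sentence and a
    word-count profile of domain_texts. This is an assumed proxy for topic
    relevance, not the paper's actual scoring function.
    """
    profile = Counter()
    for text in domain_texts:
        profile.update(bag_of_words(text))
    scored = [(cosine(bag_of_words(src), profile), src, tgt) for src, tgt in pairs]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(src, tgt) for _, src, tgt in scored[:top_k]]


# Toy data, invented for this example.
corpus = [
    ("the patient received a dose of the drug",
     "der Patient erhielt eine Dosis des Medikaments"),
    ("the stock market fell sharply today",
     "der Aktienmarkt fiel heute stark"),
    ("the drug reduced the patient fever",
     "das Medikament senkte das Fieber des Patienten"),
]
domain_texts = ["the doctor gave the patient a new drug"]

selected = select_in_domain(corpus, domain_texts, top_k=2)
for src, tgt in selected:
    print(src, "|||", tgt)
```

In a fuller pipeline, the count vectors would presumably be replaced by topic distributions inferred from a trained topic model, and the selected subset would then be passed to MTS training; neither step is shown here.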

American Psychological Association (APA) Citation Style

Li, Bin & Yao, Jianmin. 2020. Selection of In-Domain Bilingual Sentence Pairs Based on Topic Information. Scientific Programming, Vol. 2020, no. 2020, pp. 1-7.
https://search.emarefa.net/detail/BIM-1209294

Modern Language Association (MLA) Citation Style

Li, Bin & Yao, Jianmin. Selection of In-Domain Bilingual Sentence Pairs Based on Topic Information. Scientific Programming No. 2020 (2020), pp. 1-7.
https://search.emarefa.net/detail/BIM-1209294

American Medical Association (AMA) Citation Style

Li, Bin & Yao, Jianmin. Selection of In-Domain Bilingual Sentence Pairs Based on Topic Information. Scientific Programming. 2020. Vol. 2020, no. 2020, pp. 1-7.
https://search.emarefa.net/detail/BIM-1209294

Data Type

Articles

Text Language

English

Notes

Includes bibliographical references

Record ID

BIM-1209294