Selection of In-Domain Bilingual Sentence Pairs Based on Topic Information
Joint Authors
Li, Bin
Yao, Jianmin
Source
Scientific Programming
Issue
Vol. 2020, Issue 2020 (31 Dec. 2020), pp.1-7, 7 p.
Publisher
Hindawi Publishing Corporation
Publication Date
2020-12-15
Country of Publication
Egypt
No. of Pages
7
Main Subjects
Abstract EN
The performance of a machine translation system (MTS) depends on the quality and size of the training data.
How to extend the training data for a domain-specific MTS effectively, so as to improve translation performance, remains to be explored.
A method for selecting in-domain bilingual sentence pairs based on topic information is proposed.
Using the topic relevance of bilingual sentence pairs to the target domain, subsets of sentence pairs related to the texts to be translated are selected from a large-scale bilingual corpus. These subsets are then used to train a domain-specific translation system, improving translation quality for in-domain texts.
In the experiments, bilingual sentence pairs are selected with the proposed method and then used to train the MTS, which greatly improves translation performance.
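The selection step described in the abstract can be illustrated with a minimal sketch. The record does not say which topic model or relevance score the authors use, so everything here is an assumption: a bag-of-words profile of a small in-domain sample stands in for the topic representation, cosine similarity stands in for topic relevance, and the names `bag_of_words`, `cosine`, and `select_in_domain` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens as a term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_in_domain(pairs, domain_texts, top_k):
    """Keep the top_k (source, target) pairs whose source side is most
    similar to the in-domain profile (a stand-in for topic relevance)."""
    profile = Counter()
    for text in domain_texts:
        profile.update(bag_of_words(text))
    ranked = sorted(
        pairs,
        key=lambda pair: cosine(bag_of_words(pair[0]), profile),
        reverse=True,
    )
    return ranked[:top_k]


# Toy general-domain corpus (illustrative data, not from the paper).
corpus = [
    ("the patient received a dose of the drug",
     "der Patient erhielt eine Dosis des Medikaments"),
    ("the stock market fell sharply today",
     "der Aktienmarkt fiel heute stark"),
    ("the drug reduced symptoms in the patient",
     "das Medikament linderte die Symptome des Patienten"),
]
# Small sample of text from the target (medical) domain.
domain_sample = ["the patient was given the drug", "drug dose and symptoms"]

selected = select_in_domain(corpus, domain_sample, top_k=2)
```

In this toy run the two medical pairs are kept and the finance pair is dropped. The selected subset would then be passed to the training pipeline of the translation system. A real setup would replace the bag-of-words profile with topic distributions inferred by a topic model.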
American Psychological Association (APA)
Li, Bin & Yao, Jianmin. 2020. Selection of In-Domain Bilingual Sentence Pairs Based on Topic Information. Scientific Programming, Vol. 2020, no. 2020, pp.1-7.
https://search.emarefa.net/detail/BIM-1209294
Modern Language Association (MLA)
Li, Bin & Yao, Jianmin. Selection of In-Domain Bilingual Sentence Pairs Based on Topic Information. Scientific Programming No. 2020 (2020), pp.1-7.
https://search.emarefa.net/detail/BIM-1209294
American Medical Association (AMA)
Li, Bin & Yao, Jianmin. Selection of In-Domain Bilingual Sentence Pairs Based on Topic Information. Scientific Programming. 2020. Vol. 2020, no. 2020, pp.1-7.
https://search.emarefa.net/detail/BIM-1209294
Data Type
Journal Articles
Language
English
Notes
Includes bibliographical references
Record ID
BIM-1209294