Mixed-Level Neural Machine Translation

Joint Authors

Nguyen, Thien
Nguyen, Huu
Tran, Phuoc

Source

Computational Intelligence and Neuroscience

Issue

Vol. 2020, Issue 2020 (31 Dec. 2020), pp.1-7, 7 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2020-11-29

Country of Publication

Egypt

No. of Pages

7

Main Subjects

Biology

Abstract EN

Building the first Russian-Vietnamese neural machine translation system, we faced the problem of choosing a translation unit system on which source and target embeddings are based.

Available homogeneous translation unit systems with the same translation unit on the source and target sides do not perfectly suit the investigated language pair.

To solve the problem, in this paper, we propose a novel heterogeneous translation unit system, considering linguistic characteristics of the synthetic Russian language and the analytic Vietnamese language.

Specifically, we decrease the embedding level on the source side by splitting tokens into subtokens and increase the embedding level on the target side by merging neighboring tokens into supertokens.

The experimental results show that the proposed heterogeneous system improves over the best existing homogeneous Russian-Vietnamese translation system by 1.17 BLEU.

Our approach could be applied to building translation bots for language pairs with different linguistic characteristics.
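
The splitting and merging described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the segmentation algorithms, so the sketch assumes a greedy longest-match split against a hypothetical subword vocabulary on the source side and a greedy longest-match merge against a hypothetical multi-token lexicon on the target side. The function names, the "@@" continuation marker, the "_" joiner, and the toy vocabularies are all illustrative assumptions.

```python
def split_into_subtokens(token, subword_vocab):
    """Split one source token into subtokens (lower embedding level).

    Assumption: greedy longest-match against `subword_vocab`; any character
    not covered by the vocabulary becomes a single-character piece.
    """
    pieces = []
    start = 0
    while start < len(token):
        end = len(token)
        # Shrink the candidate until it is in the vocabulary or one character long.
        while end > start + 1 and token[start:end] not in subword_vocab:
            end -= 1
        pieces.append(token[start:end])
        start = end
    # Mark every non-final piece with "@@" so the split can be undone.
    return [p + "@@" for p in pieces[:-1]] + pieces[-1:]


def merge_into_supertokens(tokens, lexicon, max_len=3):
    """Merge neighboring target tokens into supertokens (higher embedding level).

    Assumption: `lexicon` is a set of token tuples that should be treated as
    one unit; the longest match starting at each position wins.
    """
    merged = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            # A single token is always accepted as a fallback.
            if n == 1 or candidate in lexicon:
                merged.append("_".join(candidate))
                i += n
                break
    return merged


# Toy, hypothetical vocabularies for illustration only.
russian_subwords = {"книг", "ами"}
vietnamese_lexicon = {("sinh", "viên")}

source_units = split_into_subtokens("книгами", russian_subwords)
target_units = merge_into_supertokens(["tôi", "là", "sinh", "viên"], vietnamese_lexicon)
print(source_units)
print(target_units)
```

In this toy setup, the inflected Russian word form is broken into a stem-like piece and an ending, while the two Vietnamese syllables that form one word are joined into a single unit, so the two sides end up with translation units of different granularity.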

American Psychological Association (APA)

Nguyen, Thien & Nguyen, Huu & Tran, Phuoc. 2020. Mixed-Level Neural Machine Translation. Computational Intelligence and Neuroscience, Vol. 2020, no. 2020, pp.1-7.
https://search.emarefa.net/detail/BIM-1138918

Modern Language Association (MLA)

Nguyen, Thien… [et al.]. Mixed-Level Neural Machine Translation. Computational Intelligence and Neuroscience No. 2020 (2020), pp.1-7.
https://search.emarefa.net/detail/BIM-1138918

American Medical Association (AMA)

Nguyen, Thien & Nguyen, Huu & Tran, Phuoc. Mixed-Level Neural Machine Translation. Computational Intelligence and Neuroscience. 2020. Vol. 2020, no. 2020, pp.1-7.
https://search.emarefa.net/detail/BIM-1138918

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1138918