Chinese Named Entity Recognition Based on Character-Word Vector Fusion

Co-authors

Dong, Lili
Zhang, Xiang
Qin, Xin
Ye, Na
Sun, Kangkang

Source

Wireless Communications and Mobile Computing

Issue

Vol. 2020, Issue 2020 (31 December 2020), pp. 1-7, 7 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2020-07-04

Country of Publication

Egypt

Number of Pages

7

Main Subjects

Information Technology and Computer Science

Abstract EN

Because Chinese text has no explicit markers to define word boundaries, named entities are often harder to identify in Chinese than in English.

At present, Chinese named entity recognition models are trained on pretrained character vectors or word vectors.

Taking character vectors as the input of the neural network cannot use the words’ semantic meanings and gives up the words’ explicit boundary information, while taking word vectors as the input relies on the accuracy of the segmentation algorithm. To address these problems, this paper proposes CWVF-BiLSTM-CRF (Character-Word Vector Fusion, Bidirectional Long Short-Term Memory network, Conditional Random Field), a Chinese named entity recognition model based on character-word vector fusion.

First, Word2Vec is used to obtain two dictionaries: one mapping characters to character vectors and one mapping words to word vectors.
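
As an illustration of this first step, the sketch below derives the two token views (characters and words) of one segmented sentence; each view would serve as a separate training corpus for a Word2Vec implementation, whose output acts as the corresponding dictionary. The function name `char_and_word_views` and the example segmentation are assumptions made for illustration; the abstract does not say which segmenter or Word2Vec settings were used.

```python
from typing import List, Tuple


def char_and_word_views(segmented: List[str]) -> Tuple[List[str], List[str]]:
    """Return (characters, words) token sequences for one segmented sentence.

    `segmented` is the output of some word segmenter (hypothetical here).
    """
    words = list(segmented)
    # Flatten every word into its individual characters, keeping order.
    chars = [ch for word in words for ch in word]
    return chars, words


# Hypothetical segmentation of a short phrase, used only as toy input.
segmented_sentence = ["南京市", "长江", "大桥"]
chars, words = char_and_word_views(segmented_sentence)

# The character view would train the character-vector dictionary and the
# word view would train the word-vector dictionary (token -> vector).
print(chars)
print(words)
```

Training on the two views separately keeps both vocabularies independent, so a character can still be looked up even when the segmenter produces a word that was never seen in training.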

Second, the character and word vectors are fused to form the input unit of the BiLSTM (Bidirectional Long Short-Term Memory) network, and a CRF (conditional random field) layer then solves the problem of unreasonable tag sequences.
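
The following is a minimal sketch of the two ideas in this step, not the authors' implementation. It assumes that fusion means concatenating each character's vector with the vector of the word containing it (the abstract does not state the fusion operation), and it uses hand-set transition scores in place of learned CRF parameters to show how sequence-level decoding rules out a tag sequence such as O followed by I-LOC. All vectors, scores, and the tag set are toy values chosen for illustration.

```python
from typing import Dict, List

Vector = List[float]


def fuse(segmented: List[str],
         char_vecs: Dict[str, Vector],
         word_vecs: Dict[str, Vector]) -> List[Vector]:
    """Build one fused vector per character (assumed fusion: concatenation)."""
    fused = []
    for word in segmented:
        for ch in word:
            # List concatenation: character vector followed by word vector.
            fused.append(char_vecs[ch] + word_vecs[word])
    return fused


def viterbi(emissions: List[List[float]],
            transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]: score of tag j at position t (e.g. from a BiLSTM).
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    # Walk the backpointers from the last position to the first.
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    return path[::-1]


# Toy dictionaries (2-dimensional vectors, values are arbitrary).
char_vecs = {"长": [0.1, 0.2], "江": [0.3, 0.4]}
word_vecs = {"长江": [1.0, 0.0]}
fused = fuse(["长江"], char_vecs, word_vecs)

# Toy tag set and scores; -1e4 marks the illegal transition O -> I-LOC.
tags = ["O", "B-LOC", "I-LOC"]
transitions = [[0.0, 0.0, -1e4],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
emissions = [[1.0, 0.8, 0.1],
             [0.2, 0.1, 1.0],
             [0.1, 0.1, 1.0]]

greedy = [tags[max(range(3), key=lambda j: row[j])] for row in emissions]
decoded = [tags[i] for i in viterbi(emissions, transitions)]
print(greedy)   # per-position argmax, ignores transitions
print(decoded)  # sequence-level decoding with transition scores
```

In this toy case the per-position argmax yields O, I-LOC, I-LOC, which is not a valid tag sequence, while decoding with transition scores yields B-LOC, I-LOC, I-LOC.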

The presented model reduces the dependence on the accuracy of the word segmentation algorithm and makes effective use of the words’ semantic characteristics.

The experimental results show that the model based on character-word vector fusion improves the recognition of Chinese named entities.

American Psychological Association (APA) Citation Style

Ye, Na & Qin, Xin & Dong, Lili & Zhang, Xiang & Sun, Kangkang. 2020. Chinese Named Entity Recognition Based on Character-Word Vector Fusion. Wireless Communications and Mobile Computing, Vol. 2020, no. 2020, pp. 1-7.
https://search.emarefa.net/detail/BIM-1214799

Modern Language Association (MLA) Citation Style

Ye, Na…[et al.]. Chinese Named Entity Recognition Based on Character-Word Vector Fusion. Wireless Communications and Mobile Computing No. 2020 (2020), pp.1-7.
https://search.emarefa.net/detail/BIM-1214799

American Medical Association (AMA) Citation Style

Ye, Na & Qin, Xin & Dong, Lili & Zhang, Xiang & Sun, Kangkang. Chinese Named Entity Recognition Based on Character-Word Vector Fusion. Wireless Communications and Mobile Computing. 2020. Vol. 2020, no. 2020, pp. 1-7.
https://search.emarefa.net/detail/BIM-1214799

Data Type

Articles

Text Language

English

Notes

Includes bibliographical references

Record ID

BIM-1214799