Chinese Named Entity Recognition Based on Character-Word Vector Fusion

Joint Authors

Dong, Lili
Zhang, Xiang
Qin, Xin
Ye, Na
Sun, Kangkang

Source

Wireless Communications and Mobile Computing

Issue

Vol. 2020, Issue 2020 (31 Dec. 2020), pp.1-7, 7 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2020-07-04

Country of Publication

Egypt

No. of Pages

7

Main Subjects

Information Technology and Computer Science

Abstract EN

Due to the lack of explicit markers in Chinese text to define the boundaries of words, it is often more difficult to identify named entities in Chinese than in English.

At present, Chinese named entity recognition models are usually trained on pretrained character vectors or pretrained word vectors.

Both choices have drawbacks: using character vectors as the input of the neural network discards the words' semantic meanings and their explicit boundary information, while using word vectors as the input depends on the accuracy of the segmentation algorithm. To address these problems, this paper proposes CWVF-BiLSTM-CRF (Character Word Vector Fusion-Bidirectional Long-Short Term Memory Networks-Conditional Random Field), a Chinese named entity recognition model based on character-word vector fusion.

First, Word2Vec is used to build two lookup dictionaries: one that maps each character to its character vector and one that maps each word to its word vector.

Second, the character vectors and word vectors are fused to form the input unit of the BiLSTM (Bidirectional Long-Short Term Memory) network, and a CRF (conditional random field) layer then resolves the problem of unreasonable tag sequences.
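
The abstract does not state how the two vectors are fused, so the following is only a minimal sketch under one common assumption: each character's vector is concatenated with the vector of the segmented word that contains it. The function name `fuse_char_word_vectors`, the toy lookup tables, and the zero-vector fallback for out-of-vocabulary entries are all hypothetical and stand in for real Word2Vec output.

```python
import numpy as np

def fuse_char_word_vectors(words, char_vecs, word_vecs, char_dim, word_dim):
    """Return one fused vector per character of a segmented sentence.

    Assumption: fusion is concatenation of the character vector with the
    vector of the word containing that character. The paper may differ.
    """
    fused = []
    for word in words:
        # Out-of-vocabulary entries fall back to zero vectors (an assumption).
        w_vec = word_vecs.get(word, np.zeros(word_dim))
        for ch in word:
            c_vec = char_vecs.get(ch, np.zeros(char_dim))
            fused.append(np.concatenate([c_vec, w_vec]))
    if not fused:
        return np.zeros((0, char_dim + word_dim))
    return np.stack(fused)

# Toy lookup tables standing in for Word2Vec output (values are made up).
char_vecs = {
    "北": np.array([1.0, 0.0]),
    "京": np.array([0.0, 1.0]),
    "市": np.array([1.0, 1.0]),
}
word_vecs = {
    "北京": np.array([0.5, 0.5, 0.5]),
    "市": np.array([0.1, 0.2, 0.3]),
}

# The sentence is already segmented into words: 北京 / 市.
fused = fuse_char_word_vectors(["北京", "市"], char_vecs, word_vecs,
                               char_dim=2, word_dim=3)
print(fused.shape)  # (3, 5): three characters, 2 + 3 dimensions each
```

The resulting matrix has one row per character, so it could be fed to a sequence model such as a BiLSTM that tags each character, while each row still carries word-level information.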

The proposed model reduces the dependence on the accuracy of the word segmentation algorithm and makes effective use of the words' semantic characteristics.

The experimental results show that the model based on character-word vector fusion improves the performance of Chinese named entity recognition.
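
To illustrate why a CRF layer helps with unreasonable tag sequences, the sketch below runs standard Viterbi decoding over per-character tag scores. It is not the authors' implementation: the tag set, the emission scores, and the hand-set transition penalty are made-up assumptions, whereas a trained CRF would learn its transition scores from data.

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag path.

    emissions:   (T, K) array, score of each of K tags at each of T positions.
    transitions: (K, K) array, score of moving from tag i to tag j.
    """
    T, K = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # candidate[i, j]: best path ending in tag i at t-1, then tag j at t.
        candidate = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0)
    # Follow the back-pointers from the best final tag.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]

TAGS = ["O", "B-LOC", "I-LOC"]  # hypothetical tag set

# Hand-set transition scores (assumption): forbid O -> I-LOC.
transitions = np.zeros((3, 3))
transitions[0, 2] = -1e4

# Made-up per-character scores, as a BiLSTM might produce.
emissions = np.array([
    [2.0, 0.5, 0.0],
    [0.0, 1.0, 1.5],
    [0.0, 0.0, 2.0],
])

greedy = [TAGS[i] for i in emissions.argmax(axis=1)]
decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(greedy)   # ['O', 'I-LOC', 'I-LOC']  (I-LOC directly after O is invalid)
print(decoded)  # ['O', 'B-LOC', 'I-LOC']
```

Picking the best tag independently at each position yields an entity that starts with an inside tag, while decoding with transition scores selects the best sequence that respects the constraint.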

American Psychological Association (APA)

Ye, Na & Qin, Xin & Dong, Lili & Zhang, Xiang & Sun, Kangkang. 2020. Chinese Named Entity Recognition Based on Character-Word Vector Fusion. Wireless Communications and Mobile Computing, Vol. 2020, no. 2020, pp.1-7.
https://search.emarefa.net/detail/BIM-1214799

Modern Language Association (MLA)

Ye, Na…[et al.]. Chinese Named Entity Recognition Based on Character-Word Vector Fusion. Wireless Communications and Mobile Computing No. 2020 (2020), pp.1-7.
https://search.emarefa.net/detail/BIM-1214799

American Medical Association (AMA)

Ye, Na & Qin, Xin & Dong, Lili & Zhang, Xiang & Sun, Kangkang. Chinese Named Entity Recognition Based on Character-Word Vector Fusion. Wireless Communications and Mobile Computing. 2020. Vol. 2020, no. 2020, pp.1-7.
https://search.emarefa.net/detail/BIM-1214799

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1214799