Distributional Similarity for Chinese: Exploiting Characters and Radicals

Joint Authors

Jin, Peng
Carroll, John
Wu, Yunfang
McCarthy, Diana

Source

Mathematical Problems in Engineering

Issue

Vol. 2012, Issue 2012 (31 Dec. 2012), pp.1-11, 11 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2012-08-15

Country of Publication

Egypt

No. of Pages

11

Main Subjects

Civil Engineering

Abstract EN

Distributional similarity has attracted considerable attention in the field of natural language processing as an automatic means of countering the ubiquitous problem of sparse data.

Chinese is a logographic language: its words consist of characters, and each character is composed of one or more radicals.

The meanings of characters are usually closely related to the meanings of the words that contain them.

Likewise, radicals often make a predictable contribution to the meaning of a character: characters that have the same components tend to have similar or related meanings.

In this paper, we utilize these properties of the Chinese language to improve Chinese word similarity computation.

Given a content word, we first extract similar words from a large corpus and rank them by a similarity score.

This rank is then adjusted according to the characters and components shared between the similar word and the target word.
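
The abstract does not give the adjustment formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a multiplicative bonus for each shared character and each shared radical; the function names, the weights `char_bonus` and `radical_bonus`, the toy radical table, and the similarity scores are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical, deliberately incomplete radical table for a few characters.
RADICALS: Dict[str, Set[str]] = {
    "河": {"氵", "可"},
    "流": {"氵"},
    "江": {"氵", "工"},
    "水": {"水"},
    "道": {"辶", "首"},
    "路": {"足", "各"},
}


def radicals_of(word: str, table: Dict[str, Set[str]]) -> Set[str]:
    """Collect the radicals of every character in a word."""
    return set().union(*(table.get(ch, set()) for ch in word))


def rerank(
    target: str,
    neighbours: List[Tuple[str, float]],
    table: Dict[str, Set[str]] = RADICALS,
    char_bonus: float = 0.5,      # assumed weight, not from the paper
    radical_bonus: float = 0.1,   # assumed weight, not from the paper
) -> List[Tuple[str, float]]:
    """Boost each neighbour's score by its character and radical overlap
    with the target word, then sort by the adjusted score."""
    target_chars = set(target)
    target_radicals = radicals_of(target, table)
    adjusted = []
    for word, score in neighbours:
        shared_chars = len(target_chars & set(word))
        shared_radicals = len(target_radicals & radicals_of(word, table))
        boost = 1.0 + char_bonus * shared_chars + radical_bonus * shared_radicals
        adjusted.append((word, score * boost))
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)


# Made-up distributional scores for neighbours of 河流 ("river").
neighbours = [("道路", 0.30), ("江水", 0.28), ("河水", 0.27)]
reranked = rerank("河流", neighbours)
```

In this toy input, 河水 shares a character with the target and 江水 shares only a radical, so both move above 道路 even though its original score was highest.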

Experiments on two gold standard datasets show that the adjusted rank is superior and closer to human judgments than the original rank.

In addition to the quantitative evaluation, we examine the reasons behind errors, drawing on linguistic phenomena for our explanations.

American Psychological Association (APA)

Jin, Peng & Carroll, John & Wu, Yunfang & McCarthy, Diana. 2012. Distributional Similarity for Chinese: Exploiting Characters and Radicals. Mathematical Problems in Engineering, Vol. 2012, no. 2012, pp.1-11.
https://search.emarefa.net/detail/BIM-1001526

Modern Language Association (MLA)

Jin, Peng…[et al.]. Distributional Similarity for Chinese: Exploiting Characters and Radicals. Mathematical Problems in Engineering No. 2012 (2012), pp.1-11.
https://search.emarefa.net/detail/BIM-1001526

American Medical Association (AMA)

Jin, Peng & Carroll, John & Wu, Yunfang & McCarthy, Diana. Distributional Similarity for Chinese: Exploiting Characters and Radicals. Mathematical Problems in Engineering. 2012. Vol. 2012, no. 2012, pp.1-11.
https://search.emarefa.net/detail/BIM-1001526

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1001526