Simple-Random-Sampling-Based Multiclass Text Classification Algorithm

Joint Authors

Liu, Wuying
Wang, Lin
Yi, Mianzhu

Source

The Scientific World Journal

Issue

Vol. 2014, Issue 2014 (31 Dec. 2014), pp.1-7, 7 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2014-03-19

Country of Publication

Egypt

No. of Pages

7

Main Subjects

Medicine
Information Technology and Computer Science

Abstract EN

Multiclass text classification (MTC) is a challenging issue and the corresponding MTC algorithms can be used in many applications.

The space-time overhead of these algorithms is a major concern in the era of big data.

Through the investigation of the token frequency distribution in a Chinese web document collection, this paper reexamines the power law and proposes a simple-random-sampling-based MTC (SRSMTC) algorithm.

Supported by a token-level memory that stores labeled documents, the SRSMTC algorithm uses a text retrieval approach to solve text classification problems.

The experimental results on the TanCorp data set show that the SRSMTC algorithm can achieve state-of-the-art performance with greatly reduced space-time requirements.
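
The abstract gives no implementation details, so the following Python sketch is only an illustration of the general idea it describes: a token-level memory filled by simple random sampling and queried in a retrieval-like way. The class name `SampledTokenMemory`, the `sample_rate` parameter, and the vote-counting rule are all assumptions made for this example and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


class SampledTokenMemory:
    """Hypothetical token-level memory: maps token -> Counter of class labels."""

    def __init__(self, sample_rate=0.5, seed=0):
        self.sample_rate = sample_rate      # assumed fraction of distinct tokens kept
        self.rng = random.Random(seed)      # seeded for reproducible sampling
        self.index = defaultdict(Counter)

    def _sample(self, tokens):
        # Simple random sampling without replacement over a document's distinct tokens.
        distinct = sorted(set(tokens))
        if not distinct:
            return []
        k = min(len(distinct), max(1, round(len(distinct) * self.sample_rate)))
        return self.rng.sample(distinct, k)

    def add(self, tokens, label):
        # Store only the sampled tokens of a labeled document.
        for tok in self._sample(tokens):
            self.index[tok][label] += 1

    def classify(self, tokens):
        # Retrieval-style lookup: each query token votes for the labels it was stored with.
        votes = Counter()
        for tok in set(tokens):
            votes.update(self.index.get(tok, {}))
        return votes.most_common(1)[0][0] if votes else None


memory = SampledTokenMemory(sample_rate=1.0)  # keep every token in this toy run
memory.add(["stock", "market", "bank"], "finance")
memory.add(["match", "goal", "team"], "sports")
print(memory.classify(["goal", "team", "win"]))
```

Lowering `sample_rate` shrinks the index, which is the kind of space saving the abstract refers to; how much accuracy survives at a given rate is an empirical question that this sketch does not answer.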

American Psychological Association (APA)

Liu, Wuying & Wang, Lin & Yi, Mianzhu. 2014. Simple-Random-Sampling-Based Multiclass Text Classification Algorithm. The Scientific World Journal, Vol. 2014, no. 2014, pp.1-7.
https://search.emarefa.net/detail/BIM-1049939

Modern Language Association (MLA)

Liu, Wuying…[et al.]. Simple-Random-Sampling-Based Multiclass Text Classification Algorithm. The Scientific World Journal No. 2014 (2014), pp.1-7.
https://search.emarefa.net/detail/BIM-1049939

American Medical Association (AMA)

Liu, Wuying & Wang, Lin & Yi, Mianzhu. Simple-Random-Sampling-Based Multiclass Text Classification Algorithm. The Scientific World Journal. 2014. Vol. 2014, no. 2014, pp.1-7.
https://search.emarefa.net/detail/BIM-1049939

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1049939