Distance Variance Score: An Efficient Feature Selection Method in Text Classification
Joint Authors
Source
Mathematical Problems in Engineering
Issue
Vol. 2015, Issue 2015 (31 Dec. 2015), pp.1-10, 10 p.
Publisher
Hindawi Publishing Corporation
Publication Date
2015-05-11
Country of Publication
Egypt
No. of Pages
10
Main Subjects
Abstract EN
With the rapid development of web applications such as social networks, a large amount of electronic text data has accumulated and become available on the Internet, which has increased interest in text mining.
Text classification is one of the most important subfields of text mining.
Text documents are often represented as a high-dimensional, sparse document-term matrix (DTM) before classification.
Feature selection is essential for text classification because of the high dimensionality and sparsity of the DTM.
An efficient feature selection method is capable of both reducing dimensions of DTM and selecting discriminative features for text classification.
Laplacian Score (LS) is one of the unsupervised feature selection methods and it has been successfully used in areas such as face recognition.
However, LS is unable to select discriminative features for text classification or to effectively reduce the sparsity of the DTM.
To improve it, this paper proposes an unsupervised feature selection method named Distance Variance Score (DVS).
DVS uses feature distance contribution (a ratio) to rank the importance of features for text documents so as to select discriminative features.
Experimental results indicate that DVS is able to select discriminative features and reduce the sparsity of DTM.
Thus, it is much more efficient than LS.
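As an illustration of why the abstract calls the DTM high-dimensional and sparse, the following is a minimal sketch that builds a small document-term matrix and measures the fraction of zero entries. It is not taken from the paper: the helper names `build_dtm` and `sparsity`, the whitespace tokenizer, and the three sample documents are all assumptions made for this example, and it does not implement DVS or LS.

```python
from collections import Counter


def build_dtm(docs):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i.

    Hypothetical helper: tokenizes by lowercasing and splitting on whitespace.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: j for j, term in enumerate(vocab)}
    matrix = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(toks).items():
            row[index[term]] = count
        matrix.append(row)
    return vocab, matrix


def sparsity(matrix):
    """Fraction of zero entries in the matrix (0.0 for an empty matrix)."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total if total else 0.0


# Made-up sample documents, for illustration only.
docs = [
    "text mining finds patterns",
    "feature selection reduces dimensions",
    "text classification needs features",
]
vocab, dtm = build_dtm(docs)
print(len(vocab), "terms;", f"sparsity = {sparsity(dtm):.2f}")
```

Even with three short documents, most cells are zero because each document uses only a few of the vocabulary terms. With a realistic corpus the vocabulary grows to many thousands of terms, which is the situation feature selection is meant to address.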
American Psychological Association (APA)
Wang, Heyong & Hong, Ming. 2015. Distance Variance Score: An Efficient Feature Selection Method in Text Classification. Mathematical Problems in Engineering, Vol. 2015, no. 2015, pp.1-10.
https://search.emarefa.net/detail/BIM-1074488
Modern Language Association (MLA)
Wang, Heyong & Hong, Ming. Distance Variance Score: An Efficient Feature Selection Method in Text Classification. Mathematical Problems in Engineering No. 2015 (2015), pp.1-10.
https://search.emarefa.net/detail/BIM-1074488
American Medical Association (AMA)
Wang, Heyong & Hong, Ming. Distance Variance Score: An Efficient Feature Selection Method in Text Classification. Mathematical Problems in Engineering. 2015. Vol. 2015, no. 2015, pp.1-10.
https://search.emarefa.net/detail/BIM-1074488
Data Type
Journal Articles
Language
English
Notes
Includes bibliographical references
Record ID
BIM-1074488