A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data

Joint Authors

Yang, Sheng
Guo, Li
Shao, Fang
Zhao, Yang
Chen, Feng

Source

Computational and Mathematical Methods in Medicine

Issue

Vol. 2015, Issue 2015 (31 Dec. 2015), pp. 1-11, 11 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2015-10-05

Country of Publication

Egypt

No. of Pages

11

Main Subjects

Medicine

Abstract EN

Sequencing is widely used to discover associations between microRNAs (miRNAs) and diseases.

However, the negative binomial distribution (NB) and high dimensionality of data obtained using sequencing can lead to low-power results and low reproducibility.

Several statistical learning algorithms have been proposed to address sequencing data, and although evaluation of these methods is essential, such studies are relatively rare.

The performance of seven feature selection (FS) algorithms, including baySeq, DESeq, edgeR, the rank sum test, lasso, particle swarm optimization decision tree, and random forest (RF), was compared by simulation under different conditions based on the difference of the mean, the dispersion parameter of the NB, and the signal-to-noise ratio.
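As a rough illustration of what such a simulation setting can look like, the sketch below draws NB counts for two groups in which a subset of features differs in mean. It is not the authors' simulation code. The parameterization (variance = mean + dispersion * mean^2), the gamma-Poisson sampling route, and all names and values (`simulate`, `base_mean`, `fold_change`, `n_signal`) are assumptions made for this example.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    count = 0
    # Split large rates into chunks so that exp(-lam) does not underflow.
    while lam > 500.0:
        count += poisson(500.0, rng)
        lam -= 500.0
    threshold = math.exp(-lam)
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def nb_count(mean, dispersion, rng):
    """Draw one NB count with variance = mean + dispersion * mean**2.

    Uses the gamma-Poisson mixture: the Poisson rate is gamma distributed
    with shape 1/dispersion and scale mean*dispersion (so its mean is `mean`).
    """
    rate = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return poisson(rate, rng)


def simulate(n_features, n_signal, n_per_group, base_mean, fold_change,
             dispersion, seed=0):
    """Simulate control and case count matrices (lists of feature rows).

    The first `n_signal` features have their case-group mean multiplied by
    `fold_change`; the remaining features have the same mean in both groups.
    """
    rng = random.Random(seed)
    controls, cases = [], []
    for j in range(n_features):
        case_mean = base_mean * fold_change if j < n_signal else base_mean
        controls.append([nb_count(base_mean, dispersion, rng)
                         for _ in range(n_per_group)])
        cases.append([nb_count(case_mean, dispersion, rng)
                      for _ in range(n_per_group)])
    return controls, cases


# Hypothetical setting: 20 features, 5 of them truly different between groups.
controls, cases = simulate(n_features=20, n_signal=5, n_per_group=200,
                           base_mean=50.0, fold_change=3.0, dispersion=0.2,
                           seed=1)
```

Varying `fold_change`, `dispersion`, and the share of signal features (`n_signal / n_features`) gives one simple way to vary conditions of the kind the abstract lists.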

Real data were used to evaluate the performance of RF, logistic regression, and support vector machine.

Based on the simulation and real data, we discuss the behaviour of the FS and classification algorithms.

The Apriori algorithm identified frequent item sets (mir-133a, mir-133b, mir-183, mir-937, and mir-96) from among the deregulated miRNAs of six datasets from The Cancer Genome Atlas.
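For readers unfamiliar with Apriori, a minimal sketch of frequent item set mining follows. It treats each dataset as one "transaction" whose items are its deregulated miRNAs. The function name `apriori`, the support threshold, and the placeholder item names (`mir-A` to `mir-D`) are invented for illustration and do not reproduce the paper's data or results.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(items): count} for every item set that occurs
    in at least `min_support` transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: unions of two frequent (k-1)-sets that have size k.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count support of the surviving candidates.
        current = {}
        for c in candidates:
            n = sum(1 for t in transactions if c <= t)
            if n >= min_support:
                current[c] = n
        frequent.update(current)
        k += 1
    return frequent


# Placeholder "datasets" with made-up item names.
datasets = [
    {"mir-A", "mir-B", "mir-C"},
    {"mir-A", "mir-B"},
    {"mir-A", "mir-C"},
    {"mir-B", "mir-C", "mir-D"},
]
result = apriori(datasets, min_support=2)
```

With these placeholder inputs, the three single items `mir-A`, `mir-B`, `mir-C` and each pair of them reach the threshold, while `mir-D` and the full triple do not.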

Taking these findings together and considering computational memory requirements, we propose a strategy that combines edgeR and DESeq for large sample sizes.

American Psychological Association (APA)

Yang, Sheng & Guo, Li & Shao, Fang & Zhao, Yang & Chen, Feng. 2015. A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data. Computational and Mathematical Methods in Medicine, Vol. 2015, no. 2015, pp. 1-11.
https://search.emarefa.net/detail/BIM-1057823

Modern Language Association (MLA)

Yang, Sheng, et al. A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data. Computational and Mathematical Methods in Medicine No. 2015 (2015), pp. 1-11.
https://search.emarefa.net/detail/BIM-1057823

American Medical Association (AMA)

Yang, Sheng & Guo, Li & Shao, Fang & Zhao, Yang & Chen, Feng. A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data. Computational and Mathematical Methods in Medicine. 2015. Vol. 2015, no. 2015, pp. 1-11.
https://search.emarefa.net/detail/BIM-1057823

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1057823