Metagenome Fragment Classification Using N-Mer Frequency Profiles

Co-authors

Rosen, Gail L.
Garbarine, Elaine
Polikar, Robi
Caseiro, Diamantino
Sokhansanj, Bahrad

Source

Advances in Bioinformatics

Issue

Vol. 2008, no. 2008 (31 December 2008), pp. 1-12, 12 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2008-11-16

Country of Publication

Egypt

Number of Pages

12

Main Subjects

Natural and Life Sciences (Multidisciplinary)
Biology

Abstract (EN)

A vast amount of microbial sequencing data is being generated through large-scale projects in ecology, agriculture, and human health.

Efficient high-throughput methods are needed to analyze the massive amounts of metagenomic data, that is, all DNA present in an environmental sample.

A major obstacle in metagenomics is the inability to obtain accuracy using technology that yields short reads.

We construct the unique N-mer frequency profiles of 635 microbial genomes publicly available as of February 2008.

These profiles are used to train a naive Bayes classifier (NBC) that can be used to identify the genome of any fragment.

We show that our method is comparable to BLAST for small 25 bp fragments but does not have the ambiguity of BLAST's tied top scores.

We demonstrate that this approach is scalable to identify any fragment from hundreds of genomes.

It also performs quite well at the strain, species, and genera levels and achieves strain resolution despite classifying ubiquitous genomic fragments (gene and nongene regions).

Cross-validation analysis demonstrates that species-accuracy achieves 90% for highly-represented species containing an average of 8 strains.

We demonstrate that such a tool can be used on the Sargasso Sea dataset, and our analysis shows that NBC can be further enhanced.
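The two steps named in the abstract, building N-mer frequency profiles and scoring a fragment with a naive Bayes classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the add-one smoothing, the uniform prior over genomes, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = set("ACGT")


def nmer_counts(seq, n):
    """Count overlapping N-mers, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    nmers = (seq[i:i + n] for i in range(len(seq) - n + 1))
    return Counter(k for k in nmers if set(k) <= ALPHABET)


def train_profiles(genomes, n):
    """Turn each genome into smoothed log-probabilities over its N-mers.

    Add-one (Laplace) smoothing is an assumption of this sketch; it keeps
    N-mers that never occur in a genome from having zero probability.
    """
    vocab = 4 ** n
    profiles = {}
    for name, seq in genomes.items():
        counts = nmer_counts(seq, n)
        total = sum(counts.values())
        log_probs = {k: math.log((c + 1) / (total + vocab))
                     for k, c in counts.items()}
        unseen = math.log(1 / (total + vocab))
        profiles[name] = (log_probs, unseen)
    return profiles


def classify(fragment, profiles, n):
    """Return the best-scoring genome and all log-likelihood scores.

    A uniform prior over genomes is assumed, so the prior term is omitted.
    """
    frag_counts = nmer_counts(fragment, n)
    scores = {}
    for name, (log_probs, unseen) in profiles.items():
        scores[name] = sum(c * log_probs.get(k, unseen)
                           for k, c in frag_counts.items())
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy "genomes" for illustration only; real profiles come from full genomes.
    toy = {"genome_a": "ACGT" * 50, "genome_b": "GGCC" * 50}
    profiles = train_profiles(toy, n=3)
    best, _ = classify("ACGTACGTACGT", profiles, n=3)
    print(best)
```

Working in log space avoids numerical underflow when many N-mer probabilities are multiplied, which matters even for short fragments once N grows.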

American Psychological Association (APA) Citation Style

Rosen, Gail L. & Garbarine, Elaine & Caseiro, Diamantino & Polikar, Robi & Sokhansanj, Bahrad. 2008. Metagenome Fragment Classification Using N-Mer Frequency Profiles. Advances in Bioinformatics, Vol. 2008, no. 2008, pp. 1-12.
https://search.emarefa.net/detail/BIM-454393

Modern Language Association (MLA) Citation Style

Rosen, Gail L.…[et al.]. Metagenome Fragment Classification Using N-Mer Frequency Profiles. Advances in Bioinformatics No. 2008 (2008), pp. 1-12.
https://search.emarefa.net/detail/BIM-454393

American Medical Association (AMA) Citation Style

Rosen, Gail L. & Garbarine, Elaine & Caseiro, Diamantino & Polikar, Robi & Sokhansanj, Bahrad. Metagenome Fragment Classification Using N-Mer Frequency Profiles. Advances in Bioinformatics. 2008. Vol. 2008, no. 2008, pp. 1-12.
https://search.emarefa.net/detail/BIM-454393

Data Type

Journal Articles

Text Language

English

Notes

Includes bibliographical references

Record ID

BIM-454393