Conference Paper: MetaCluster: unsupervised binning of environmental genomic fragments and taxonomic annotation
Title: MetaCluster: unsupervised binning of environmental genomic fragments and taxonomic annotation

Authors (bracketed numbers refer to the Author Affiliations list):
Yang, B [3, 2]
Peng, Y [2]
Leung, HCM [2]
Yiu, SM [2]
Qin, J [1]
Li, R [1]
Chin, FYL [2]

Keywords: Algorithms, Experimentation, Measurement, Performance, Reliability

Issue Date: 2010

Publisher: Association for Computing Machinery.

Citation: The 1st ACM International Conference on Bioinformatics and Computational Biology (ACM-BCB 2010), Niagara Falls, N.Y., 2-4 August 2010.
 
Abstract: Limited by laboratory techniques, traditional microorganism research usually focuses on a single species. This significantly limits in-depth analysis of the intricate biological processes within complex microorganism communities. With the rapid development of genome sequencing techniques, traditional microorganism research methods based on isolation and cultivation are gradually being replaced by metagenomics, also known as environmental genomics. The first step of metagenomic data analysis, which is also its major bottleneck, is the identification and taxonomic characterization of the DNA fragments (reads) that result from sequencing a sample of mixed species. This step is usually referred to as “binning”. Existing binning methods based on sequence similarity and sequence composition markers rely heavily on the reference genomes of known microorganisms and on phylogenetic markers. Because reference genomes are of limited availability and markers are biased and unstable, these methods may not be applicable in all cases. Few unsupervised binning methods have been reported, and their unsupervised nature makes it extremely difficult to annotate the resulting clusters with taxonomic labels. In this paper, we present MetaCluster 2.0, an unsupervised binning method that can bin metagenomic sequencing datasets with high accuracy, identify unknown genomes, and annotate them with proper taxonomic labels. MetaCluster 2.0 runs at least 30 times faster than existing binning algorithms.
 
Description: Proceedings of the 1st ACM International Conference on Bioinformatics and Computational Biology, 2010, p. 170-179

ISBN: 978-1-4503-0438-2

DOI: http://dx.doi.org/10.1145/1854776.1854803
 
DC Field: Value
dc.contributor.author: Yang, B
dc.contributor.author: Peng, Y
dc.contributor.author: Leung, HCM
dc.contributor.author: Yiu, SM
dc.contributor.author: Qin, J
dc.contributor.author: Li, R
dc.contributor.author: Chin, FYL
dc.date.accessioned: 2010-12-23T08:39:29Z
dc.date.available: 2010-12-23T08:39:29Z
dc.date.issued: 2010
 
dc.description.nature: postprint
dc.description: Proceedings of the 1st ACM International Conference on Bioinformatics and Computational Biology, 2010, p. 170-179
dc.identifier.citation: The 1st ACM International Conference on Bioinformatics and Computational Biology (ACM-BCB 2010), Niagara Falls, N.Y., 2-4 August 2010.
dc.identifier.citeulike: 8820341
dc.identifier.doi: http://dx.doi.org/10.1145/1854776.1854803
dc.identifier.epage: 179
dc.identifier.hkuros: 177374
dc.identifier.isbn: 978-1-4503-0438-2
dc.identifier.scopus: eid_2-s2.0-77958056824
dc.identifier.spage: 170
dc.identifier.uri: http://hdl.handle.net/10722/129584
dc.language: eng
dc.publisher: Association for Computing Machinery.
dc.relation.ispartof: International Conference on Bioinformatics and Computational Biology
dc.rights: Proceedings of the 1st ACM International Conference on Bioinformatics and Computational Biology. Copyright © Association for Computing Machinery.
dc.subject: Algorithms
dc.subject: Experimentation
dc.subject: Measurement
dc.subject: Performance
dc.subject: Reliability
dc.title: MetaCluster: unsupervised binning of environmental genomic fragments and taxonomic annotation
dc.type: Conference_Paper
 
<?xml version="1.0" encoding="utf-8"?>
<item><contributor.author>Yang, B</contributor.author>
<contributor.author>Peng, Y</contributor.author>
<contributor.author>Leung, HCM</contributor.author>
<contributor.author>Yiu, SM</contributor.author>
<contributor.author>Qin, J</contributor.author>
<contributor.author>Li, R</contributor.author>
<contributor.author>Chin, FYL</contributor.author>
<date.accessioned>2010-12-23T08:39:29Z</date.accessioned>
<date.available>2010-12-23T08:39:29Z</date.available>
<date.issued>2010</date.issued>
<identifier.citation>The 1st ACM International Conference on Bioinformatics and Computational Biology (ACM-BCB 2010), Niagara Falls, N.Y., 2-4 August 2010.</identifier.citation>
<identifier.isbn>978-1-4503-0438-2</identifier.isbn>
<identifier.uri>http://hdl.handle.net/10722/129584</identifier.uri>
<description>Proceedings of the 1st ACM International Conference on Bioinformatics and Computational Biology, 2010, p. 170-179</description>
<description.abstract>Limited by laboratory techniques, traditional microorganism research usually focuses on a single species. This significantly limits in-depth analysis of the intricate biological processes within complex microorganism communities. With the rapid development of genome sequencing techniques, traditional microorganism research methods based on isolation and cultivation are gradually being replaced by metagenomics, also known as environmental genomics. The first step of metagenomic data analysis, which is also its major bottleneck, is the identification and taxonomic characterization of the DNA fragments (reads) that result from sequencing a sample of mixed species. This step is usually referred to as &#8220;binning&#8221;. Existing binning methods based on sequence similarity and sequence composition markers rely heavily on the reference genomes of known microorganisms and on phylogenetic markers. Because reference genomes are of limited availability and markers are biased and unstable, these methods may not be applicable in all cases. Few unsupervised binning methods have been reported, and their unsupervised nature makes it extremely difficult to annotate the resulting clusters with taxonomic labels. In this paper, we present MetaCluster 2.0, an unsupervised binning method that can bin metagenomic sequencing datasets with high accuracy, identify unknown genomes, and annotate them with proper taxonomic labels. MetaCluster 2.0 runs at least 30 times faster than existing binning algorithms.</description.abstract>
<language>eng</language>
<publisher>Association for Computing Machinery.</publisher>
<relation.ispartof>International Conference on Bioinformatics and Computational Biology</relation.ispartof>
<rights>Proceedings of the 1st ACM International Conference on Bioinformatics and Computational Biology. Copyright &#169; Association for Computing Machinery.</rights>
<subject>Algorithms</subject>
<subject>Experimentation</subject>
<subject>Measurement</subject>
<subject>Performance</subject>
<subject>Reliability</subject>
<title>MetaCluster: unsupervised binning of environmental genomic fragments and taxonomic annotation</title>
<type>Conference_Paper</type>
<description.nature>postprint</description.nature>
<identifier.doi>10.1145/1854776.1854803</identifier.doi>
<identifier.scopus>eid_2-s2.0-77958056824</identifier.scopus>
<identifier.hkuros>177374</identifier.hkuros>
<identifier.spage>170</identifier.spage>
<identifier.epage>179</identifier.epage>
<identifier.citeulike>8820341</identifier.citeulike>
<bitstream.url>http://hub.hku.hk/bitstream/10722/129584/3/Content.pdf</bitstream.url>
</item>
Author Affiliations
  1. BGI-Shenzhen
  2. The University of Hong Kong
  3. Southeast University