Article: A robust and accurate binning algorithm for metagenomic sequences with arbitrary species abundance ratio
Title: A robust and accurate binning algorithm for metagenomic sequences with arbitrary species abundance ratio
 
Authors:
Leung, HCM [1]
Yiu, SM [1]
Yang, B [1]
Peng, Y [1]
Wang, Y [1]
Liu, Z [1]
Chen, J [1]
Qin, J [1]
Li, R
Chin, FYL [1]
 
Issue Date: 2011
 
Publisher: Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/
 
Citation: Bioinformatics, 2011, v. 27 n. 11, p. 1489-1495
DOI: http://dx.doi.org/10.1093/bioinformatics/btr186
 
Abstract:
Motivation: With the rapid development of next-generation sequencing techniques, metagenomics, also known as environmental genomics, has emerged as an exciting research area that enables us to analyze the microbial environment in which we live. An important step for metagenomic data analysis is the identification and taxonomic characterization of DNA fragments (reads or contigs) resulting from sequencing a sample of mixed species. This step is referred to as 'binning'. Binning algorithms that are based on sequence similarity and sequence composition markers rely heavily on the reference genomes of known microorganisms or phylogenetic markers. Due to the limited availability of reference genomes and the bias and low availability of markers, these algorithms may not be applicable in all cases. Unsupervised binning algorithms which can handle fragments from unknown species provide an alternative approach. However, existing unsupervised binning algorithms only work on datasets either with balanced species abundance ratios or rather different abundance ratios, but not both.
Results: In this article, we present MetaCluster 3.0, an integrated binning method based on the unsupervised top-down separation and bottom-up merging strategy, which can bin metagenomic fragments of species with very balanced abundance ratios (say 1:1) to very different abundance ratios (e.g. 1:24) with consistently higher accuracy than existing methods.
© The Author 2011. Published by Oxford University Press. All rights reserved.
 
ISSN: 1367-4803
2012 Impact Factor: 5.323
2012 SCImago Journal Rankings: 4.223
 
ISI Accession Number ID: WOS:000291062400007
Funding Agency: GRF
Grant Numbers: HKU 719709E, HKU 711611
Funding Information: GRF grant (HKU 719709E, HKU 711611 and HKU SPACE Research Fund) in part.

 
References: References in Scopus
 
Grants: Algorithms for Inferring k-articulated Phylogenetic Network
 
DC Field: Value
dc.contributor.author: Leung, HCM
dc.contributor.author: Yiu, SM
dc.contributor.author: Yang, B
dc.contributor.author: Peng, Y
dc.contributor.author: Wang, Y
dc.contributor.author: Liu, Z
dc.contributor.author: Chen, J
dc.contributor.author: Qin, J
dc.contributor.author: Li, R
dc.contributor.author: Chin, FYL
dc.date.accessioned: 2011-09-23T06:19:25Z
dc.date.available: 2011-09-23T06:19:25Z
dc.date.issued: 2011
dc.description.abstract: Motivation: With the rapid development of next-generation sequencing techniques, metagenomics, also known as environmental genomics, has emerged as an exciting research area that enables us to analyze the microbial environment in which we live. An important step for metagenomic data analysis is the identification and taxonomic characterization of DNA fragments (reads or contigs) resulting from sequencing a sample of mixed species. This step is referred to as 'binning'. Binning algorithms that are based on sequence similarity and sequence composition markers rely heavily on the reference genomes of known microorganisms or phylogenetic markers. Due to the limited availability of reference genomes and the bias and low availability of markers, these algorithms may not be applicable in all cases. Unsupervised binning algorithms which can handle fragments from unknown species provide an alternative approach. However, existing unsupervised binning algorithms only work on datasets either with balanced species abundance ratios or rather different abundance ratios, but not both. Results: In this article, we present MetaCluster 3.0, an integrated binning method based on the unsupervised top-down separation and bottom-up merging strategy, which can bin metagenomic fragments of species with very balanced abundance ratios (say 1:1) to very different abundance ratios (e.g. 1:24) with consistently higher accuracy than existing methods. © The Author 2011. Published by Oxford University Press. All rights reserved.
dc.description.nature: postprint
dc.identifier.citation: Bioinformatics, 2011, v. 27 n. 11, p. 1489-1495
dc.identifier.citeulike: 9157005
dc.identifier.doi: http://dx.doi.org/10.1093/bioinformatics/btr186
dc.identifier.eissn: 1460-2059
dc.identifier.epage: 1495
dc.identifier.hkuros: 192228
dc.identifier.isi: WOS:000291062400007
dc.identifier.issn: 1367-4803
dc.identifier.issue: 11
dc.identifier.openurl:
dc.identifier.pmid: 21493653
dc.identifier.scopus: eid_2-s2.0-79957877228
dc.identifier.spage: 1489
dc.identifier.uri: http://hdl.handle.net/10722/140792
dc.identifier.volume: 27
dc.language: eng
dc.publisher: Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/
dc.publisher.place: United Kingdom
dc.relation.ispartof: Bioinformatics
dc.relation.project: Algorithms for Inferring k-articulated Phylogenetic Network
dc.relation.references: References in Scopus
dc.rights: This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Bioinformatics following peer review. The definitive publisher-authenticated version Bioinformatics, 2011, v. 27 n. 11, p. 1489-1495 is available online at: http://bioinformatics.oxfordjournals.org/content/27/11/1489
dc.rights: Creative Commons: Attribution 3.0 Hong Kong License
dc.subject.mesh: Algorithms
dc.subject.mesh: Cluster Analysis
dc.subject.mesh: Metagenomics - methods
dc.subject.mesh: Sequence Analysis, DNA
dc.title: A robust and accurate binning algorithm for metagenomic sequences with arbitrary species abundance ratio
dc.type: Article
 
<?xml version="1.0" encoding="utf-8"?>
<item><contributor.author>Leung, HCM</contributor.author>
<contributor.author>Yiu, SM</contributor.author>
<contributor.author>Yang, B</contributor.author>
<contributor.author>Peng, Y</contributor.author>
<contributor.author>Wang, Y</contributor.author>
<contributor.author>Liu, Z</contributor.author>
<contributor.author>Chen, J</contributor.author>
<contributor.author>Qin, J</contributor.author>
<contributor.author>Li, R</contributor.author>
<contributor.author>Chin, FYL</contributor.author>
<date.accessioned>2011-09-23T06:19:25Z</date.accessioned>
<date.available>2011-09-23T06:19:25Z</date.available>
<date.issued>2011</date.issued>
<identifier.citation>Bioinformatics, 2011, v. 27 n. 11, p. 1489-1495</identifier.citation>
<identifier.issn>1367-4803</identifier.issn>
<identifier.uri>http://hdl.handle.net/10722/140792</identifier.uri>
<description.abstract>Motivation: With the rapid development of next-generation sequencing techniques, metagenomics, also known as environmental genomics, has emerged as an exciting research area that enables us to analyze the microbial environment in which we live. An important step for metagenomic data analysis is the identification and taxonomic characterization of DNA fragments (reads or contigs) resulting from sequencing a sample of mixed species. This step is referred to as &apos;binning&apos;. Binning algorithms that are based on sequence similarity and sequence composition markers rely heavily on the reference genomes of known microorganisms or phylogenetic markers. Due to the limited availability of reference genomes and the bias and low availability of markers, these algorithms may not be applicable in all cases. Unsupervised binning algorithms which can handle fragments from unknown species provide an alternative approach. However, existing unsupervised binning algorithms only work on datasets either with balanced species abundance ratios or rather different abundance ratios, but not both. Results: In this article, we present MetaCluster 3.0, an integrated binning method based on the unsupervised top-down separation and bottom-up merging strategy, which can bin metagenomic fragments of species with very balanced abundance ratios (say 1:1) to very different abundance ratios (e.g. 1:24) with consistently higher accuracy than existing methods. &#169; The Author 2011. Published by Oxford University Press. All rights reserved.</description.abstract>
<language>eng</language>
<publisher>Oxford University Press. The Journal&apos;s web site is located at http://bioinformatics.oxfordjournals.org/</publisher>
<relation.ispartof>Bioinformatics</relation.ispartof>
<rights>This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Bioinformatics following peer review. The definitive publisher-authenticated version Bioinformatics, 2011, v. 27 n. 11, p. 1489-1495 is available online at: http://bioinformatics.oxfordjournals.org/content/27/11/1489</rights>
<rights>Creative Commons: Attribution 3.0 Hong Kong License</rights>
<subject.mesh>Algorithms</subject.mesh>
<subject.mesh>Cluster Analysis</subject.mesh>
<subject.mesh>Metagenomics - methods</subject.mesh>
<subject.mesh>Sequence Analysis, DNA</subject.mesh>
<title>A robust and accurate binning algorithm for metagenomic sequences with arbitrary species abundance ratio</title>
<type>Article</type>
<identifier.openurl>http://library.hku.hk:4550/resserv?sid=HKU:IR&amp;issn=1367-4803&amp;volume=27&amp;issue=11&amp;spage=1489&amp;epage=1495&amp;date=2011&amp;atitle=A+robust+and+accurate+binning+algorithm+for+metagenomic+sequences+with+arbitrary+species+abundance+ratio</identifier.openurl>
<description.nature>postprint</description.nature>
<identifier.doi>10.1093/bioinformatics/btr186</identifier.doi>
<identifier.pmid>21493653</identifier.pmid>
<identifier.scopus>eid_2-s2.0-79957877228</identifier.scopus>
<identifier.hkuros>192228</identifier.hkuros>
<relation.references>http://www.scopus.com/mlt/select.url?eid=2-s2.0-79957877228&amp;selection=ref&amp;src=s&amp;origin=recordpage</relation.references>
<identifier.volume>27</identifier.volume>
<identifier.issue>11</identifier.issue>
<identifier.spage>1489</identifier.spage>
<identifier.epage>1495</identifier.epage>
<identifier.eissn>1460-2059</identifier.eissn>
<identifier.isi>WOS:000291062400007</identifier.isi>
<publisher.place>United Kingdom</publisher.place>
<relation.project>Algorithms for Inferring k-articulated Phylogenetic Network</relation.project>
<identifier.citeulike>9157005</identifier.citeulike>
<bitstream.url>http://hub.hku.hk/bitstream/10722/140792/1/Content.pdf</bitstream.url>
</item>
Author Affiliations
  1. The University of Hong Kong