
Article: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth
Title: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth
 
Authors: Peng, Y (1); Leung, HCM (1); Yiu, SM (1); Chin, FYL (1)
 
Issue Date: 2012
 
Publisher: Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/
 
Citation: Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428
DOI: http://dx.doi.org/10.1093/bioinformatics/bts174
 
Abstract: MOTIVATION: Next-generation sequencing allows us to sequence reads from a microbial environment using single-cell sequencing or metagenomic sequencing technologies. However, both technologies suffer from the problem that sequencing depth of different regions of a genome or genomes from different species are highly uneven. Most existing genome assemblers usually have an assumption that sequencing depths are even. These assemblers fail to construct correct long contigs. RESULTS: We introduce the IDBA-UD algorithm that is based on the de Bruijn graph approach for assembling reads from single-cell sequencing or metagenomic sequencing technologies with uneven sequencing depths. Several non-trivial techniques have been employed to tackle the problems. Instead of using a simple threshold, we use multiple depth-relative thresholds to remove erroneous k-mers in both low-depth and high-depth regions. The technique of local assembly with paired-end information is used to solve the branch problem of low-depth short repeat regions. To speed up the process, an error correction step is conducted to correct reads of high-depth regions that can be aligned to high-confident contigs. Comparison of the performances of IDBA-UD and existing assemblers (Velvet, Velvet-SC, SOAPdenovo and Meta-IDBA) for different datasets, shows that IDBA-UD can reconstruct longer contigs with higher accuracy. AVAILABILITY: The IDBA-UD toolkit is available at our website http://www.cs.hku.hk/alse/idba_ud
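
The depth-relative idea mentioned in the abstract can be illustrated with a toy sketch. The code below is not IDBA-UD's implementation: the contig names, depths, adjacency map and the single `ratio` parameter are all assumptions made for illustration (the abstract mentions multiple thresholds but gives no values). It contrasts one global depth cutoff with a cutoff scaled by the mean depth of neighbouring contigs.

```python
def filter_global(depth, threshold):
    """Keep contigs whose depth reaches one fixed, global cutoff."""
    return {c for c, d in depth.items() if d >= threshold}


def filter_depth_relative(depth, neighbours, ratio):
    """Keep contigs whose depth is at least `ratio` times the mean depth
    of their neighbours in the assembly graph (hypothetical rule)."""
    kept = set()
    for contig, d in depth.items():
        adjacent = [depth[n] for n in neighbours.get(contig, []) if n in depth]
        if not adjacent:
            # Isolated contig: nothing to compare against, so keep it.
            kept.add(contig)
            continue
        local_depth = sum(adjacent) / len(adjacent)
        if d >= ratio * local_depth:
            kept.add(contig)
    return kept


# Made-up example: a high-depth region containing one erroneous branch
# (hi_err) and a genuine low-depth region (lo_a, lo_b).
depth = {"hi_a": 100.0, "hi_b": 98.0, "hi_err": 4.0, "lo_a": 3.0, "lo_b": 3.0}
neighbours = {
    "hi_a": ["hi_b", "hi_err"],
    "hi_b": ["hi_a", "hi_err"],
    "hi_err": ["hi_a", "hi_b"],
    "lo_a": ["lo_b"],
    "lo_b": ["lo_a"],
}

print(sorted(filter_global(depth, 5)))
print(sorted(filter_global(depth, 2)))
print(sorted(filter_depth_relative(depth, neighbours, 0.2)))
```

With these made-up numbers, a global cutoff of 5 discards the genuine low-depth contigs, a global cutoff of 2 retains the erroneous branch, and the relative cutoff removes only `hi_err`.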
 
ISSN: 1367-4803
2013 Impact Factor: 4.621
 
 
ISI Accession Number ID: WOS:000304537000002
Funding Agency: HKGRF; Grant Numbers: HKU 7116/08E, HKU 719709E
Funding Agency: HKU Genomics SRT
Funding Information: This work was supported in part by HKGRF funding (HKU 7116/08E, HKU 719709E) and HKU Genomics SRT funding.

 
References: References in Scopus
 
Grants: Algorithms for Inferring k-articulated Phylogenetic Network
 
DC Field: Value
dc.contributor.author: Peng, Y
dc.contributor.author: Leung, HCM
dc.contributor.author: Yiu, SM
dc.contributor.author: Chin, FYL
dc.date.accessioned: 2012-06-26T06:39:47Z
dc.date.available: 2012-06-26T06:39:47Z
dc.date.issued: 2012
 
 
dc.description.nature: postprint
dc.identifier.citation: Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428
dc.identifier.citeulike: 10559166
dc.identifier.doi: http://dx.doi.org/10.1093/bioinformatics/bts174
dc.identifier.eissn: 1460-2059
dc.identifier.epage: 1428
dc.identifier.hkuros: 202752
dc.identifier.isi: WOS:000304537000002

 
dc.identifier.issn: 1367-4803
dc.identifier.issue: 11
dc.identifier.pmid: 22495754
dc.identifier.scopus: eid_2-s2.0-84861760530
dc.identifier.spage: 1420
dc.identifier.uri: http://hdl.handle.net/10722/152505
dc.identifier.volume: 28
dc.language: eng
dc.publisher: Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/
dc.publisher.place: United Kingdom
dc.relation.ispartof: Bioinformatics
dc.relation.project: Algorithms for Inferring k-articulated Phylogenetic Network
dc.relation.references: References in Scopus
dc.rights: This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Bioinformatics following peer review. The definitive publisher-authenticated version Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428 is available online at: http://bioinformatics.oxfordjournals.org/content/28/11/1420
dc.rights: Creative Commons: Attribution 3.0 Hong Kong License
dc.title: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth
dc.type: Article
 
<?xml version="1.0" encoding="utf-8"?>
<item><contributor.author>Peng, Y</contributor.author>
<contributor.author>Leung, HCM</contributor.author>
<contributor.author>Yiu, SM</contributor.author>
<contributor.author>Chin, FYL</contributor.author>
<date.accessioned>2012-06-26T06:39:47Z</date.accessioned>
<date.available>2012-06-26T06:39:47Z</date.available>
<date.issued>2012</date.issued>
<identifier.citation>Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428</identifier.citation>
<identifier.issn>1367-4803</identifier.issn>
<identifier.uri>http://hdl.handle.net/10722/152505</identifier.uri>
<description.abstract>MOTIVATION: Next-generation sequencing allows us to sequence reads from a microbial environment using single-cell sequencing or metagenomic sequencing technologies. However, both technologies suffer from the problem that sequencing depth of different regions of a genome or genomes from different species are highly uneven. Most existing genome assemblers usually have an assumption that sequencing depths are even. These assemblers fail to construct correct long contigs. RESULTS: We introduce the IDBA-UD algorithm that is based on the de Bruijn graph approach for assembling reads from single-cell sequencing or metagenomic sequencing technologies with uneven sequencing depths. Several non-trivial techniques have been employed to tackle the problems. Instead of using a simple threshold, we use multiple depth-relative thresholds to remove erroneous k-mers in both low-depth and high-depth regions. The technique of local assembly with paired-end information is used to solve the branch problem of low-depth short repeat regions. To speed up the process, an error correction step is conducted to correct reads of high-depth regions that can be aligned to high-confident contigs. Comparison of the performances of IDBA-UD and existing assemblers (Velvet, Velvet-SC, SOAPdenovo and Meta-IDBA) for different datasets, shows that IDBA-UD can reconstruct longer contigs with higher accuracy. AVAILABILITY: The IDBA-UD toolkit is available at our website http://www.cs.hku.hk/alse/idba_ud</description.abstract>
<language>eng</language>
<publisher>Oxford University Press. The Journal&apos;s web site is located at http://bioinformatics.oxfordjournals.org/</publisher>
<relation.ispartof>Bioinformatics</relation.ispartof>
<rights>This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Bioinformatics following peer review. The definitive publisher-authenticated version Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428 is available online at: http://bioinformatics.oxfordjournals.org/content/28/11/1420</rights>
<rights>Creative Commons: Attribution 3.0 Hong Kong License</rights>
<title>IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth</title>
<type>Article</type>
<description.nature>postprint</description.nature>
<identifier.doi>10.1093/bioinformatics/bts174</identifier.doi>
<identifier.pmid>22495754</identifier.pmid>
<identifier.scopus>eid_2-s2.0-84861760530</identifier.scopus>
<identifier.hkuros>202752</identifier.hkuros>
<relation.references>http://www.scopus.com/mlt/select.url?eid=2-s2.0-84861760530&amp;selection=ref&amp;src=s&amp;origin=recordpage</relation.references>
<identifier.volume>28</identifier.volume>
<identifier.issue>11</identifier.issue>
<identifier.spage>1420</identifier.spage>
<identifier.epage>1428</identifier.epage>
<identifier.eissn>1460-2059</identifier.eissn>
<identifier.isi>WOS:000304537000002</identifier.isi>
<publisher.place>United Kingdom</publisher.place>
<relation.project>Algorithms for Inferring k-articulated Phylogenetic Network</relation.project>
<identifier.citeulike>10559166</identifier.citeulike>
<bitstream.url>http://hub.hku.hk/bitstream/10722/152505/1/Content.pdf</bitstream.url>
</item>
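
A record in this XML view can be read with any standard XML parser. The following is a minimal sketch using Python's standard-library `xml.etree.ElementTree`; the `RECORD` string is an abridged copy of the item above, and the helper names `parse_record` and `page_count` are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the record above (only a few fields are kept).
RECORD = """<item><contributor.author>Peng, Y</contributor.author>
<contributor.author>Leung, HCM</contributor.author>
<contributor.author>Yiu, SM</contributor.author>
<contributor.author>Chin, FYL</contributor.author>
<identifier.doi>10.1093/bioinformatics/bts174</identifier.doi>
<identifier.spage>1420</identifier.spage>
<identifier.epage>1428</identifier.epage>
</item>"""


def parse_record(xml_text):
    """Map each element name to the list of its text values.

    A list is used because fields such as contributor.author repeat.
    """
    fields = {}
    for child in ET.fromstring(xml_text):
        fields.setdefault(child.tag, []).append((child.text or "").strip())
    return fields


def page_count(fields):
    """Number of pages spanned by the start and end page fields, inclusive."""
    start = int(fields["identifier.spage"][0])
    end = int(fields["identifier.epage"][0])
    return end - start + 1


fields = parse_record(RECORD)
print(fields["contributor.author"])
print(page_count(fields))
```

Each field maps to a list because elements such as `contributor.author` and `rights` occur more than once in the record.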
Author Affiliations
  1. The University of Hong Kong