Article: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth

Title: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth
Authors: Peng, Y (1); Leung, HCM (1); Yiu, SM (1); Chin, FYL (1)
Issue Date: 2012
Publisher: Oxford University Press. The journal's web site is located at http://bioinformatics.oxfordjournals.org/
Citation: Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428
DOI: http://dx.doi.org/10.1093/bioinformatics/bts174
Abstract: MOTIVATION: Next-generation sequencing allows us to sequence reads from a microbial environment using single-cell sequencing or metagenomic sequencing technologies. However, both technologies suffer from the problem that sequencing depth of different regions of a genome or genomes from different species are highly uneven. Most existing genome assemblers usually have an assumption that sequencing depths are even. These assemblers fail to construct correct long contigs. RESULTS: We introduce the IDBA-UD algorithm that is based on the de Bruijn graph approach for assembling reads from single-cell sequencing or metagenomic sequencing technologies with uneven sequencing depths. Several non-trivial techniques have been employed to tackle the problems. Instead of using a simple threshold, we use multiple depth-relative thresholds to remove erroneous k-mers in both low-depth and high-depth regions. The technique of local assembly with paired-end information is used to solve the branch problem of low-depth short repeat regions. To speed up the process, an error correction step is conducted to correct reads of high-depth regions that can be aligned to high-confident contigs. Comparison of the performances of IDBA-UD and existing assemblers (Velvet, Velvet-SC, SOAPdenovo and Meta-IDBA) for different datasets, shows that IDBA-UD can reconstruct longer contigs with higher accuracy. AVAILABILITY: The IDBA-UD toolkit is available at our website http://www.cs.hku.hk/alse/idba_ud
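The contrast the abstract draws between a single fixed threshold and a depth-relative threshold can be illustrated with a small sketch. This is not the IDBA-UD implementation: the function names, the use of the deepest adjacent k-mer as the "local depth", and the single `ratio` parameter are all assumptions made for illustration, and the published method applies multiple thresholds rather than one.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in the given reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def neighbors(kmer, counts):
    """Return observed k-mers that overlap `kmer` by k-1 bases."""
    found = set()
    for base in "ACGT":
        successor = kmer[1:] + base
        predecessor = base + kmer[:-1]
        for candidate in (successor, predecessor):
            if candidate != kmer and candidate in counts:
                found.add(candidate)
    return found


def filter_fixed(counts, threshold):
    """Keep k-mers whose count reaches one global threshold."""
    return {kmer for kmer, c in counts.items() if c >= threshold}


def filter_relative(counts, ratio):
    """Keep k-mers whose count is at least `ratio` times the local depth.

    Local depth is taken here (an assumption of this sketch) to be the
    highest count among adjacent k-mers; a k-mer with no neighbors is kept.
    """
    kept = set()
    for kmer, c in counts.items():
        local_depth = max((counts[n] for n in neighbors(kmer, counts)), default=c)
        if c >= ratio * local_depth:
            kept.add(kmer)
    return kept


if __name__ == "__main__":
    # Hypothetical counts: a high-depth region (ACG, CGT) with an erroneous
    # branch (CGA), plus a genuine low-depth region (TTA, TAG).
    counts = Counter({"ACG": 100, "CGT": 100, "CGA": 3, "TTA": 2, "TAG": 2})
    print(sorted(filter_fixed(counts, 3)))
    print(sorted(filter_relative(counts, 0.1)))
```

In this toy input, a fixed threshold of 3 keeps the erroneous branch and discards the low-depth region, while the relative filter does the opposite, which is the behavior the abstract motivates for uneven depth.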
ISSN: 1367-4803
2011 Impact Factor: 5.468
2011 SCImago Journal Rankings: 1.118
Grants: Algorithms for Inferring k-articulated Phylogenetic Network
DC Field: Value
dc.contributor.author: Peng, Y
dc.contributor.author: Leung, HCM
dc.contributor.author: Yiu, SM
dc.contributor.author: Chin, FYL
dc.date.accessioned: 2012-06-26T06:39:47Z
dc.date.available: 2012-06-26T06:39:47Z
dc.date.issued: 2012
dc.description.grant: Algorithms for Inferring k-articulated Phylogenetic Network
dc.description.grantcode: 100580
dc.description.nature: Link_to_subscribed_fulltext
dc.identifier.citation: Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428
dc.identifier.citeulike: 10559166
dc.identifier.doi: http://dx.doi.org/10.1093/bioinformatics/bts174
dc.identifier.epage: 1428
dc.identifier.hkuros: 202752
dc.identifier.isi: WOS:000304537000002
Funding Agency: Grant Number
HKGRF: HKU 7116/08E
HKGRF: HKU 719709E
HKU Genomics SRT
Funding Information:

This work was supported in part by HKGRF funding (HKU 7116/08E, HKU 719709E) and HKU Genomics SRT funding.

dc.identifier.issn: 1367-4803
dc.identifier.issue: 11
dc.identifier.pmid: 22495754
dc.identifier.scopus: eid_2-s2.0-84861760530
dc.identifier.spage: 1420
dc.identifier.uri: http://hdl.handle.net/10722/152505
dc.identifier.volume: 28
dc.language: eng
dc.publisher: Oxford University Press. The journal's web site is located at http://bioinformatics.oxfordjournals.org/
dc.publisher.place: United Kingdom
dc.relation.ispartof: Bioinformatics
dc.relation.references: References in Scopus
dc.title: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth
dc.type: Article
Author Affiliations
  1. The University of Hong Kong