Conference Paper: IDBA - A practical iterative De Bruijn graph De Novo assembler

Title: IDBA - A practical iterative De Bruijn graph De Novo assembler
Authors: Peng, Y (1); Leung, HCM (1); Yiu, SM (1); Chin, FYL (1)
Keywords: De Bruijn graph; De Novo assembly; High throughput short reads; Mate-pair; String graph
Issue Date: 2010
Publisher: Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/
Citation: The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440
DOI: http://dx.doi.org/10.1007/978-3-642-12683-3_28
Abstract: The de Bruijn graph assembly approach breaks reads into k-mers before assembling them into contigs. The string graph approach forms contigs by connecting two reads with k or more overlapping nucleotides. Both approaches must deal with the following problems: false-positive vertices, due to erroneous reads; gap problem, due to non-uniform coverage; branching problem, due to erroneous reads and repeat regions. A proper choice of k is crucial but for single k there is always a trade-off: a small k favors the situation of erroneous reads and non-uniform coverage, and a large k favors short repeat regions. We propose an iterative de Bruijn graph approach iterating from small to large k exploring the advantages of the in between values. Our IDBA outperforms the existing algorithms by constructing longer contigs with similar accuracy and using less memory, both with real and simulated data. The running time of the algorithm is comparable to existing algorithms. © Springer-Verlag Berlin Heidelberg 2010.
Description: LNCS v. 6044 is conference proceedings of 14th RECOMB 2010
ISSN: 0302-9743
2011 SCImago Journal Rankings: 0.034
DC Field: Value
dc.contributor.author: Peng, Y
dc.contributor.author: Leung, HCM
dc.contributor.author: Yiu, SM
dc.contributor.author: Chin, FYL
dc.date.accessioned: 2010-12-23T08:39:23Z
dc.date.available: 2010-12-23T08:39:23Z
dc.date.issued: 2010
dc.description.nature: postprint
dc.description: LNCS v. 6044 is conference proceedings of 14th RECOMB 2010
dc.description.other: The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440
dc.identifier.citation: The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440
dc.identifier.citeulike: 7896392
dc.identifier.doi: http://dx.doi.org/10.1007/978-3-642-12683-3_28
dc.identifier.epage: 440
dc.identifier.hkuros: 178332
dc.identifier.hkuros: 169727
dc.identifier.issn: 0302-9743
dc.identifier.scopus: eid_2-s2.0-78650270346
dc.identifier.spage: 426
dc.identifier.uri: http://hdl.handle.net/10722/129571
dc.identifier.volume: 6044 LNBI
dc.language: eng
dc.publisher: Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/
dc.publisher.place: Germany
dc.relation.ispartof: Lecture Notes in Computer Science
dc.rights: The original publication is available at www.springerlink.com
dc.rights: Creative Commons: Attribution 3.0 Hong Kong License
dc.subject: De Bruijn graph
dc.subject: De Novo assembly
dc.subject: High throughput short reads
dc.subject: Mate-pair
dc.subject: String graph
dc.title: IDBA - A practical iterative De Bruijn graph De Novo assembler
dc.type: Conference_Paper
Author Affiliations
  1. The University of Hong Kong
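
Illustrative note on the abstract: the trade-off in the choice of k can be seen on toy data. The Python sketch below is not the IDBA implementation and is not taken from the paper. It is a minimal illustration that assumes error-free reads, ignores reverse complements, and draws an edge between two k-mers only when they appear consecutively in some read. The function names (kmers, build_de_bruijn, count_branches, count_dead_ends) and the example reads are hypothetical.

```python
from collections import defaultdict


def kmers(read, k):
    """Return all length-k substrings of a read, in order of position."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_de_bruijn(reads, k):
    """Build a toy de Bruijn graph: vertex = k-mer, edge = two k-mers
    that are adjacent in at least one read (an assumption of this sketch)."""
    graph = defaultdict(set)
    for read in reads:
        ks = kmers(read, k)
        for a, b in zip(ks, ks[1:]):
            graph[a].add(b)
        for km in ks:
            graph.setdefault(km, set())  # keep vertices without successors
    return dict(graph)


def count_branches(graph):
    """Number of vertices with more than one successor (branching problem)."""
    return sum(1 for succ in graph.values() if len(succ) > 1)


def count_dead_ends(graph):
    """Number of vertices with no successor. This includes the true end of
    the sequence as well as breaks caused by gaps."""
    return sum(1 for succ in graph.values() if not succ)


# A read containing the short repeat "CGT" twice: k = 3 branches at the
# repeat, while k = 4 spans it.
repeat_read = ["ACGTTCGTA"]
for k in (3, 4):
    print("k =", k, "branches:", count_branches(build_de_bruijn(repeat_read, k)))

# Two reads that overlap by only 3 nucleotides: k = 3 joins them,
# while k = 4 leaves a gap between them.
gap_reads = ["ACGTT", "GTTCA"]
for k in (3, 4):
    print("k =", k, "dead ends:", count_dead_ends(build_de_bruijn(gap_reads, k)))
```

In this toy setting the small k bridges the thin overlap but branches at the repeat, and the large k resolves the repeat but loses the connection between the reads. That is the tension the abstract describes and the motivation it gives for iterating from small to large k.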