Conference Paper: IDBA - A practical iterative De Bruijn graph De Novo assembler
Title: IDBA - A practical iterative De Bruijn graph De Novo assembler
 
Authors: Peng, Y (1); Leung, HCM (1); Yiu, SM (1); Chin, FYL (1)
 
Keywords: De Bruijn graph; De Novo assembly; High throughput short reads; Mate-pair; String graph
 
Issue Date: 2010
 
Publisher: Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/
 
Citation: The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440
DOI: http://dx.doi.org/10.1007/978-3-642-12683-3_28
 
Abstract: The de Bruijn graph assembly approach breaks reads into k-mers before assembling them into contigs. The string graph approach forms contigs by connecting two reads with k or more overlapping nucleotides. Both approaches must deal with the following problems: false-positive vertices, due to erroneous reads; gap problem, due to non-uniform coverage; branching problem, due to erroneous reads and repeat regions. A proper choice of k is crucial but for single k there is always a trade-off: a small k favors the situation of erroneous reads and non-uniform coverage, and a large k favors short repeat regions. We propose an iterative de Bruijn graph approach iterating from small to large k exploring the advantages of the in between values. Our IDBA outperforms the existing algorithms by constructing longer contigs with similar accuracy and using less memory, both with real and simulated data. The running time of the algorithm is comparable to existing algorithms. © Springer-Verlag Berlin Heidelberg 2010.
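
The trade-off in k described in the abstract can be illustrated with a minimal sketch. This is not IDBA's implementation: the function names (`kmers`, `de_bruijn_graph`, `branching_vertices`) and the toy read are assumptions made for illustration only. The sketch treats each k-mer as a vertex, joins k-mers that are adjacent in a read, and shows a short repeat causing a branch at k = 3 that disappears at k = 4.

```python
from collections import defaultdict


def kmers(read, k):
    """Return the k-mers of a read in order of occurrence."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn_graph(reads, k):
    """Map each k-mer to the set of k-mers that directly follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        ks = kmers(read, k)
        for a, b in zip(ks, ks[1:]):
            graph[a].add(b)
    return dict(graph)


def branching_vertices(graph):
    """Return the vertices that have more than one outgoing edge."""
    return {v for v, successors in graph.items() if len(successors) > 1}


# Hypothetical toy read in which the 3-mer "ACG" occurs twice (a short repeat).
reads = ["TACGGACGC"]

print(sorted(branching_vertices(de_bruijn_graph(reads, 3))))  # ['ACG']
print(sorted(branching_vertices(de_bruijn_graph(reads, 4))))  # []
```

With k = 3 the repeated 3-mer collapses into one vertex with two successors; with k = 4 every k-mer is unique and the path is unambiguous. The opposite cost of a larger k (gaps where coverage is low or reads contain errors) is not modelled here.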
 
Description: LNCS v. 6044 is conference proceedings of 14th RECOMB 2010
 
ISSN: 0302-9743
2013 SCImago Journal Rankings: 0.310
 
DC Field: Value
dc.contributor.author: Peng, Y
dc.contributor.author: Leung, HCM
dc.contributor.author: Yiu, SM
dc.contributor.author: Chin, FYL
dc.date.accessioned: 2010-12-23T08:39:23Z
dc.date.available: 2010-12-23T08:39:23Z
dc.date.issued: 2010
dc.description.abstract: The de Bruijn graph assembly approach breaks reads into k-mers before assembling them into contigs. The string graph approach forms contigs by connecting two reads with k or more overlapping nucleotides. Both approaches must deal with the following problems: false-positive vertices, due to erroneous reads; gap problem, due to non-uniform coverage; branching problem, due to erroneous reads and repeat regions. A proper choice of k is crucial but for single k there is always a trade-off: a small k favors the situation of erroneous reads and non-uniform coverage, and a large k favors short repeat regions. We propose an iterative de Bruijn graph approach iterating from small to large k exploring the advantages of the in between values. Our IDBA outperforms the existing algorithms by constructing longer contigs with similar accuracy and using less memory, both with real and simulated data. The running time of the algorithm is comparable to existing algorithms. © Springer-Verlag Berlin Heidelberg 2010.
dc.description.nature: postprint
dc.description: LNCS v. 6044 is conference proceedings of 14th RECOMB 2010
dc.description.other: The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440
dc.identifier.citation: The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440
dc.identifier.citeulike: 7896392
dc.identifier.doi: http://dx.doi.org/10.1007/978-3-642-12683-3_28
dc.identifier.epage: 440
dc.identifier.hkuros: 178332
dc.identifier.hkuros: 169727
dc.identifier.issn: 0302-9743
dc.identifier.scopus: eid_2-s2.0-78650270346
dc.identifier.spage: 426
dc.identifier.uri: http://hdl.handle.net/10722/129571
dc.identifier.volume: 6044 LNBI
dc.language: eng
dc.publisher: Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/
dc.publisher.place: Germany
dc.relation.ispartof: Lecture Notes in Computer Science
dc.relation.references: References in Scopus
dc.rights: The original publication is available at www.springerlink.com
dc.rights: Creative Commons: Attribution 3.0 Hong Kong License
dc.subject: De Bruijn graph
dc.subject: De Novo assembly
dc.subject: High throughput short reads
dc.subject: Mate-pair
dc.subject: String graph
dc.title: IDBA - A practical iterative De Bruijn graph De Novo assembler
dc.type: Conference_Paper
 
<?xml version="1.0" encoding="utf-8"?>
<item><contributor.author>Peng, Y</contributor.author>
<contributor.author>Leung, HCM</contributor.author>
<contributor.author>Yiu, SM</contributor.author>
<contributor.author>Chin, FYL</contributor.author>
<date.accessioned>2010-12-23T08:39:23Z</date.accessioned>
<date.available>2010-12-23T08:39:23Z</date.available>
<date.issued>2010</date.issued>
<identifier.citation>The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440</identifier.citation>
<identifier.issn>0302-9743</identifier.issn>
<identifier.uri>http://hdl.handle.net/10722/129571</identifier.uri>
<description>LNCS v. 6044 is conference proceedings of 14th RECOMB 2010</description>
<description.abstract>The de Bruijn graph assembly approach breaks reads into k-mers before assembling them into contigs. The string graph approach forms contigs by connecting two reads with k or more overlapping nucleotides. Both approaches must deal with the following problems: false-positive vertices, due to erroneous reads; gap problem, due to non-uniform coverage; branching problem, due to erroneous reads and repeat regions. A proper choice of k is crucial but for single k there is always a trade-off: a small k favors the situation of erroneous reads and non-uniform coverage, and a large k favors short repeat regions. We propose an iterative de Bruijn graph approach iterating from small to large k exploring the advantages of the in between values. Our IDBA outperforms the existing algorithms by constructing longer contigs with similar accuracy and using less memory, both with real and simulated data. The running time of the algorithm is comparable to existing algorithms. &#169; Springer-Verlag Berlin Heidelberg 2010.</description.abstract>
<language>eng</language>
<publisher>Springer Verlag. The Journal&apos;s web site is located at http://springerlink.com/content/105633/</publisher>
<relation.ispartof>Lecture Notes in Computer Science</relation.ispartof>
<rights>The original publication is available at www.springerlink.com</rights>
<rights>Creative Commons: Attribution 3.0 Hong Kong License</rights>
<subject>De Bruijn graph</subject>
<subject>De Novo assembly</subject>
<subject>High throughput short reads</subject>
<subject>Mate-pair</subject>
<subject>String graph</subject>
<title>IDBA - A practical iterative De Bruijn graph De Novo assembler</title>
<type>Conference_Paper</type>
<description.nature>postprint</description.nature>
<identifier.doi>10.1007/978-3-642-12683-3_28</identifier.doi>
<identifier.scopus>eid_2-s2.0-78650270346</identifier.scopus>
<identifier.hkuros>178332</identifier.hkuros>
<identifier.hkuros>169727</identifier.hkuros>
<relation.references>http://www.scopus.com/mlt/select.url?eid=2-s2.0-78650270346&amp;selection=ref&amp;src=s&amp;origin=recordpage</relation.references>
<identifier.volume>6044 LNBI</identifier.volume>
<identifier.spage>426</identifier.spage>
<identifier.epage>440</identifier.epage>
<publisher.place>Germany</publisher.place>
<description.other>The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440</description.other>
<identifier.citeulike>7896392</identifier.citeulike>
<bitstream.url>http://hub.hku.hk/bitstream/10722/129571/1/Content.pdf</bitstream.url>
</item>
Author Affiliations
  1. The University of Hong Kong