Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1093/bioinformatics/bts174
- Scopus: eid_2-s2.0-84861760530
- PMID: 22495754
- WOS: WOS:000304537000002
Article: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth
Title | IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth |
---|---|
Authors | Peng, Y; Leung, HCM; Yiu, SM; Chin, FYL |
Issue Date | 2012 |
Publisher | Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/ |
Citation | Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428 |
Abstract | MOTIVATION: Next-generation sequencing allows us to sequence reads from a microbial environment using single-cell sequencing or metagenomic sequencing technologies. However, both technologies suffer from the problem that sequencing depth of different regions of a genome or genomes from different species are highly uneven. Most existing genome assemblers usually have an assumption that sequencing depths are even. These assemblers fail to construct correct long contigs. RESULTS: We introduce the IDBA-UD algorithm that is based on the de Bruijn graph approach for assembling reads from single-cell sequencing or metagenomic sequencing technologies with uneven sequencing depths. Several non-trivial techniques have been employed to tackle the problems. Instead of using a simple threshold, we use multiple depth-relative thresholds to remove erroneous k-mers in both low-depth and high-depth regions. The technique of local assembly with paired-end information is used to solve the branch problem of low-depth short repeat regions. To speed up the process, an error correction step is conducted to correct reads of high-depth regions that can be aligned to high-confident contigs. Comparison of the performances of IDBA-UD and existing assemblers (Velvet, Velvet-SC, SOAPdenovo and Meta-IDBA) for different datasets, shows that IDBA-UD can reconstruct longer contigs with higher accuracy. AVAILABILITY: The IDBA-UD toolkit is available at our website http://www.cs.hku.hk/alse/idba_ud |
Persistent Identifier | http://hdl.handle.net/10722/152505 |
ISSN | 1367-4803 (2023 Impact Factor: 4.4; 2023 SCImago Journal Rankings: 2.574) |
ISI Accession Number ID | WOS:000304537000002 |
Funding Information | This work was supported in part by HKGRF funding (HKU 7116/08E, HKU 719709E) and HKU Genomics SRT funding. |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Peng, Y | en_US |
dc.contributor.author | Leung, HCM | en_US |
dc.contributor.author | Yiu, SM | en_US |
dc.contributor.author | Chin, FYL | en_US |
dc.date.accessioned | 2012-06-26T06:39:47Z | - |
dc.date.available | 2012-06-26T06:39:47Z | - |
dc.date.issued | 2012 | en_US |
dc.identifier.citation | Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428 | en_US |
dc.identifier.issn | 1367-4803 | en_US |
dc.identifier.uri | http://hdl.handle.net/10722/152505 | - |
dc.description.abstract | MOTIVATION: Next-generation sequencing allows us to sequence reads from a microbial environment using single-cell sequencing or metagenomic sequencing technologies. However, both technologies suffer from the problem that sequencing depth of different regions of a genome or genomes from different species are highly uneven. Most existing genome assemblers usually have an assumption that sequencing depths are even. These assemblers fail to construct correct long contigs. RESULTS: We introduce the IDBA-UD algorithm that is based on the de Bruijn graph approach for assembling reads from single-cell sequencing or metagenomic sequencing technologies with uneven sequencing depths. Several non-trivial techniques have been employed to tackle the problems. Instead of using a simple threshold, we use multiple depth-relative thresholds to remove erroneous k-mers in both low-depth and high-depth regions. The technique of local assembly with paired-end information is used to solve the branch problem of low-depth short repeat regions. To speed up the process, an error correction step is conducted to correct reads of high-depth regions that can be aligned to high-confident contigs. Comparison of the performances of IDBA-UD and existing assemblers (Velvet, Velvet-SC, SOAPdenovo and Meta-IDBA) for different datasets, shows that IDBA-UD can reconstruct longer contigs with higher accuracy. AVAILABILITY: The IDBA-UD toolkit is available at our website http://www.cs.hku.hk/alse/idba_ud | en_US |
dc.language | eng | en_US |
dc.publisher | Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/ | en_US |
dc.relation.ispartof | Bioinformatics | en_US |
dc.rights | This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Bioinformatics following peer review. The definitive publisher-authenticated version Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428 is available online at: http://bioinformatics.oxfordjournals.org/content/28/11/1420 | - |
dc.title | IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth | en_US |
dc.type | Article | en_US |
dc.identifier.email | Leung, HCM: cmleung2@cs.hku.hk | en_US |
dc.identifier.email | Yiu, SM: smyiu@cs.hku.hk | - |
dc.identifier.email | Chin, FYL: chin@cs.hku.hk | - |
dc.identifier.authority | Chin, FYL=rp00105 | en_US |
dc.description.nature | postprint | en_US |
dc.identifier.doi | 10.1093/bioinformatics/bts174 | en_US |
dc.identifier.pmid | 22495754 | - |
dc.identifier.scopus | eid_2-s2.0-84861760530 | en_US |
dc.identifier.hkuros | 202752 | - |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-84861760530&selection=ref&src=s&origin=recordpage | en_US |
dc.identifier.volume | 28 | en_US |
dc.identifier.issue | 11 | en_US |
dc.identifier.spage | 1420 | en_US |
dc.identifier.epage | 1428 | en_US |
dc.identifier.eissn | 1460-2059 | - |
dc.identifier.isi | WOS:000304537000002 | - |
dc.publisher.place | United Kingdom | en_US |
dc.relation.project | Algorithms for Inferring k-articulated Phylogenetic Network | - |
dc.identifier.scopusauthorid | Chin, FYL=7005101915 | en_US |
dc.identifier.scopusauthorid | Yiu, SM=55146840600 | en_US |
dc.identifier.scopusauthorid | Leung, HCM=55236908900 | en_US |
dc.identifier.scopusauthorid | Peng, Y=30267885400 | en_US |
dc.identifier.citeulike | 10559166 | - |