Article: Scavenger: A pipeline for recovery of unaligned reads utilising similarity with aligned reads
Title | Scavenger: A pipeline for recovery of unaligned reads utilising similarity with aligned reads |
---|---|
Authors | Yang, A; Tang, JYS; Troup, M; Ho, JWK |
Keywords | RNA-seq; Read alignment; Unaligned read; Read recovery |
Issue Date | 2019 |
Publisher | Faculty of 1000 Ltd. The Journal's web site is located at http://f1000research.com |
Citation | F1000Research, 2019, v. 8, article no. 1587 |
Abstract | Read alignment is an important step in RNA-seq analysis, as the result of alignment forms the basis for downstream analyses. However, recent studies have shown that published alignment tools have variable mapping sensitivity and do not necessarily align all the reads that should have been aligned, a problem we term the false-negative non-alignment problem. Here we present Scavenger, a Python-based bioinformatics pipeline for recovering unaligned reads using a novel mechanism in which a putative alignment location is discovered based on sequence similarity between aligned and unaligned reads. We showed that Scavenger could recover unaligned reads in a range of simulated and real RNA-seq datasets, including single-cell RNA-seq data. We found that recovered reads tend to contain more genetic variants with respect to the reference genome than previously aligned reads, indicating that divergence between personal and reference genomes plays a role in the false-negative non-alignment problem. Even when the number of recovered reads is small relative to the total number of reads, adding these recovered reads can affect downstream analyses, especially the estimation of expression and differential expression of lowly expressed genes, such as pseudogenes.
Description | Collection: Python |
Persistent Identifier | http://hdl.handle.net/10722/276220 |
ISSN | 2046-1402 (2023 SCImago Journal Rankings: 0.821) |
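
The abstract describes finding a putative alignment location for an unaligned read from its sequence similarity to reads that did align. The following is a toy sketch of that general idea only; it is not Scavenger's implementation. It assumes shared k-mers as a stand-in for a real similarity search, and the function names (`kmers`, `build_index`, `putative_location`), the `(sequence, chromosome, position)` tuple layout, and the `k` and `min_shared` parameters are all hypothetical. A real pipeline would still need to re-align the read near the suggested location and validate the result.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(aligned_reads, k=8):
    """Map each k-mer to the indices of the aligned reads containing it.

    aligned_reads is a list of (sequence, chromosome, position) tuples;
    this layout is an assumption made for the sketch.
    """
    index = defaultdict(set)
    for read_id, (seq, _chrom, _pos) in enumerate(aligned_reads):
        for kmer in kmers(seq, k):
            index[kmer].add(read_id)
    return index


def putative_location(unaligned_seq, aligned_reads, index, k=8, min_shared=3):
    """Suggest a location for an unaligned read, or None if nothing is similar.

    The suggestion is the (chromosome, position) of the aligned read that
    shares the most k-mers with the unaligned read, provided it shares at
    least min_shared of them. Ties go to the earliest aligned read.
    """
    shared = defaultdict(int)
    for kmer in kmers(unaligned_seq, k):
        for read_id in index.get(kmer, ()):
            shared[read_id] += 1
    if not shared:
        return None
    best = max(shared, key=lambda read_id: (shared[read_id], -read_id))
    if shared[best] < min_shared:
        return None
    _seq, chrom, pos = aligned_reads[best]
    return chrom, pos


# Hypothetical data: two aligned reads and one unaligned read that differs
# from the first aligned read by a single base.
aligned = [
    ("ACGTACGTTAGCCGATAGCT", "chr1", 1000),
    ("TTTTGGGGCCCCAAAATTTT", "chr2", 500),
]
index = build_index(aligned, k=8)
print(putative_location("ACGTACGTTAGCCGTTAGCT", aligned, index, k=8))
```

The same `k` must be used to build the index and to query it. A read carrying a variant still shares the k-mers that do not overlap the variant, which is why a similarity-based lookup can suggest a location where exact matching would not.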
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yang, A | - |
dc.contributor.author | Tang, JYS | - |
dc.contributor.author | Troup, M | - |
dc.contributor.author | Ho, JWK | - |
dc.date.accessioned | 2019-09-10T02:58:26Z | - |
dc.date.available | 2019-09-10T02:58:26Z | - |
dc.date.issued | 2019 | - |
dc.identifier.citation | F1000Research, 2019, v. 8, article no. 1587 | - |
dc.identifier.issn | 2046-1402 | - |
dc.identifier.uri | http://hdl.handle.net/10722/276220 | - |
dc.description | Collection: Python | - |
dc.description.abstract | Read alignment is an important step in RNA-seq analysis, as the result of alignment forms the basis for downstream analyses. However, recent studies have shown that published alignment tools have variable mapping sensitivity and do not necessarily align all the reads that should have been aligned, a problem we term the false-negative non-alignment problem. Here we present Scavenger, a Python-based bioinformatics pipeline for recovering unaligned reads using a novel mechanism in which a putative alignment location is discovered based on sequence similarity between aligned and unaligned reads. We showed that Scavenger could recover unaligned reads in a range of simulated and real RNA-seq datasets, including single-cell RNA-seq data. We found that recovered reads tend to contain more genetic variants with respect to the reference genome than previously aligned reads, indicating that divergence between personal and reference genomes plays a role in the false-negative non-alignment problem. Even when the number of recovered reads is small relative to the total number of reads, adding these recovered reads can affect downstream analyses, especially the estimation of expression and differential expression of lowly expressed genes, such as pseudogenes. | - |
dc.language | eng | - |
dc.publisher | Faculty of 1000 Ltd. The Journal's web site is located at http://f1000research.com | - |
dc.relation.ispartof | F1000Research | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject | RNA-seq | - |
dc.subject | Read alignment | - |
dc.subject | Unaligned read | - |
dc.subject | Read recovery | - |
dc.title | Scavenger: A pipeline for recovery of unaligned reads utilising similarity with aligned reads | - |
dc.type | Article | - |
dc.identifier.email | Ho, JWK: jwkho@hku.hk | - |
dc.identifier.authority | Ho, JWK=rp02436 | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.12688/f1000research.19426.1 | - |
dc.identifier.scopus | eid_2-s2.0-85081568354 | - |
dc.identifier.hkuros | 303422 | - |
dc.identifier.volume | 8 | - |
dc.identifier.spage | article no. 1587 | - |
dc.identifier.epage | article no. 1587 | - |
dc.publisher.place | United Kingdom | - |
dc.identifier.issnl | 2046-1402 | - |