Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1093/dnares/dsac039
- Scopus: eid_2-s2.0-85142940438
- PMID: 36308393
- WOS: WOS:000892327200001
Article: Assembly-free discovery of human novel sequences using long reads.
Title | Assembly-free discovery of human novel sequences using long reads. |
---|---|
Authors | Li, Q; Yan, B; Lam, TW; Luo, R |
Keywords | assembly-free approach human references long reads novel sequences |
Issue Date | 1-Dec-2022 |
Publisher | Oxford University Press |
Citation | DNA Research, 2022, v. 29, n. 6 |
Abstract | DNA sequences that are absent in the human reference genome are classified as novel sequences. The discovery of these missed sequences is crucial for exploring the genomic diversity of populations and understanding the genetic basis of human diseases. However, various DNA lengths of reads generated from different sequencing technologies can significantly affect the results of novel sequences. In this work, we designed an assembly-free novel sequence (AF-NS) approach to identify novel sequences from Oxford Nanopore Technology long reads. Among the newly detected sequences using AF-NS, more than 95% were omitted from those using long-read assemblers and 85% were not present in short reads of Illumina. We identified the common novel sequences among all the samples and revealed their association with the binding motifs of transcription factors. Regarding the placements of the novel sequences, we found about 70% enriched in repeat regions and generated 430 for one specific subpopulation that might be related to their evolution. Our study demonstrates the advance of the assembly-free approach to capture more novel sequences over other assembler based methods. Combining the long-read data with powerful analytical methods can be a robust way to improve the completeness of novel sequences. |
Persistent Identifier | http://hdl.handle.net/10722/340462 |
ISSN | 1340-2838 (2023 Impact Factor: 3.9; 2023 SCImago Journal Rankings: 1.131) |
ISI Accession Number ID | WOS:000892327200001 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Li, Q | - |
dc.contributor.author | Yan, B | - |
dc.contributor.author | Lam, TW | - |
dc.contributor.author | Luo, R | - |
dc.date.accessioned | 2024-03-11T10:44:49Z | - |
dc.date.available | 2024-03-11T10:44:49Z | - |
dc.date.issued | 2022-12-01 | - |
dc.identifier.citation | DNA Research, 2022, v. 29, n. 6 | - |
dc.identifier.issn | 1340-2838 | - |
dc.identifier.uri | http://hdl.handle.net/10722/340462 | - |
dc.description.abstract | DNA sequences that are absent in the human reference genome are classified as novel sequences. The discovery of these missed sequences is crucial for exploring the genomic diversity of populations and understanding the genetic basis of human diseases. However, various DNA lengths of reads generated from different sequencing technologies can significantly affect the results of novel sequences. In this work, we designed an assembly-free novel sequence (AF-NS) approach to identify novel sequences from Oxford Nanopore Technology long reads. Among the newly detected sequences using AF-NS, more than 95% were omitted from those using long-read assemblers and 85% were not present in short reads of Illumina. We identified the common novel sequences among all the samples and revealed their association with the binding motifs of transcription factors. Regarding the placements of the novel sequences, we found about 70% enriched in repeat regions and generated 430 for one specific subpopulation that might be related to their evolution. Our study demonstrates the advance of the assembly-free approach to capture more novel sequences over other assembler based methods. Combining the long-read data with powerful analytical methods can be a robust way to improve the completeness of novel sequences. | - |
dc.language | eng | - |
dc.publisher | Oxford University Press | - |
dc.relation.ispartof | DNA Research | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject | assembly-free approach | - |
dc.subject | human references | - |
dc.subject | long reads | - |
dc.subject | novel sequences | - |
dc.title | Assembly-free discovery of human novel sequences using long reads. | - |
dc.type | Article | - |
dc.identifier.doi | 10.1093/dnares/dsac039 | - |
dc.identifier.pmid | 36308393 | - |
dc.identifier.scopus | eid_2-s2.0-85142940438 | - |
dc.identifier.volume | 29 | - |
dc.identifier.issue | 6 | - |
dc.identifier.eissn | 1756-1663 | - |
dc.identifier.isi | WOS:000892327200001 | - |
dc.identifier.issnl | 1340-2838 | - |