
postgraduate thesis: Detection of genomic variations with data from multiple sequencing platforms

Title: Detection of genomic variations with data from multiple sequencing platforms
Authors: Yu, Huijing (余卉菁)
Advisors: Lam, TW; Luo, R; Ting, HF
Issue Date: 2023
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Yu, H. [余卉菁]. (2023). Detection of genomic variations with data from multiple sequencing platforms. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: The exploration of genomic variations through DNA sequencing data has enriched our understanding of genetic diversity, unlocking insights into disease association and clinical diagnosis. As sequencing technologies rapidly evolve, bioinformatics methods for detecting genomic variations have progressed, offering accurate variant calls. Nonetheless, challenges in detecting variations persist, and they depend on the data available in clinical and research contexts. Distinct features of sequencing data across sequencing platforms, such as read length and error profiles, demand tailored approaches. This thesis presents three novel methods (SENSV, Clair3-MP and RepDEL) that detect genomic variations from different kinds of whole-genome sequencing data, including combined data, to accommodate clinical and research objectives. Starting from the specific requirements of each application and an investigation into the limitations of existing methods and studies, we devised systematic and innovative approaches to tackle each objective. Rigorous experiments were then conducted using gold-standard evaluation techniques to validate the efficacy and viability of the proposed methods.
SENSV is designed to detect structural variations (SVs) using low-depth Oxford Nanopore Technologies sequencing data while improving breakpoint resolution. SENSV offers precise detection of SVs, which are important in the clinical diagnosis of genetic diseases, from low-depth nanopore sequencing data.
Clair3-MP targets the other types of genomic variations, SNPs and indels, by combining sequencing data from different sequencing platforms. Clair3-MP has improved small-variant detection performance in difficult genomic regions. Through a series of experiments, this study also provides insights into the optimal scenarios for variant calling with multi-platform data. Clair3-MP offers an accessible platform for researchers harnessing multi-platform data, enabling them to improve variant-calling performance.
RepDEL aims to detect long deletions in repeat regions, which are known to be challenging in the human genome, using paired-end next-generation sequencing (NGS) data. While longer reads, as demonstrated by SENSV, are inherently more suitable for detecting larger genomic variations, their higher cost constrains their use in cohort studies. Using more affordable NGS data, RepDEL has demonstrated that precise deletion detection can be extended to repeat regions longer than the NGS read length.
The interplay between evolving DNA sequencing technologies and innovative bioinformatics tools has driven progress toward more accurate and comprehensive variant calling. This work has highlighted the challenges posed by different sequencing platforms, which necessitate tailored approaches that address data characteristics and accommodate specific needs. Through SENSV, Clair3-MP, and RepDEL, this thesis has shown how each method is developed to leverage the strengths of its sequencing data, ultimately enhancing our capacity to uncover and understand genomic variations.
Degree: Doctor of Philosophy
Subject: Genomics - Data processing; Bioinformatics
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/342883

 

DC Field | Value | Language
dc.contributor.advisor | Lam, TW | -
dc.contributor.advisor | Luo, R | -
dc.contributor.advisor | Ting, HF | -
dc.contributor.author | Yu, Huijing | -
dc.contributor.author | 余卉菁 | -
dc.date.accessioned | 2024-05-07T01:22:08Z | -
dc.date.available | 2024-05-07T01:22:08Z | -
dc.date.issued | 2023 | -
dc.identifier.citation | Yu, H. [余卉菁]. (2023). Detection of genomic variations with data from multiple sequencing platforms. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/342883 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Genomics - Data processing | -
dc.subject.lcsh | Bioinformatics | -
dc.title | Detection of genomic variations with data from multiple sequencing platforms | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Computer Science | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2024 | -
dc.identifier.mmsid | 991044791815203414 | -
