postgraduate thesis: Computational methodologies for functional analyses of somatic mutations in cancer
Title | Computational methodologies for functional analyses of somatic mutations in cancer |
---|---|
Authors | Xu, Hang (徐航) |
Advisors | Sham, PC; Wang, JJ |
Issue Date | 2019 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Xu, H. [徐航]. (2019). Computational methodologies for functional analyses of somatic mutations in cancer. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Somatic mutations play critical roles in tumorigenesis and metastasis. Recent improvements in Next Generation Sequencing (NGS) technology facilitate the detection of mutations. However, because of the complexity and heterogeneity of tumor genomes, efficient and accurate interpretation of the functions of mutations remains challenging, especially for mutations within non-coding regions. The Encyclopedia of DNA Elements (ENCODE) and similar studies have identified many cis-regulatory elements, providing important resources for mutation annotation. Moreover, 3C-based techniques have enabled the analysis of the tissue-specific spatial organization of chromatin, or 3D genome architecture, providing a detailed picture of the cellular gene regulatory landscape. These resources and technologies provide great opportunities for annotating and interpreting somatic mutations in cancer genomes.
In this dissertation, I describe new methods to detect and interpret somatic mutations in cancer genomes, utilizing available resources. I carried out three projects with different goals. In the first project, I joined a collaboration to analyze somatic mutations in Non-Small Cell Lung Cancer samples, using relatively traditional methodologies. In the second project, I developed a new method for including 3D-genome features in explaining the function of Structural Variants. In the last project, I introduced a novel method based on Deep Learning to detect chromatin loops mediated by Cohesin/CTCF, which are very important for gene regulation. The following paragraphs describe the three projects in more detail.
First, using Whole Exome Sequencing data from 40 Non-Small Cell Lung Cancer samples, I detected somatic SNVs, INDELs and CNVs, mainly with the GATK pipeline. I analyzed the mutations with various methods and explored the association between mutations and drug response. EGFR and TP53 were found to be the most common driver genes. In drug response analyses, no gene was significantly associated with drug resistance.
Second, inspired by previous studies showing disruption of Topologically Associating Domains (TADs) in tumor samples, I designed a correlation analysis between mutations and transcription to detect Structural Variants that possibly influence nearby genes by disrupting 3D genome architecture. My analyses successfully identify cases found in previous studies, including IGF2 and IRS4, and detect several new cases, such as OR7D2.
Third, despite the great successes achieved by 3C-based techniques, they suffer from difficulties such as low accuracy, low resolution and high expense. To sidestep the limitations of current 3C-based technologies, I developed a Deep Learning model to predict Chromatin Loops mediated by Cohesin. It utilizes Word2Vec and a Convolutional Neural Network to discover loop-associated CTCF binding sites and predicts loop intensity with CTCF ChIP-seq data. I also demonstrate that it has higher performance than existing methods, in terms of sensitivity and specificity.
|
Degree | Doctor of Philosophy |
Subject | Cancer - Research |
Dept/Program | Biomedical Sciences |
Persistent Identifier | http://hdl.handle.net/10722/278439 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Sham, PC | - |
dc.contributor.advisor | Wang, JJ | - |
dc.contributor.author | Xu, Hang | - |
dc.contributor.author | 徐航 | - |
dc.date.accessioned | 2019-10-09T01:17:43Z | - |
dc.date.available | 2019-10-09T01:17:43Z | - |
dc.date.issued | 2019 | - |
dc.identifier.citation | Xu, H. [徐航]. (2019). Computational methodologies for functional analyses of somatic mutations in cancer. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/278439 | - |
dc.description.abstract | Somatic mutations play critical roles in tumorigenesis and metastasis. Recent improvements in Next Generation Sequencing (NGS) technology facilitate the detection of mutations. However, because of the complexity and heterogeneity of tumor genomes, efficient and accurate interpretation of the functions of mutations remains challenging, especially for mutations within non-coding regions. The Encyclopedia of DNA Elements (ENCODE) and similar studies have identified many cis-regulatory elements, providing important resources for mutation annotation. Moreover, 3C-based techniques have enabled the analysis of the tissue-specific spatial organization of chromatin, or 3D genome architecture, providing a detailed picture of the cellular gene regulatory landscape. These resources and technologies provide great opportunities for annotating and interpreting somatic mutations in cancer genomes. In this dissertation, I describe new methods to detect and interpret somatic mutations in cancer genomes, utilizing available resources. I carried out three projects with different goals. In the first project, I joined a collaboration to analyze somatic mutations in Non-Small Cell Lung Cancer samples, using relatively traditional methodologies. In the second project, I developed a new method for including 3D-genome features in explaining the function of Structural Variants. In the last project, I introduced a novel method based on Deep Learning to detect chromatin loops mediated by Cohesin/CTCF, which are very important for gene regulation. The following paragraphs describe the three projects in more detail. First, using Whole Exome Sequencing data from 40 Non-Small Cell Lung Cancer samples, I detected somatic SNVs, INDELs and CNVs, mainly with the GATK pipeline. I analyzed the mutations with various methods and explored the association between mutations and drug response. EGFR and TP53 were found to be the most common driver genes. In drug response analyses, no gene was significantly associated with drug resistance. Second, inspired by previous studies showing disruption of Topologically Associating Domains (TADs) in tumor samples, I designed a correlation analysis between mutations and transcription to detect Structural Variants that possibly influence nearby genes by disrupting 3D genome architecture. My analyses successfully identify cases found in previous studies, including IGF2 and IRS4, and detect several new cases, such as OR7D2. Third, despite the great successes achieved by 3C-based techniques, they suffer from difficulties such as low accuracy, low resolution and high expense. To sidestep the limitations of current 3C-based technologies, I developed a Deep Learning model to predict Chromatin Loops mediated by Cohesin. It utilizes Word2Vec and a Convolutional Neural Network to discover loop-associated CTCF binding sites and predicts loop intensity with CTCF ChIP-seq data. I also demonstrate that it has higher performance than existing methods, in terms of sensitivity and specificity. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Cancer - Research | - |
dc.title | Computational methodologies for functional analyses of somatic mutations in cancer | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Biomedical Sciences | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_991044146575603414 | - |
dc.date.hkucongregation | 2019 | - |
dc.identifier.mmsid | 991044146575603414 | - |