
postgraduate thesis: Computational methodologies for functional analyses of somatic mutations in cancer

Title: Computational methodologies for functional analyses of somatic mutations in cancer
Authors: Xu, Hang (徐航)
Advisor(s): Sham, PC; Wang, JJ
Issue Date: 2019
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Xu, H. [徐航]. (2019). Computational methodologies for functional analyses of somatic mutations in cancer. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Somatic mutations play critical roles in tumorigenesis and metastasis. Recent improvements in Next Generation Sequencing (NGS) technology facilitate the detection of mutations. However, because of the complexity and heterogeneity of tumor genomes, efficient and accurate interpretation of the functions of mutations remains challenging, especially for mutations within non-coding regions. The Encyclopedia of DNA Elements (ENCODE) and similar studies have identified many cis-regulatory elements, providing important resources for mutation annotation. Moreover, 3C-based techniques have enabled the analysis of the tissue-specific spatial organization of chromatin, or 3D genome architecture, providing a detailed picture of the cellular gene regulatory landscape. These resources and technologies provide great opportunities for annotating and interpreting somatic mutations in cancer genomes.

In this dissertation, I describe new methods to detect and interpret somatic mutations in cancer genomes, utilizing available resources. I carried out three projects with different goals. In the first project, I joined a collaboration to analyze somatic mutations in Non-Small Cell Lung Cancer samples, using relatively traditional methodologies. In the second project, I developed a new method that incorporates 3D-genome features to explain the function of Structural Variants. In the last project, I introduced a novel Deep Learning method to detect chromatin loops mediated by Cohesin/CTCF, which are very important for gene regulation. The following paragraphs describe the three projects in more detail.

First, with Whole Exome Sequencing data from 40 Non-Small Cell Lung Cancer samples, I detected somatic SNVs, INDELs and CNVs, mainly with the GATK pipeline. I analyzed the mutations with various methods and explored the association between mutations and drug response. EGFR and TP53 were found to be the most common driver genes. In the drug response analyses, no gene was significantly associated with drug resistance.

Second, inspired by previous studies showing disruption of Topologically Associating Domains (TADs) in tumor samples, I designed a correlation analysis between mutations and transcription to detect Structural Variants that possibly influence nearby genes by disrupting 3D genome architecture. My analyses successfully identify cases found in previous studies, including IGF2 and IRS4, and detect several new results, such as OR7D2.

Third, despite the great successes achieved by 3C-based techniques, they suffer from low accuracy, low resolution and high expense. To sidestep these limitations, I developed a Deep Learning model to predict chromatin loops mediated by Cohesin. It uses Word2Vec and a Convolutional Neural Network to discover loop-associated CTCF binding sites and predicts loop intensity from CTCF ChIP-seq data. I also demonstrate that it outperforms existing methods in terms of sensitivity and specificity.
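The abstract does not say how DNA sequence is prepared as Word2Vec input. A minimal sketch of one common approach, assumed here rather than taken from the thesis, is to split each sequence into overlapping k-mers that serve as "words". The function name `kmer_tokens`, the choice of k = 6, and the 12-bp example sequence are all hypothetical.

```python
def kmer_tokens(sequence, k=6):
    """Split a DNA sequence into overlapping k-mers ("words").

    Assumption: k-mer tokenization is an illustrative choice; the thesis
    abstract does not state which tokenization it uses.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # A sequence shorter than k yields an empty list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Hypothetical 12-bp sequence, used only to show the token layout.
tokens = kmer_tokens("ccgcgaggtggc", k=6)
print(tokens)
```

A list of such token lists, one per sequence, is the kind of "sentence" corpus that word-embedding tools typically expect, which is why this representation pairs naturally with Word2Vec.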
Degree: Doctor of Philosophy
Subject: Cancer - Research
Dept/Program: Biomedical Sciences
Persistent Identifier: http://hdl.handle.net/10722/278439

 

DC Field | Value | Language
dc.contributor.advisor | Sham, PC | -
dc.contributor.advisor | Wang, JJ | -
dc.contributor.author | Xu, Hang | -
dc.contributor.author | 徐航 | -
dc.date.accessioned | 2019-10-09T01:17:43Z | -
dc.date.available | 2019-10-09T01:17:43Z | -
dc.date.issued | 2019 | -
dc.identifier.citation | Xu, H. [徐航]. (2019). Computational methodologies for functional analyses of somatic mutations in cancer. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/278439 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Cancer - Research | -
dc.title | Computational methodologies for functional analyses of somatic mutations in cancer | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Biomedical Sciences | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.5353/th_991044146575603414 | -
dc.date.hkucongregation | 2019 | -
dc.identifier.mmsid | 991044146575603414 | -
