File Download
  Links for fulltext
     (May Require Subscription)
Supplementary

postgraduate thesis: Some topics on statistical analysis of genetic imprinting data and microbiome compositional data

TitleSome topics on statistical analysis of genetic imprinting data and microbiome compositional data
Authors
Advisors
Advisor(s):Fung, TWK
Issue Date2014
PublisherThe University of Hong Kong (Pokfulam, Hong Kong)
Citation
Xia, F. [夏凡]. (2014). Some topics on statistical analysis of genetic imprinting data and microbiome compositional data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5223971
AbstractGenetic association study is a useful tool to identify the genetic component that is responsible for a disease. The phenomenon that a certain gene expresses in a parent-of-origin manner is referred to as genomic imprinting. When a gene is imprinted, the performance of the disease-association study will be affected. This thesis presents statistical testing methods developed specially for nuclear family data centering around the genetic association studies incorporating imprinting effects. For qualitative diseases with binary outcomes, a class of TDTI* type tests was proposed in a general two-stage framework, where the imprinting effects were examined prior to association testing. On quantitative trait loci, a class of Q-TDTI(c) type tests and another class of Q-MAX(c) type tests were proposed. The proposed testing methods flexibly accommodate families with missing parental genotype and with multiple siblings. The performance of all the methods was verified by simulation studies. It was found that the proposed methods improve the testing power for detecting association in the presence of imprinting. The class of TDTI* tests was applied to a rheumatoid arthritis study data. Also, the class of Q-TDTI(c) tests was applied to analyze the Framingham Heart Study data. The human microbiome is the collection of the microbiota, together with their genomes and their habitats throughout the human body. The human microbiome comprises an inalienable part of our genetic landscape and contributes to our metabolic features. Also, current studies have suggested the variety of human microbiome in human diseases. With the high-throughput DNA sequencing, the human microbiome composition can be characterized based on bacterial taxa relative abundance and the phylogenetic constraint. Such taxa data are often high-dimensional overdispersed and contain excessive number of zeros. Taking into account of these characteristics in taxa data, this thesis presents statistical methods to identify associations between covariate/outcome and the human microbiome composition. To assess environmental/biological covariate effect to microbiome composition, an additive logistic normal multinomial regression model was proposed and a group l1 penalized likelihood estimation method was further developed to facilitate selection of covariates and estimation of parameters. To identify microbiome components associated with biological/clinical outcomes, a Bayesian hierarchical regression model with spike and slab prior for variable selection was proposed and a Markov chain Monte Carlo algorithm that combines stochastic variable selection procedure and random walk metropolis-hasting steps was developed for model estimation. Both of the methods were illustrated using simulations as well as a real human gut microbiome dataset from The Penn Gut Microbiome Project.
DegreeDoctor of Philosophy
SubjectGenomic imprinting - Statistical methods
Body, Human - Microbiology - Statistical methods
Dept/ProgramStatistics and Actuarial Science
Persistent Identifierhttp://hdl.handle.net/10722/206673

 

DC FieldValueLanguage
dc.contributor.advisorFung, TWK-
dc.contributor.authorXia, Fan-
dc.contributor.author夏凡-
dc.date.accessioned2014-11-25T03:53:15Z-
dc.date.available2014-11-25T03:53:15Z-
dc.date.issued2014-
dc.identifier.citationXia, F. [夏凡]. (2014). Some topics on statistical analysis of genetic imprinting data and microbiome compositional data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5223971-
dc.identifier.urihttp://hdl.handle.net/10722/206673-
dc.description.abstractGenetic association study is a useful tool to identify the genetic component that is responsible for a disease. The phenomenon that a certain gene expresses in a parent-of-origin manner is referred to as genomic imprinting. When a gene is imprinted, the performance of the disease-association study will be affected. This thesis presents statistical testing methods developed specially for nuclear family data centering around the genetic association studies incorporating imprinting effects. For qualitative diseases with binary outcomes, a class of TDTI* type tests was proposed in a general two-stage framework, where the imprinting effects were examined prior to association testing. On quantitative trait loci, a class of Q-TDTI(c) type tests and another class of Q-MAX(c) type tests were proposed. The proposed testing methods flexibly accommodate families with missing parental genotype and with multiple siblings. The performance of all the methods was verified by simulation studies. It was found that the proposed methods improve the testing power for detecting association in the presence of imprinting. The class of TDTI* tests was applied to a rheumatoid arthritis study data. Also, the class of Q-TDTI(c) tests was applied to analyze the Framingham Heart Study data. The human microbiome is the collection of the microbiota, together with their genomes and their habitats throughout the human body. The human microbiome comprises an inalienable part of our genetic landscape and contributes to our metabolic features. Also, current studies have suggested the variety of human microbiome in human diseases. With the high-throughput DNA sequencing, the human microbiome composition can be characterized based on bacterial taxa relative abundance and the phylogenetic constraint. Such taxa data are often high-dimensional overdispersed and contain excessive number of zeros. Taking into account of these characteristics in taxa data, this thesis presents statistical methods to identify associations between covariate/outcome and the human microbiome composition. To assess environmental/biological covariate effect to microbiome composition, an additive logistic normal multinomial regression model was proposed and a group l1 penalized likelihood estimation method was further developed to facilitate selection of covariates and estimation of parameters. To identify microbiome components associated with biological/clinical outcomes, a Bayesian hierarchical regression model with spike and slab prior for variable selection was proposed and a Markov chain Monte Carlo algorithm that combines stochastic variable selection procedure and random walk metropolis-hasting steps was developed for model estimation. Both of the methods were illustrated using simulations as well as a real human gut microbiome dataset from The Penn Gut Microbiome Project.-
dc.languageeng-
dc.publisherThe University of Hong Kong (Pokfulam, Hong Kong)-
dc.relation.ispartofHKU Theses Online (HKUTO)-
dc.rightsCreative Commons: Attribution 3.0 Hong Kong License-
dc.rightsThe author retains all proprietary rights, (such as patent rights) and the right to use in future works.-
dc.subject.lcshGenomic imprinting - Statistical methods-
dc.subject.lcshBody, Human - Microbiology - Statistical methods-
dc.titleSome topics on statistical analysis of genetic imprinting data and microbiome compositional data-
dc.typePG_Thesis-
dc.identifier.hkulb5223971-
dc.description.thesisnameDoctor of Philosophy-
dc.description.thesislevelDoctoral-
dc.description.thesisdisciplineStatistics and Actuarial Science-
dc.description.naturepublished_or_final_version-
dc.identifier.doi10.5353/th_b5223971-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats