Some topics on statistical analysis of genetic imprinting data and microbiome compositional data

Xia, Fan; 夏凡

DOI: 10.5353/th_b5223971

Appears in Collections:
- Statistics & Actuarial Science: Theses
- HKU Theses Online

postgraduate thesis: Some topics on statistical analysis of genetic imprinting data and microbiome compositional data

Title	Some topics on statistical analysis of genetic imprinting data and microbiome compositional data
Authors	Xia, Fan 夏凡
Advisors	Fung, TWK
Issue Date	2014
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Xia, F. [夏凡]. (2014). Some topics on statistical analysis of genetic imprinting data and microbiome compositional data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5223971
Abstract	Genetic association studies are a useful tool for identifying the genetic components responsible for a disease. The phenomenon in which a gene is expressed in a parent-of-origin manner is referred to as genomic imprinting. When a gene is imprinted, the performance of a disease-association study is affected. This thesis presents statistical testing methods, developed specifically for nuclear family data, for genetic association studies that incorporate imprinting effects. For qualitative diseases with binary outcomes, a class of TDTI* type tests was proposed in a general two-stage framework, in which imprinting effects are examined prior to association testing. For quantitative trait loci, a class of Q-TDTI(c) type tests and another class of Q-MAX(c) type tests were proposed. The proposed testing methods flexibly accommodate families with missing parental genotypes and with multiple siblings. The performance of all the methods was verified by simulation studies, which found that the proposed methods improve the power to detect association in the presence of imprinting. The class of TDTI* tests was applied to data from a rheumatoid arthritis study, and the class of Q-TDTI(c) tests was applied to the Framingham Heart Study data. The human microbiome is the collection of the microbiota, together with their genomes and their habitats throughout the human body. The human microbiome is an inalienable part of our genetic landscape and contributes to our metabolic features. Current studies have also suggested a link between variation in the human microbiome and human diseases. With high-throughput DNA sequencing, the composition of the human microbiome can be characterized based on the relative abundance of bacterial taxa and the phylogenetic constraint. Such taxa data are often high-dimensional and overdispersed and contain an excessive number of zeros. 
Taking these characteristics of taxa data into account, this thesis presents statistical methods to identify associations between covariates or outcomes and the human microbiome composition. To assess the effects of environmental/biological covariates on microbiome composition, an additive logistic normal multinomial regression model was proposed, and a group ℓ1 penalized likelihood estimation method was developed to facilitate covariate selection and parameter estimation. To identify microbiome components associated with biological/clinical outcomes, a Bayesian hierarchical regression model with a spike-and-slab prior for variable selection was proposed, and a Markov chain Monte Carlo algorithm that combines a stochastic variable selection procedure with random-walk Metropolis-Hastings steps was developed for model estimation. Both methods were illustrated using simulations as well as a real human gut microbiome dataset from the Penn Gut Microbiome Project.
Degree	Doctor of Philosophy
Subject	Genomic imprinting - Statistical methods; Body, Human - Microbiology - Statistical methods
Dept/Program	Statistics and Actuarial Science
Persistent Identifier	http://hdl.handle.net/10722/206673
HKU Library Item ID	b5223971

DC Field	Value	Language
dc.contributor.advisor	Fung, TWK	-
dc.contributor.author	Xia, Fan	-
dc.contributor.author	夏凡	-
dc.date.accessioned	2014-11-25T03:53:15Z	-
dc.date.available	2014-11-25T03:53:15Z	-
dc.date.issued	2014	-
dc.identifier.citation	Xia, F. [夏凡]. (2014). Some topics on statistical analysis of genetic imprinting data and microbiome compositional data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5223971	-
dc.identifier.uri	http://hdl.handle.net/10722/206673	-
dc.description.abstract	Genetic association studies are a useful tool for identifying the genetic components responsible for a disease. The phenomenon in which a gene is expressed in a parent-of-origin manner is referred to as genomic imprinting. When a gene is imprinted, the performance of a disease-association study is affected. This thesis presents statistical testing methods, developed specifically for nuclear family data, for genetic association studies that incorporate imprinting effects. For qualitative diseases with binary outcomes, a class of TDTI* type tests was proposed in a general two-stage framework, in which imprinting effects are examined prior to association testing. For quantitative trait loci, a class of Q-TDTI(c) type tests and another class of Q-MAX(c) type tests were proposed. The proposed testing methods flexibly accommodate families with missing parental genotypes and with multiple siblings. The performance of all the methods was verified by simulation studies, which found that the proposed methods improve the power to detect association in the presence of imprinting. The class of TDTI* tests was applied to data from a rheumatoid arthritis study, and the class of Q-TDTI(c) tests was applied to the Framingham Heart Study data. The human microbiome is the collection of the microbiota, together with their genomes and their habitats throughout the human body. The human microbiome is an inalienable part of our genetic landscape and contributes to our metabolic features. Current studies have also suggested a link between variation in the human microbiome and human diseases. With high-throughput DNA sequencing, the composition of the human microbiome can be characterized based on the relative abundance of bacterial taxa and the phylogenetic constraint. Such taxa data are often high-dimensional and overdispersed and contain an excessive number of zeros. 
Taking these characteristics of taxa data into account, this thesis presents statistical methods to identify associations between covariates or outcomes and the human microbiome composition. To assess the effects of environmental/biological covariates on microbiome composition, an additive logistic normal multinomial regression model was proposed, and a group ℓ1 penalized likelihood estimation method was developed to facilitate covariate selection and parameter estimation. To identify microbiome components associated with biological/clinical outcomes, a Bayesian hierarchical regression model with a spike-and-slab prior for variable selection was proposed, and a Markov chain Monte Carlo algorithm that combines a stochastic variable selection procedure with random-walk Metropolis-Hastings steps was developed for model estimation. Both methods were illustrated using simulations as well as a real human gut microbiome dataset from the Penn Gut Microbiome Project.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.subject.lcsh	Genomic imprinting - Statistical methods	-
dc.subject.lcsh	Body, Human - Microbiology - Statistical methods	-
dc.title	Some topics on statistical analysis of genetic imprinting data and microbiome compositional data	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b5223971	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Statistics and Actuarial Science	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b5223971	-
dc.identifier.mmsid	991037035019703414	-
