
postgraduate thesis: Bayesian censoring approach to rounded zeros in compositional data

Title: Bayesian censoring approach to rounded zeros in compositional data
Authors: Leung, Tak-ching (梁德貞)
Advisor(s): Bacon-Shone, J
Issue Date: 2017
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Leung, T. [梁德貞]. (2017). Bayesian censoring approach to rounded zeros in compositional data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: The logratio transformation is commonly used for analyzing compositional data. However, logratio analysis is not possible when the data contain any zeros, and zeros are common in data on proportions, concentrations, etc., when a value is small enough to fall below the detection limit, i.e. when the data are left-censored. Various methods for handling zeros have been proposed, such as combining data and replacing the zeros with predicted non-zero values, including additive and multiplicative replacements. The replacement value can be fixed or variable, and can be obtained by various iterative means such as a modified EM algorithm. In this thesis, Bayesian analysis assuming left, right and interval censoring on the original scale of compositional data is proposed and applied to three different data sets and to simulated data. This approach has the advantages of reflecting the data recording process, which necessarily involves censoring, performing well on the simulated data, and yielding easy interpretation of results. The new approach is demonstrated by means of models estimated using the standard Markov Chain Monte Carlo (MCMC) package WinBUGS. Pioneering models were also estimated using the new Hamiltonian Monte Carlo (HMC) package RStan. Unlike the models in WinBUGS and all previous approaches in the literature, the RStan models are based on censoring over the correct region in the simplex, rather than an approximate censoring region based on the log-ratio space. Models from both tools showed reasonable and consistent results. Their advantages and limitations are compared and discussed. Apart from zeros, which are left-censored, both methods can also handle right-censored and interval-censored data. The RStan models are particularly promising. They are built on a simple architecture that uses numerical approximation of the integral of probabilities over the censored region. This architecture is comparatively fast and flexible, and can be modified to handle the approximation in different ways.
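The ideas named in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's WinBUGS or RStan models: it assumes a two-part composition whose first part has a logit-normal distribution, and the function names, the replacement value `delta`, and the parameters `mu` and `sigma` are all hypothetical. It shows why a zero breaks the logratio transform, what a multiplicative replacement does, and how the likelihood contribution of a left-censored value can be approximated by numerically integrating the density over the censored region on the original scale.

```python
import math


def alr(x):
    """Additive log-ratio transform: log(x_i / x_D) for every part except the last."""
    return [math.log(xi / x[-1]) for xi in x[:-1]]


def multiplicative_replace(x, delta):
    """Replace each zero by delta and shrink the non-zero parts so the total stays 1."""
    n_zero = sum(1 for xi in x if xi == 0)
    shrink = 1.0 - n_zero * delta
    return [delta if xi == 0 else xi * shrink for xi in x]


def logit(p):
    return math.log(p / (1.0 - p))


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_prob_exact(d, mu, sigma):
    """P(x1 < d) when logit(x1) ~ Normal(mu, sigma); d is the detection limit."""
    return normal_cdf((logit(d) - mu) / sigma)


def censored_prob_numeric(d, mu, sigma, n=20000):
    """Midpoint-rule integral of the density of x1 over the censored region (0, d)."""
    h = d / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        z = (logit(x) - mu) / sigma
        # Normal density of logit(x) times the Jacobian 1 / (x * (1 - x)).
        density = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
        density /= x * (1.0 - x)
        total += density * h
    return total


if __name__ == "__main__":
    composition = [0.0, 0.3, 0.7]  # first part recorded as a rounded zero
    try:
        alr(composition)
    except ValueError as err:
        print("alr fails on a zero part:", err)
    print("after replacement:", multiplicative_replace(composition, 0.005))
    print("exact censored probability:", censored_prob_exact(0.01, -3.0, 1.0))
    print("numeric censored probability:", censored_prob_numeric(0.01, -3.0, 1.0))
```

In this one-dimensional case the integral has a closed form, so the numerical version is only a check. The numerical route matters when the censored region is a subset of a higher-dimensional simplex, where no closed form is assumed to be available.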
Degree: Doctor of Philosophy
Subject: Multivariate analysis; Mathematical statistics
Dept/Program: Social Sciences
Persistent Identifier: http://hdl.handle.net/10722/244335

 

DC Field | Value | Language
dc.contributor.advisor | Bacon-Shone, J | -
dc.contributor.author | Leung, Tak-ching | -
dc.contributor.author | 梁德貞 | -
dc.date.accessioned | 2017-09-14T04:42:21Z | -
dc.date.available | 2017-09-14T04:42:21Z | -
dc.date.issued | 2017 | -
dc.identifier.citation | Leung, T. [梁德貞]. (2017). Bayesian censoring approach to rounded zeros in compositional data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/244335 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Multivariate analysis | -
dc.subject.lcsh | Mathematical statistics | -
dc.title | Bayesian censoring approach to rounded zeros in compositional data | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Social Sciences | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2017 | -
dc.identifier.mmsid | 991043953697303414 | -
