On some extensions of generalized linear models with varying dispersion

Wu, Ka-yui, Karl.; 胡家銳.

DOI: 10.5353/th_b4819937

Appears in Collections:
- Statistics & Actuarial Science: Theses
- HKU Theses Online

postgraduate thesis: On some extensions of generalized linear models with varying dispersion

Title	On some extensions of generalized linear models with varying dispersion
Authors	Wu, Ka-yui, Karl.; 胡家銳.
Advisors	Li, WK
Issue Date	2012
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Wu, K. K. [胡家銳]. (2012). On some extensions of generalized linear models with varying dispersion. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b4819937
Abstract	When dealing with exponential family distributions, a constant dispersion is often assumed because it simplifies both model formulation and estimation. In practice, however, heteroscedasticity is a common feature of almost every empirical data set. In this dissertation, the dispersion parameter is no longer treated as constant across the entire sample. Instead, it is defined as the expected deviance between the individual response y_i and its expected value μ_i, and is expressed as a linear combination of covariates and their coefficients. This dispersion regression is an essential part of a double Generalized Linear Model, in which the mean and the dispersion are modelled in two interlinked submodels that are estimated pseudo-simultaneously. In other words, the deviance is a function of the response mean, which in turn depends on the dispersion. Because of this mutual dependency, the estimation algorithm is iterated until an update of one set of parameters no longer leads to a significant change in the other. If appropriate covariates are chosen, the model’s goodness of fit should improve because the dispersion is estimated from external information instead of being held constant. The advantage of dispersion modelling is shown by applying it to three different types of data: a) zero-inflated data, b) non-linear time series data, and c) clinical trials data. All these data follow distributions of the exponential family, for which the application of the Generalized Linear Model is justified, but they require certain extensions of the modelling methodology. For each of these examples, the dissertation shows the enhanced goodness of fit obtained when the constant dispersion assumption is dropped.
In fact, score and Wald tests formulated and carried out to test for varying dispersion provide evidence of heterogeneous dispersion in the data sets considered. Furthermore, although model formulation, asymptotic properties and computational effort are more extensive for the double models, the benefits in terms of improved fit and more efficient parameter estimates appear to justify the additional effort, not only for the types of data introduced here but also for empirical data analysis in general.
Degree	Doctor of Philosophy
Subject	Linear models (Statistics)
Dept/Program	Statistics and Actuarial Science
Persistent Identifier	http://hdl.handle.net/10722/167213
HKU Library Item ID	b4819937
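
The alternating estimation described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a normal response, an identity link for the mean and a log link for the dispersion (the abstract does not state the links used), and the names `fit_double_glm`, `X`, `Z`, `beta` and `gamma` are hypothetical. The mean submodel is fitted by weighted least squares with weights 1/phi_i, and the dispersion submodel by one Fisher scoring step of a gamma regression on the unit deviances d_i = (y_i - mu_i)^2.

```python
import numpy as np


def fit_double_glm(y, X, Z, max_iter=100, tol=1e-8):
    """Alternate between a mean and a dispersion submodel (illustrative sketch).

    Assumed model, chosen for illustration only:
        y_i ~ N(mu_i, phi_i),  mu_i = X_i @ beta,  log(phi_i) = Z_i @ gamma
    Returns (beta, gamma, number_of_iterations).
    """
    # Starting values: ordinary least squares for the mean, then a
    # regression of the log squared residuals for the dispersion.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    d = (y - X @ beta) ** 2
    gamma = np.linalg.lstsq(Z, np.log(d + 1e-12), rcond=None)[0]

    for iteration in range(1, max_iter + 1):
        eta = Z @ gamma
        phi = np.exp(eta)

        # Mean submodel: weighted least squares with weights 1 / phi_i,
        # implemented by scaling each row with 1 / sqrt(phi_i).
        w = 1.0 / np.sqrt(phi)
        beta_new = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]

        # Unit deviances of the normal distribution given the new mean.
        d = (y - X @ beta_new) ** 2

        # Dispersion submodel: one Fisher scoring step of a gamma GLM with
        # log link, where E[d_i] = phi_i. The working weights are constant,
        # so the step is an unweighted regression on the working response.
        working = eta + (d - phi) / phi
        gamma_new = np.linalg.lstsq(Z, working, rcond=None)[0]

        # Stop once neither parameter vector changes noticeably.
        change = max(np.max(np.abs(beta_new - beta)),
                     np.max(np.abs(gamma_new - gamma)))
        beta, gamma = beta_new, gamma_new
        if change < tol:
            break

    return beta, gamma, iteration


# Simulated usage example with made-up coefficients.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, n)])
Z = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, n)])
beta_true = np.array([1.0, 2.0])
gamma_true = np.array([-1.0, 1.5])
y = X @ beta_true + rng.normal(0.0, np.sqrt(np.exp(Z @ gamma_true)))

beta_hat, gamma_hat, n_iter = fit_double_glm(y, X, Z)
print(beta_hat, gamma_hat, n_iter)
```

The loop mirrors the mutual dependency in the abstract: the dispersion estimates set the weights of the mean fit, and the fitted means determine the deviances that the dispersion fit is based on. Other exponential family responses would need their own unit deviance and variance function in place of the squared residual.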

DC Field	Value	Language
dc.contributor.advisor	Li, WK	-
dc.contributor.author	Wu, Ka-yui, Karl.	-
dc.contributor.author	胡家銳.	-
dc.date.issued	2012	-
dc.identifier.citation	Wu, K. K. [胡家銳]. (2012). On some extensions of generalized linear models with varying dispersion. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b4819937	-
dc.identifier.uri	http://hdl.handle.net/10722/167213	-
dc.description.abstract	When dealing with exponential family distributions, a constant dispersion is often assumed because it simplifies both model formulation and estimation. In practice, however, heteroscedasticity is a common feature of almost every empirical data set. In this dissertation, the dispersion parameter is no longer treated as constant across the entire sample. Instead, it is defined as the expected deviance between the individual response y_i and its expected value μ_i, and is expressed as a linear combination of covariates and their coefficients. This dispersion regression is an essential part of a double Generalized Linear Model, in which the mean and the dispersion are modelled in two interlinked submodels that are estimated pseudo-simultaneously. In other words, the deviance is a function of the response mean, which in turn depends on the dispersion. Because of this mutual dependency, the estimation algorithm is iterated until an update of one set of parameters no longer leads to a significant change in the other. If appropriate covariates are chosen, the model’s goodness of fit should improve because the dispersion is estimated from external information instead of being held constant. The advantage of dispersion modelling is shown by applying it to three different types of data: a) zero-inflated data, b) non-linear time series data, and c) clinical trials data. All these data follow distributions of the exponential family, for which the application of the Generalized Linear Model is justified, but they require certain extensions of the modelling methodology. For each of these examples, the dissertation shows the enhanced goodness of fit obtained when the constant dispersion assumption is dropped.
In fact, score and Wald tests formulated and carried out to test for varying dispersion provide evidence of heterogeneous dispersion in the data sets considered. Furthermore, although model formulation, asymptotic properties and computational effort are more extensive for the double models, the benefits in terms of improved fit and more efficient parameter estimates appear to justify the additional effort, not only for the types of data introduced here but also for empirical data analysis in general.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.source.uri	http://hub.hku.hk/bib/B48199370	-
dc.subject.lcsh	Linear models (Statistics)	-
dc.title	On some extensions of generalized linear models with varying dispersion	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b4819937	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Statistics and Actuarial Science	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b4819937	-
dc.date.hkucongregation	2012	-
dc.identifier.mmsid	991033761429703414	-
