postgraduate thesis: Statistical diagnostics for longitudinal data analysis : forward search of the GEE method
Title | Statistical diagnostics for longitudinal data analysis : forward search of the GEE method |
---|---|
Authors | Li, Nailin (李乃霖) |
Issue Date | 2015 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Li, N. [李乃霖]. (2015). Statistical diagnostics for longitudinal data analysis : forward search of the GEE method. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5570813 |
Abstract | In longitudinal data analysis, masking and swamping (MS) are two common effects that can cause severe problems. Successful identification of MS effects is essential to both outlier detection and longitudinal data analysis, because ignoring them can render the conclusions of an analysis meaningless or misleading.
In this thesis, the forward search of the generalized estimating equation (GEE) method (FSGEE) is proposed as a statistical method for analyzing and diagnosing longitudinal data sets. Starting from an outlier-free initial subset of the data selected using a robust method, FSGEE progresses to the next subset by expanding the current subset according to the distance of the observations from the GEE model fitted to the current subset.
Statistical diagnostics are monitored during the forward search, and forward plots are produced by plotting the diagnostics against the sizes of the forward search subsets. The MS effects can then be discovered by inspecting the forward plots of residuals. When the inclusion of an observation significantly affects the model and the diagnostics of other points, the observation is suspected to be an outlier. When necessary, examining the forward plots of various statistical diagnostics gives a deeper understanding of the observation, for example changes in the values of the coefficients after the observation is included, or changes in the diagnostics of other observations when the suspected outlier is removed from the data set. This understanding helps in deciding whether the observation is a true outlier or just a non-outlying observation with relatively high leverage. Through simulation studies and the analysis of seizure data and hormone data, the forward search of the GEE method is shown to provide a wealth of information for guiding both outlier detection and the identification of MS effects. |
Degree | Master of Philosophy |
Subject | Longitudinal method - Statistical methods |
Dept/Program | Statistics and Actuarial Science |
Persistent Identifier | http://hdl.handle.net/10722/219998 |
HKU Library Item ID | b5570813 |
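
The forward search outlined in the abstract can be illustrated with a minimal sketch. This is not the thesis's FSGEE implementation. It makes three assumptions: a one-predictor least-squares line stands in for the GEE fit (for an identity link with an independence working correlation, GEE point estimates coincide with ordinary least squares); the robust starting subset comes from a crude least-median-of-squares search over two-point lines; and the subset grows one observation at a time, ranked by squared residual. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import median


def fit_line(points):
    """Least-squares (intercept, slope) for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_x, slope


def squared_residuals(data, coef):
    """Squared distance of every observation from the fitted line."""
    intercept, slope = coef
    return [(y - intercept - slope * x) ** 2 for x, y in data]


def robust_start(data, size):
    """Assumed robust start: among all two-point lines, take the one with
    the smallest median squared residual, then keep the `size` closest points."""
    best = None
    for i, j in combinations(range(len(data)), 2):
        if data[i][0] == data[j][0]:
            continue  # two points with equal x do not define a slope
        coef = fit_line([data[i], data[j]])
        crit = median(squared_residuals(data, coef))
        if best is None or crit < best[0]:
            best = (crit, coef)
    res = squared_residuals(data, best[1])
    return sorted(range(len(data)), key=res.__getitem__)[:size]


def forward_search(data, initial_size):
    """Grow the fitting subset one observation at a time, recording the
    coefficients and all squared residuals at each subset size."""
    subset = robust_start(data, initial_size)
    history = []
    for m in range(initial_size, len(data) + 1):
        coef = fit_line([data[i] for i in subset])
        res = squared_residuals(data, coef)
        history.append({"size": m, "subset": sorted(subset),
                        "coef": coef, "sq_resid": res})
        # Next subset: the m + 1 observations closest to the current fit.
        subset = sorted(range(len(data)), key=res.__getitem__)[: m + 1]
    return history


# Toy data: ten points near y = 1 + 2x, plus two outliers at x = 10 and 11.
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, 0.2, -0.15, 0.1, -0.05]
data = [(x, 1 + 2 * x + e) for x, e in enumerate(noise)]
data += [(10, 5.0), (11, 4.0)]

history = forward_search(data, initial_size=4)
for step in history:
    print(step["size"], round(step["coef"][1], 3))  # subset size, fitted slope
```

Plotting each entry of `sq_resid` or `coef` against `size` gives the kind of forward plot the abstract describes. In this toy data set the fitted slope stays stable while the subset contains only the ten regular points and shifts sharply once the two outlying points enter at the end of the search.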
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Li, Nailin | - |
dc.contributor.author | 李乃霖 | - |
dc.date.accessioned | 2015-10-08T23:12:19Z | - |
dc.date.available | 2015-10-08T23:12:19Z | - |
dc.date.issued | 2015 | - |
dc.identifier.citation | Li, N. [李乃霖]. (2015). Statistical diagnostics for longitudinal data analysis : forward search of the GEE method. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5570813 | - |
dc.identifier.uri | http://hdl.handle.net/10722/219998 | - |
dc.description.abstract | In longitudinal data analysis, masking and swamping (MS) are two common effects that can cause severe problems. Successful identification of MS effects is essential to both outlier detection and longitudinal data analysis, because ignoring them can render the conclusions of an analysis meaningless or misleading. In this thesis, the forward search of the generalized estimating equation (GEE) method (FSGEE) is proposed as a statistical method for analyzing and diagnosing longitudinal data sets. Starting from an outlier-free initial subset of the data selected using a robust method, FSGEE progresses to the next subset by expanding the current subset according to the distance of the observations from the GEE model fitted to the current subset. Statistical diagnostics are monitored during the forward search, and forward plots are produced by plotting the diagnostics against the sizes of the forward search subsets. The MS effects can then be discovered by inspecting the forward plots of residuals. When the inclusion of an observation significantly affects the model and the diagnostics of other points, the observation is suspected to be an outlier. When necessary, examining the forward plots of various statistical diagnostics gives a deeper understanding of the observation, for example changes in the values of the coefficients after the observation is included, or changes in the diagnostics of other observations when the suspected outlier is removed from the data set. This understanding helps in deciding whether the observation is a true outlier or just a non-outlying observation with relatively high leverage. Through simulation studies and the analysis of seizure data and hormone data, the forward search of the GEE method is shown to provide a wealth of information for guiding both outlier detection and the identification of MS effects. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Longitudinal method - Statistical methods | - |
dc.title | Statistical diagnostics for longitudinal data analysis : forward search of the GEE method | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5570813 | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Statistics and Actuarial Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5570813 | - |
dc.identifier.mmsid | 991011109599703414 | - |