postgraduate thesis: A bootstrap procedure for unsupervised classification based on variable and observation resampling
Title | A bootstrap procedure for unsupervised classification based on variable and observation resampling |
---|---|
Authors | Liu, Junhong (刘俊宏) |
Issue Date | 2017 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Liu, J. [刘俊宏]. (2017). A bootstrap procedure for unsupervised classification based on variable and observation resampling. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | In this thesis, we first investigate the performance of bagged sparse 2-means clustering using a conventional bagging procedure. We next propose a cross bagging procedure, which bootstraps the variables and observations simultaneously, for sparse unsupervised classification problems. Formulated under a general framework, the procedure admits a variety of plugged-in classifiers, which can be determined case by case. Our procedure comprises two main stages. The main goal in the first stage is to collect class probabilities and variable importance. Another round of modified bagging is implemented in the follow-up stage, aiming to re-classify the original learning set so that the final partition is more accurate than that obtained by single-round bagging. We illustrate the potential applicability of cross bagging by studying its behavior when sparse 2-means clustering is taken to be the sparse unsupervised classifier and the k-nearest-neighbor classifier is taken as the supervised classifier. Several simulations are reported to demonstrate the performance of cross bagged sparse 2-means clustering, which is compared with that of sparse 2-means clustering and its bagged version. The results show that, under the aforementioned setting, cross bagged sparse 2-means clustering approximates the Bayes rule best when n is greater than or close to p. However, this conclusion may not hold if n is much smaller than p. |
Degree | Master of Philosophy |
Subject | Bootstrap (Statistics) Cluster analysis Mathematical statistics |
Dept/Program | Statistics and Actuarial Science |
Persistent Identifier | http://hdl.handle.net/10722/239967 |
HKU Library Item ID | b5846396 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Liu, Junhong | - |
dc.contributor.author | 刘俊宏 | - |
dc.date.accessioned | 2017-04-08T23:13:19Z | - |
dc.date.available | 2017-04-08T23:13:19Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | Liu, J. [刘俊宏]. (2017). A bootstrap procedure for unsupervised classification based on variable and observation resampling. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/239967 | - |
dc.description.abstract | In this thesis, we first investigate the performance of bagged sparse 2-means clustering using a conventional bagging procedure. We next propose a cross bagging procedure, which bootstraps the variables and observations simultaneously, for sparse unsupervised classification problems. Formulated under a general framework, the procedure admits a variety of plugged-in classifiers, which can be determined case by case. Our procedure comprises two main stages. The main goal in the first stage is to collect class probabilities and variable importance. Another round of modified bagging is implemented in the follow-up stage, aiming to re-classify the original learning set so that the final partition is more accurate than that obtained by single-round bagging. We illustrate the potential applicability of cross bagging by studying its behavior when sparse 2-means clustering is taken to be the sparse unsupervised classifier and the k-nearest-neighbor classifier is taken as the supervised classifier. Several simulations are reported to demonstrate the performance of cross bagged sparse 2-means clustering, which is compared with that of sparse 2-means clustering and its bagged version. The results show that, under the aforementioned setting, cross bagged sparse 2-means clustering approximates the Bayes rule best when n is greater than or close to p. However, this conclusion may not hold if n is much smaller than p. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Bootstrap (Statistics) | - |
dc.subject.lcsh | Cluster analysis | - |
dc.subject.lcsh | Mathematical statistics | - |
dc.title | A bootstrap procedure for unsupervised classification based on variable and observation resampling | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5846396 | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Statistics and Actuarial Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.mmsid | 991022013549703414 | - |
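
The abstract describes resampling observations and variables at the same time and collecting class probabilities across rounds. The sketch below illustrates only that general idea and is not the thesis's algorithm: it substitutes plain (non-sparse) 2-means for sparse 2-means, draws variables without replacement, resolves label switching by agreement with a reference partition, and omits variable importance and the second, supervised stage. All function names and parameters (`cross_bag_probs`, `n_rounds`, `n_vars`, `seed`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index (0 or 1) of the closer of two centres."""
    return 0 if sq_dist(point, centers[0]) <= sq_dist(point, centers[1]) else 1


def two_means(points, iters=20):
    """Plain Lloyd 2-means; a stand-in for sparse 2-means (assumption)."""
    # Deterministic start: the first point and the point farthest from it.
    c0 = points[0]
    c1 = max(points, key=lambda q: sq_dist(q, c0))
    centers = [list(c0), list(c1)]
    for _ in range(iters):
        labels = [nearest(q, centers) for q in points]
        for k in (0, 1):
            members = [q for q, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def cross_bag_probs(X, n_rounds=50, n_vars=None, seed=0):
    """Estimate, per observation, the share of rounds voting for class 1."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    if n_vars is None:
        n_vars = max(1, p // 2)  # illustrative default, not from the thesis
    # Reference partition on the full data, used only to align labels.
    ref_centers = two_means(X)
    ref = [nearest(x, ref_centers) for x in X]
    ones = [0] * n
    for _ in range(n_rounds):
        rows = [rng.randrange(n) for _ in range(n)]  # observations, with replacement
        cols = rng.sample(range(p), n_vars)          # variables, without replacement
        sub = [[X[i][j] for j in cols] for i in rows]
        centers = two_means(sub)
        # Classify every original observation in the sampled subspace.
        pred = [nearest([x[j] for j in cols], centers) for x in X]
        # Cluster labels are arbitrary, so flip them if that improves agreement.
        agree = sum(a == b for a, b in zip(pred, ref))
        if 2 * agree < n:
            pred = [1 - lab for lab in pred]
        for i, lab in enumerate(pred):
            ones[i] += lab
    return [count / n_rounds for count in ones]
```

Calling `cross_bag_probs(X)` on a list of numeric rows returns one value in [0, 1] per observation; thresholding at 0.5 gives a majority-vote partition. Aligning against a single reference partition is a simplification that can bias the votes toward that reference.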