On incomplete multinomial data modeling and interactive neural and statistical computing with GPU

Dong, Fanghu; 東方虎

DOI: 10.5353/th_991044040572603414

Appears in Collections:
- Statistics & Actuarial Science: Theses
- HKU Theses Online

postgraduate thesis: On incomplete multinomial data modeling and interactive neural and statistical computing with GPU

Title	On incomplete multinomial data modeling and interactive neural and statistical computing with GPU
Authors	Dong, Fanghu 東方虎
Advisors	Yin, G; Tian, G
Issue Date	2018
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Dong, F. [東方虎]. (2018). On incomplete multinomial data modeling and interactive neural and statistical computing with GPU. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	This thesis consists of five chapters describing three independent works. Chapter 1 introduces the thesis and provides some necessary background. Chapter 2 describes the first work, on an incomplete multinomial model for count data sampled on a random partition. It solves the estimation problem with an iterative algorithm, the weaver algorithm. The incomplete multinomial likelihood is parameterized by the complete-cell probabilities from the most refined partition. Its sufficient statistics include the variable-cell formation, observed as an indicator matrix, and all cell counts. With externally imposed structures on the cell formation process, it reduces to special models. The weaver algorithm enjoys the ascent property and has a linear rate of convergence. Its steps are short and amenable to a parallel implementation. It is significantly faster than the state-of-the-art EM/MM algorithm when fitting the Plackett-Luce model to a benchmark data set. The chapter also develops an analytic theory to investigate the conditions surrounding the global maximization of the likelihood. Simulation experiments are designed to show the performance of the model and algorithm in recovering very weak signals. Asymptotic properties of the estimator are derived and validated with simulations. The next two chapters both design and implement software that combines the spreadsheet's highly interactive user interface with the high computing performance of the Graphics Processing Unit (GPU). Chapter 3 presents a general design of an interactive neural network trainer. Its main features include the ability to specify different transfer functions, loss functions, and learning algorithms; facilities for stepping through the learning course and tracking user-defined variables; and a mechanism to specify constraints on the weights. It also includes a forward selection algorithm for optimizing the network architecture. Chapter 4 implements a dynamic-link library of GPU-executed matrix functions that can be called from the spreadsheet. It then demonstrates the implementation of interactive software for multivariate statistical analysis using the GPU matrix library. Chapter 5 makes concluding remarks and lists some potential directions for future work.
Degree	Doctor of Philosophy
Subject	Parameter estimation; Estimation theory; Sampling (Statistics); Iterative methods (Mathematics); Mathematical optimization; Graphics processing units
Dept/Program	Statistics and Actuarial Science
Persistent Identifier	http://hdl.handle.net/10722/261493

DC Field	Value	Language
dc.contributor.advisor	Yin, G	-
dc.contributor.advisor	Tian, G	-
dc.contributor.author	Dong, Fanghu	-
dc.contributor.author	東方虎	-
dc.date.accessioned	2018-09-20T06:43:56Z	-
dc.date.available	2018-09-20T06:43:56Z	-
dc.date.issued	2018	-
dc.identifier.citation	Dong, F. [東方虎]. (2018). On incomplete multinomial data modeling and interactive neural and statistical computing with GPU. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/261493	-
dc.description.abstract	This thesis consists of five chapters describing three independent works. Chapter 1 introduces the thesis and provides some necessary background. Chapter 2 describes the first work, on an incomplete multinomial model for count data sampled on a random partition. It solves the estimation problem with an iterative algorithm, the weaver algorithm. The incomplete multinomial likelihood is parameterized by the complete-cell probabilities from the most refined partition. Its sufficient statistics include the variable-cell formation, observed as an indicator matrix, and all cell counts. With externally imposed structures on the cell formation process, it reduces to special models. The weaver algorithm enjoys the ascent property and has a linear rate of convergence. Its steps are short and amenable to a parallel implementation. It is significantly faster than the state-of-the-art EM/MM algorithm when fitting the Plackett-Luce model to a benchmark data set. The chapter also develops an analytic theory to investigate the conditions surrounding the global maximization of the likelihood. Simulation experiments are designed to show the performance of the model and algorithm in recovering very weak signals. Asymptotic properties of the estimator are derived and validated with simulations. The next two chapters both design and implement software that combines the spreadsheet's highly interactive user interface with the high computing performance of the Graphics Processing Unit (GPU). Chapter 3 presents a general design of an interactive neural network trainer. Its main features include the ability to specify different transfer functions, loss functions, and learning algorithms; facilities for stepping through the learning course and tracking user-defined variables; and a mechanism to specify constraints on the weights. It also includes a forward selection algorithm for optimizing the network architecture. Chapter 4 implements a dynamic-link library of GPU-executed matrix functions that can be called from the spreadsheet. It then demonstrates the implementation of interactive software for multivariate statistical analysis using the GPU matrix library. Chapter 5 makes concluding remarks and lists some potential directions for future work.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Parameter estimation	-
dc.subject.lcsh	Estimation theory	-
dc.subject.lcsh	Sampling (Statistics)	-
dc.subject.lcsh	Iterative methods (Mathematics)	-
dc.subject.lcsh	Mathematical optimization	-
dc.subject.lcsh	Graphics processing units	-
dc.title	On incomplete multinomial data modeling and interactive neural and statistical computing with GPU	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Statistics and Actuarial Science	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_991044040572603414	-
dc.date.hkucongregation	2018	-
dc.identifier.mmsid	991044040572603414	-
