Intrinsically interpretable machine learning models and automated hyperparameter optimization

Yang, Zebin; 杨泽斌

Appears in Collections:
- HKU Theses Online
- Statistics & Actuarial Science: Theses

postgraduate thesis: Intrinsically interpretable machine learning models and automated hyperparameter optimization

Title	Intrinsically interpretable machine learning models and automated hyperparameter optimization
Authors	Yang, Zebin 杨泽斌
Advisors	Yin, G
Issue Date	2021
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Yang, Z. [杨泽斌]. (2021). Intrinsically interpretable machine learning models and automated hyperparameter optimization. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	Prediction accuracy and model interpretability are the two most important objectives when developing machine learning algorithms. Neural networks and ensemble trees are known for good prediction performance but suffer from a lack of model interpretability. In this thesis, three intrinsically interpretable machine learning models are proposed: an enhanced explainable neural network (ExNN), an explainable neural network based on generalized additive models with structured interactions (GAMI-Net), and a single-index model tree (SIMTree). All three models are validated through extensive experiments, which show their superior balance of prediction performance and model interpretability. Moreover, a sequential uniform design (SeqUD) approach is proposed for hyperparameter optimization, which can help a machine learning model achieve the maximum possible predictive performance.
In ExNN, the explainability of neural networks is enhanced through the following architecture constraints: a) sparse additive subnetworks; b) projection pursuit with an orthogonality constraint; c) smooth function approximation. These constraints lead to a superior balance between prediction performance and model interpretability. The model parameters are estimated simultaneously by a modified mini-batch gradient descent method, which uses the backpropagation algorithm to calculate the derivatives and the Cayley transform to preserve the orthogonality of the projections.
GAMI-Net is a disentangled feedforward network with multiple additive subnetworks. Each subnetwork consists of multiple hidden layers and is designed to capture one main effect or one pairwise interaction. Three interpretability aspects are further considered: a) sparsity, to select the most significant effects for parsimonious representations; b) heredity, so that a pairwise interaction can be included only when at least one of its parent main effects exists; c) marginal clarity, to make main effects and pairwise interactions mutually distinguishable. An adaptive training algorithm is developed, in which main effects are trained first and pairwise interactions are then fitted to the residuals.
SIMTree is developed for heterogeneous data modeling. It adopts the recursive partitioning strategy, and each data segment is modeled by a single-index model (SIM), a flexible extension of linear regression with non-parametric link functions. The proposed SIMTree has two major advantages: a) with only a few leaf nodes, it can achieve predictive performance competitive with complicated black-box models; b) the SIMs fitted on each local data segment are intrinsically interpretable. To keep the computational burden affordable, an effective training algorithm is proposed, enabled by the efficient use of Stein's lemma and several acceleration strategies in the tree construction algorithm.
Finally, this thesis reformulates hyperparameter optimization as a computer experiment and proposes a novel SeqUD strategy with three advantages: a) the hyperparameter space is adaptively explored with evenly spread design points, without the need for expensive meta-modeling and acquisition optimization; b) the design points are generated sequentially, batch by batch, with parallel processing support; c) a new augmented uniform design algorithm is developed for the efficient real-time generation of follow-up design points. The superior performance of SeqUD is validated on both global optimization tasks and real applications.
Degree	Doctor of Philosophy
Subject	Machine learning; Mathematical optimization
Dept/Program	Statistics and Actuarial Science
Persistent Identifier	http://hdl.handle.net/10722/308636
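
The ExNN part of the abstract mentions a Cayley transform for preserving projection orthogonality. The NumPy sketch below illustrates one such update under stated assumptions; it is not the thesis's implementation. The function name cayley_step, the step size lr, and the random matrix G standing in for a backpropagated gradient are all hypothetical. The update form (a skew-symmetric matrix built from the gradient, followed by a Cayley transform) is the standard construction for gradient steps under an orthogonality constraint.

```python
import numpy as np


def cayley_step(W, G, lr=0.1):
    """One gradient-style update that keeps the columns of W orthonormal.

    W  : (p, k) projection matrix with W.T @ W = I (assumed on entry)
    G  : (p, k) gradient of the loss with respect to W (hypothetical input)
    lr : step size (hypothetical default)
    """
    p = W.shape[0]
    # A is skew-symmetric by construction (A.T == -A).
    A = G @ W.T - W @ G.T
    I = np.eye(p)
    # Cayley transform: (I + lr/2 A)^{-1} (I - lr/2 A) is an orthogonal
    # matrix whenever A is skew-symmetric, so the updated W stays on the
    # set of matrices with orthonormal columns.
    return np.linalg.solve(I + 0.5 * lr * A, (I - 0.5 * lr * A) @ W)


# Illustration with random data (not data from the thesis).
rng = np.random.default_rng(0)
W, _ = np.linalg.qr(rng.normal(size=(5, 2)))  # orthonormal starting point
G = rng.normal(size=(5, 2))                   # stand-in for a real gradient
W_new = cayley_step(W, G, lr=0.1)

# Largest deviation of W_new.T @ W_new from the identity (floating-point small).
print(np.abs(W_new.T @ W_new - np.eye(2)).max())
```

Solving the linear system instead of forming an explicit inverse is a design choice for numerical stability; for large p, a low-rank reformulation of the same update would usually be preferred.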

DC Field	Value	Language
dc.contributor.advisor	Yin, G	-
dc.contributor.author	Yang, Zebin	-
dc.contributor.author	杨泽斌	-
dc.date.accessioned	2021-12-06T01:04:02Z	-
dc.date.available	2021-12-06T01:04:02Z	-
dc.date.issued	2021	-
dc.identifier.citation	Yang, Z. [杨泽斌]. (2021). Intrinsically interpretable machine learning models and automated hyperparameter optimization. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/308636	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Machine learning	-
dc.subject.lcsh	Mathematical optimization	-
dc.title	Intrinsically interpretable machine learning models and automated hyperparameter optimization	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Statistics and Actuarial Science	-
dc.description.nature	published_or_final_version	-
dc.date.hkucongregation	2021	-
dc.identifier.mmsid	991044448916903414	-
