
Postgraduate thesis: Quantile index regression and time series applications in machine learning

Title: Quantile index regression and time series applications in machine learning
Authors: Zhao, Jingyu (趙婧妤)
Issue Date: 2022
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Zhao, J. [趙婧妤]. (2022). Quantile index regression and time series applications in machine learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Modern technologies have enabled the collection of massive, complex, and high-dimensional data, which calls for more powerful data analysis tools. One direction is to extend the flexibility and applicability of traditional statistical models; the other is to resort to neural network models, which are more flexible but less interpretable. Statistical theory can be applied to improve our understanding of these black-box models and to propose theory-inspired improvements. This thesis studies a new high-dimensional and nonlinear quantile regression model and two neural network architectures inspired by time series analysis. First, a novel partially parametric quantile regression model, called quantile index regression, is proposed. Quantile regression is a class of flexible regression models that can be used to study the structure of the response at any quantile level. However, it is usually difficult for quantile regression to make inference at levels with sparse observations, such as high and low quantiles, and the situation becomes more serious for high-dimensional data. The proposed model circumvents the data-sparsity problem by conducting the estimation procedure at levels with rich observations and then extrapolating the fitted structures to levels with sparse observations. Asymptotic properties are derived for the case with fixed dimension, and non-asymptotic error bounds are established for high-dimensional data. Simulation experiments and a real-data example demonstrate the advantages of the proposed model over existing methods. Second, this thesis studies whether recurrent neural networks (RNNs) can learn long-term dependence. Among RNN variants, the LSTM network was proposed to overcome the difficulty of learning long-term dependence and has achieved significant advances in applications. However, this thesis proves that, from a statistical perspective, neither the RNN nor the LSTM has long memory. A new definition of long memory networks is therefore introduced, which requires the model weights to decay at a polynomial rate. To verify the theoretical results, the RNN and LSTM are converted into long memory networks with minimal modifications, and their superiority in modeling long-term dependence is illustrated on various datasets. Lastly, this thesis applies time series modeling techniques to deep convolutional neural networks (CNNs) and proposes a recurrent layer aggregation module for computer vision tasks. The concept of layer aggregation is introduced to describe how information from previous layers can be reused to better extract features at the current layer of a CNN. Exploiting the sequential structure of layers in a deep CNN, this thesis proposes a very lightweight module, called recurrent layer aggregation (RLA). The RLA module is compatible with many mainstream deep CNNs, including ResNets, Xception, and MobileNetV2, and its effectiveness is verified by extensive experiments on image classification, object detection, and instance segmentation tasks.
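To make the data-sparsity issue in the abstract concrete: standard quantile regression fits each level separately by minimizing the check (pinball) loss, so extreme levels with few observations are estimated poorly. The sketch below is illustrative only (it is classical linear quantile regression via subgradient descent, not the thesis's quantile index estimator; all function names are made up):

```python
import numpy as np

def check_loss(u, tau):
    """Pinball loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def fit_linear_quantile(X, y, tau, n_iter=5000, lr=0.05):
    """Fit y ~ X beta at quantile level tau by subgradient descent
    on the average check loss (illustrative, not the thesis's method)."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        u = y - X @ beta
        # subgradient of mean check loss w.r.t. beta
        grad = -X.T @ (tau - (u < 0)) / n
        beta -= lr * grad
    return beta

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(2000), rng.normal(size=2000)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=2000)
beta_med = fit_linear_quantile(X, y, tau=0.5)  # median regression
```

At tau = 0.5 plenty of observations inform the fit; at tau = 0.99 only a handful of points lie above the fitted line, which is the sparse-level problem the quantile index regression addresses by estimating at data-rich levels and extrapolating the fitted structure.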
Degree: Doctor of Philosophy
Subjects: Quantile regression; Time-series analysis; Machine learning
Dept/Program: Statistics and Actuarial Science
Persistent Identifier: http://hdl.handle.net/10722/325742


DC Field: Value
dc.contributor.author: Zhao, Jingyu
dc.contributor.author: 趙婧妤
dc.date.accessioned: 2023-03-02T16:32:28Z
dc.date.available: 2023-03-02T16:32:28Z
dc.date.issued: 2022
dc.identifier.citation: Zhao, J. [趙婧妤]. (2022). Quantile index regression and time series applications in machine learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/325742
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Quantile regression
dc.subject.lcsh: Time-series analysis
dc.subject.lcsh: Machine learning
dc.title: Quantile index regression and time series applications in machine learning
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Statistics and Actuarial Science
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2022
dc.identifier.mmsid: 991044649904103414
