
Postgraduate thesis: Generalization analysis and regularization in over-parameterized models

Title: Generalization analysis and regularization in over-parameterized models
Authors: Meng, Xuran (孟徐然)
Issue Date: 2024
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Meng, X. [孟徐然]. (2024). Generalization analysis and regularization in over-parameterized models. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: We study the success of over-parameterized models in both regression and classification tasks. In the regression task, we uncover the phenomenon of multiple descent in random feature models, where the test error curve exhibits multiple descents as the number of model parameters increases. In the classification task, we theoretically establish the capability of two-layer ReLU convolutional neural networks to learn complex XOR data. We find that these networks can achieve the Bayes-optimal test accuracy when the data signal-to-noise ratio (SNR) is high. Through our theoretical investigations, we discover that benign overfitting occurs only when the data set has a high SNR. Models trained on low-SNR data consistently exhibit poor test performance, indicating harmful overfitting of the training data set. We also explore two regularization techniques that address harmful overfitting on low-SNR data sets for over-parameterized models. First, we investigate gradient regularization and its role during the training process. Our theoretical analysis reveals that gradient regularization effectively suppresses the memorization of noise within the model; consequently, models trained with gradient regularization exhibit improved signal learning compared to models trained without it. Second, we explore early stopping as a regularization technique. By observing the spectra of weight matrices during training, we identify deviations from the Marchenko–Pastur law, and we find that these deviations indicate either that the model has extracted sufficient training information or that training issues are emerging. Based on this, we propose a spectral criterion that guides early stopping during training. Overall, this thesis investigates the success of over-parameterized models in various learning tasks: we characterize the conditions under which these models perform well and study regularization techniques that mitigate harmful overfitting.

(Illustrative code sketches of the random feature setup, gradient regularization, and the spectral criterion follow this record.)
Degree: Doctor of Philosophy
Subject: Machine learning
Subject: Data mining
Dept/Program: Statistics and Actuarial Science
Persistent Identifier: http://hdl.handle.net/10722/345401
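The descent phenomena in random feature regression can be illustrated numerically. Below is a minimal NumPy sketch, not taken from the thesis: it fits min-norm least squares on random ReLU features of synthetic Gaussian data and prints test error as the feature count N crosses the interpolation threshold N ≈ n, where the error typically spikes before descending again (the simplest, double-descent instance of the phenomenon; the thesis's multiple-descent construction is more delicate). The data model and all parameter values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 20                                # training samples, input dimension

# Synthetic linear data with label noise (illustrative, not the thesis's setup)
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star + 0.5 * rng.standard_normal(n)
X_test = rng.standard_normal((1000, d))
y_test = X_test @ w_star

for N in [10, 50, 90, 100, 110, 200, 1000]:   # number of random features
    W = rng.standard_normal((d, N)) / np.sqrt(d)
    Phi = np.maximum(X @ W, 0.0)              # random ReLU features
    Phi_test = np.maximum(X_test @ W, 0.0)
    theta = np.linalg.pinv(Phi) @ y           # min-norm least squares fit
    err = np.mean((Phi_test @ theta - y_test) ** 2)
    print(f"N={N:5d}  test MSE={err:.3f}")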
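Gradient regularization penalizes the norm of the loss gradient alongside the loss itself. The sketch below is a hypothetical toy instance for a linear model with squared loss, where the penalty gradient has the closed form 2λH∇L(w) and no automatic differentiation is needed; the thesis's actual setting (two-layer ReLU CNNs) is more involved, and all constants here are assumptions.

import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 200                     # over-parameterized: more weights than samples
X = rng.standard_normal((n, d))
y = X[:, 0] + 0.3 * rng.standard_normal(n)   # one signal direction plus label noise

lam, lr = 0.1, 0.05                # penalty strength and step size (illustrative)
w = np.zeros(d)
H = X.T @ X / n                    # Hessian of L(w) = ||Xw - y||^2 / (2n)

for _ in range(500):
    g = X.T @ (X @ w - y) / n      # gradient of the unregularized loss
    # Penalized objective L(w) + lam * ||grad L(w)||^2; for this quadratic
    # loss, grad ||g||^2 = 2 H g, so the update is available in closed form.
    w -= lr * (g + 2 * lam * H @ g)

print(abs(w[0]), np.linalg.norm(w[1:]))      # signal weight vs. noise-fitting mass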
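The spectral early-stopping criterion rests on comparing the empirical spectrum of a weight matrix with the Marchenko–Pastur bulk predicted for i.i.d. entries. Here is a minimal sketch of such a check, assuming i.i.d. initialization and estimating the entry variance by the eigenvalue mean (which equals the variance under the MP law); the helper mp_outliers is hypothetical, not the thesis's criterion.

import numpy as np

def mp_outliers(W):
    # Count eigenvalues of W^T W / p beyond the Marchenko-Pastur upper edge,
    # for a (p, q) matrix W with p >= q and (approximately) i.i.d. entries.
    p, q = W.shape
    gamma = q / p
    evals = np.linalg.eigvalsh(W.T @ W / p)
    sigma2 = evals.mean()                       # crude entry-variance estimate
    upper = sigma2 * (1 + np.sqrt(gamma)) ** 2  # MP bulk upper edge
    return int(np.sum(evals > upper))

rng = np.random.default_rng(2)
W_noise = 0.1 * rng.standard_normal((512, 128))   # pure noise, e.g. at init
spike = 0.05 * np.outer(rng.standard_normal(512),
                        rng.standard_normal(128)) # a learned rank-one direction
print(mp_outliers(W_noise))                       # ~0: spectrum inside the bulk
print(mp_outliers(W_noise + spike))               # >=1: eigenvalue escapes the edge

In a training loop, one plausible use is to run such a check on each layer's weights every few epochs: persistent outliers suggest the weights have picked up structure beyond initialization noise, which is the kind of spectral deviation the proposed criterion turns into a stopping rule.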

 

Full item record (Dublin Core)

dc.contributor.author: Meng, Xuran
dc.contributor.author: 孟徐然
dc.date.accessioned: 2024-08-26T08:59:32Z
dc.date.available: 2024-08-26T08:59:32Z
dc.date.issued: 2024
dc.identifier.citation: Meng, X. [孟徐然]. (2024). Generalization analysis and regularization in over-parameterized models. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/345401
dc.description.abstract: (abstract as above)
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Data mining
dc.title: Generalization analysis and regularization in over-parameterized models
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Statistics and Actuarial Science
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2024
dc.identifier.mmsid: 991044843665703414
