Appears in Collections:
postgraduate thesis: Neural networks meet applied mathematics : GANs, PINNs, and transformers
Title | Neural networks meet applied mathematics : GANs, PINNs, and transformers |
---|---|
Authors | Gao, Yihang (高伊杭) |
Issue Date | 2024 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Gao, Y. [高伊杭]. (2024). Neural networks meet applied mathematics : GANs, PINNs, and transformers. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Inspired by the appealing applications and remarkable performance of deep neural networks, specifically the high-quality image generation of generative adversarial networks (GANs), physics-informed neural networks (PINNs) as solvers of partial differential equations (PDEs), and the algorithmic expressiveness of transformers in in-context learning (ICL), there has been growing interest in mathematically understanding neural networks. This thesis explores the interplay between neural networks and applied mathematics, contributing to the mathematical understanding of deep learning models through the following examples.
Chapter 2 studies the generalization of training Wasserstein generative adversarial networks (WGANs), a well-behaved variant of GANs. Beyond providing evidence for the success of WGAN training, the derived generalization error bound yields theoretical insights that guide the practical implementation of WGANs. Specifically, WGANs place a higher capacity requirement on discriminators than on generators. More importantly, overly deep and wide (high-capacity) generators may perform worse than low-capacity generators when discriminators are insufficiently strong. Chapter 3 improves the training of GANs by advancing the optimization method: the thesis develops a Hessian-based variant of the Follow-the-Ridge algorithm that exploits second-order information for acceleration.
Chapters 4-6 focus on analysing PINNs, deep learning tools for solving PDE-related problems. The empirical loss of PINNs has a highly non-convex and nonlinear landscape that is difficult to optimize from the viewpoint of classical optimization theory; nevertheless, (stochastic) gradient descent-like optimizers are widely adopted in practice. Our analysis explains this phenomenon from the overparameterization perspective and shows that vanilla gradient descent can find the global optima of two-layer PINNs. In addition, we design a transfer learning method for PINNs based on singular value decomposition that solves a class of PDEs with lower training cost and storage. Moreover, the thesis proposes physics-informed WGANs for uncertainty quantification in solutions of PDEs: WGANs measure the uncertainty of the solutions on the boundary, and PINNs propagate it into the interior of the domain in accordance with the governing physical laws.
Chapter 7 investigates the expressiveness of transformers in in-context learning and proposes an algorithm-structure-regularized transformer. Motivated by the recently proposed looped transformer and aiming to endow transformers with algorithmic capabilities, we design a novel transformer block, dubbed the Algorithm Transformer (AlgoFormer). Compared with the standard transformer and the vanilla looped transformer, the proposed AlgoFormer can represent certain algorithms more efficiently.
Chapter 8 discusses the cross-interactions between deep learning and applied mathematics.
In summary, this thesis contributes to a comprehensive understanding of several neural networks (e.g., GANs, PINNs, and transformers) from mathematical perspectives (e.g., generalization, expressiveness, and optimization), and to adapting neural networks for challenging scientific computing problems (e.g., uncertainty quantification in solutions of PDEs). The theoretical analysis not only furnishes evidence supporting the efficacy of neural networks but also offers valuable practical guidance for their implementation. Furthermore, the novel methods and algorithms developed in this thesis aim to enhance the performance of neural networks in specific domains. |
Degree | Doctor of Philosophy |
Subject | Neural networks (Computer science) |
Dept/Program | Mathematics |
Persistent Identifier | http://hdl.handle.net/10722/345407 |
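The abstract refers to the WGAN training objective and to the physics-informed empirical loss of PINNs without stating either. For orientation only, the standard textbook forms of these two objectives are sketched below; the thesis may analyse variants, and the notation here is assumed rather than taken from this record.

```latex
% Standard WGAN minimax objective (generator G, 1-Lipschitz critic D);
% assumed form, not quoted from the thesis.
\min_{G} \max_{\|D\|_{\mathrm{Lip}} \le 1}
  \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[D(x)]
  - \mathbb{E}_{z \sim p_{z}}[D(G(z))]

% Standard PINN empirical loss for a PDE \mathcal{N}[u] = f on \Omega with
% boundary condition \mathcal{B}[u] = g on \partial\Omega; the weight \lambda
% and the collocation points x_i, y_j are illustrative.
\mathcal{L}(\theta) =
  \frac{1}{n} \sum_{i=1}^{n} \bigl( \mathcal{N}[u_{\theta}](x_i) - f(x_i) \bigr)^{2}
  + \frac{\lambda}{m} \sum_{j=1}^{m} \bigl( \mathcal{B}[u_{\theta}](y_j) - g(y_j) \bigr)^{2}
```

Similarly, the claim that vanilla gradient descent can find global optima of two-layer PINNs concerns training loops of roughly the following shape. This is a minimal sketch assuming a 1D Poisson problem with a manufactured solution; the architecture, learning rate, and sampling are illustrative choices, not the thesis's actual setup.

```python
# Minimal sketch (not from the thesis): a two-layer PINN for -u''(x) = f(x)
# on (0, 1) with u(0) = u(1) = 0, trained by vanilla gradient descent.
import torch

torch.manual_seed(0)

# Two-layer network u_theta: one hidden layer with tanh activation.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 64),
    torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

def f(x):
    # Manufactured source term so that u(x) = sin(pi x) is the exact solution.
    return (torch.pi ** 2) * torch.sin(torch.pi * x)

x_int = torch.rand(128, 1)             # interior collocation points
x_bdy = torch.tensor([[0.0], [1.0]])   # boundary points

opt = torch.optim.SGD(model.parameters(), lr=1e-3)  # plain gradient descent
for step in range(5000):
    opt.zero_grad()
    x = x_int.clone().requires_grad_(True)
    u = model(x)
    # First and second derivatives of u_theta via automatic differentiation.
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    loss_pde = ((-d2u - f(x)) ** 2).mean()   # PDE residual on the interior
    loss_bc = (model(x_bdy) ** 2).mean()     # boundary residual (u = 0)
    (loss_pde + loss_bc).backward()
    opt.step()
```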
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Gao, Yihang | - |
dc.contributor.author | 高伊杭 | - |
dc.date.accessioned | 2024-08-26T08:59:35Z | - |
dc.date.available | 2024-08-26T08:59:35Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Gao, Y. [高伊杭]. (2024). Neural networks meet applied mathematics : GANs, PINNs, and transformers. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/345407 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Neural networks (Computer science) | - |
dc.title | Neural networks meet applied mathematics : GANs, PINNs, and transformers | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Mathematics | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2024 | - |
dc.identifier.mmsid | 991044843668003414 | - |