Appears in Collections:
postgraduate thesis: Neural networks meet applied mathematics : GANs, PINNs, and transformers
Title | Neural networks meet applied mathematics : GANs, PINNs, and transformers |
---|---|
Authors | Gao, Yihang (高伊杭) |
Issue Date | 2024 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Gao, Y. [高伊杭]. (2024). Neural networks meet applied mathematics : GANs, PINNs, and transformers. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Inspired by the appealing applications and remarkable performance of deep neural networks, specifically the high-quality image generation of generative adversarial networks (GANs), physics-informed neural networks (PINNs) as solvers of partial differential equations (PDEs), and the algorithmic expressiveness of transformers in in-context learning (ICL), there has been growing interest in mathematically understanding neural networks. This thesis explores the interplay between neural networks and applied mathematics, contributing to the mathematical understanding of deep learning models through the following examples.
Chapter 2 studies the generalization of training Wasserstein generative adversarial networks (WGANs), a well-behaved variant of GANs. Beyond providing evidence for the success of WGAN training, the derived generalization error bound yields theoretical insights that guide the practical implementation of WGANs. Specifically, WGANs place a higher capacity requirement on discriminators than on generators. More importantly, overly deep and wide (high-capacity) generators may perform worse than low-capacity generators when discriminators are insufficiently strong. Chapter 3 improves the training of GANs by advancing the optimization method: the thesis develops a Hessian-based variant of the Follow-the-Ridge algorithm that exploits second-order information for acceleration.
Chapters 4-6 focus on analysing PINNs, deep learning tools for solving PDE-related problems. The empirical loss of PINNs has a highly non-convex and nonlinear landscape that is difficult to optimize from the viewpoint of classical optimization theory; nevertheless, (stochastic) gradient descent-like optimizers are widely adopted in practice. Our analysis explains this phenomenon from the overparameterization perspective and shows that vanilla gradient descent can find the global optima of two-layer PINNs. In addition, we design a transfer learning method for PINNs based on singular value decomposition that solves a class of PDEs with lower training cost and storage. Moreover, the thesis proposes physics-informed WGANs for uncertainty quantification in solutions of PDEs: WGANs measure the uncertainty of the solutions on the boundary, and PINNs propagate it into the interior of the domain in accordance with the governing physical laws.
Chapter 7 investigates the expressiveness of transformers in in-context learning and proposes an algorithm-structure-regularized transformer. Motivated by the recently proposed looped transformer and aiming to endow transformers with algorithmic capabilities, we design a novel transformer block, dubbed the Algorithm Transformer (AlgoFormer). Compared with the standard transformer and the vanilla looped transformer, the proposed AlgoFormer can represent certain algorithms more efficiently.
Chapter 8 discusses the cross-interactions between deep learning and applied mathematics.
In summary, this thesis contributes to a comprehensive understanding of several neural networks (e.g., GANs, PINNs, and transformers) from mathematical perspectives (e.g., generalization, expressiveness, and optimization), and to adapting neural networks for challenging scientific computing problems (e.g., uncertainty quantification in solutions of PDEs). The theoretical analysis not only furnishes evidence supporting the efficacy of neural networks but also offers valuable practical guidance for their implementation. Furthermore, the novel methods and algorithms developed in this thesis aim to enhance the performance of neural networks in specific domains. |
Degree | Doctor of Philosophy |
Subject | Neural networks (Computer science) |
Dept/Program | Mathematics |
Persistent Identifier | http://hdl.handle.net/10722/345407 |
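The abstract refers to the WGAN training objective and to the physics-informed empirical loss of PINNs without stating either. For orientation only, the standard textbook forms of these two objectives are sketched below; the thesis may analyse variants, and the notation here is assumed rather than taken from this record.

```latex
% Standard WGAN minimax objective (generator G, 1-Lipschitz critic D);
% assumed form, not quoted from the thesis.
\min_{G} \max_{\|D\|_{\mathrm{Lip}} \le 1}
  \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[D(x)]
  - \mathbb{E}_{z \sim p_{z}}[D(G(z))]

% Standard PINN empirical loss for a PDE \mathcal{N}[u] = f on \Omega with
% boundary condition \mathcal{B}[u] = g on \partial\Omega; the weight \lambda
% and the collocation points x_i, y_j are illustrative.
\mathcal{L}(\theta) =
  \frac{1}{n} \sum_{i=1}^{n} \bigl( \mathcal{N}[u_{\theta}](x_i) - f(x_i) \bigr)^{2}
  + \frac{\lambda}{m} \sum_{j=1}^{m} \bigl( \mathcal{B}[u_{\theta}](y_j) - g(y_j) \bigr)^{2}
```

Similarly, the claim that vanilla gradient descent can find global optima of two-layer PINNs concerns training loops of roughly the following shape. This is a minimal sketch assuming a 1D Poisson problem with a manufactured solution; the architecture, learning rate, and sampling are illustrative choices, not the thesis's actual setup.

```python
# Minimal sketch (not from the thesis): a two-layer PINN for -u''(x) = f(x)
# on (0, 1) with u(0) = u(1) = 0, trained by vanilla gradient descent.
import torch

torch.manual_seed(0)

# Two-layer network u_theta: one hidden layer with tanh activation.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 64),
    torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

def f(x):
    # Manufactured source term so that u(x) = sin(pi x) is the exact solution.
    return (torch.pi ** 2) * torch.sin(torch.pi * x)

x_int = torch.rand(128, 1)             # interior collocation points
x_bdy = torch.tensor([[0.0], [1.0]])   # boundary points

opt = torch.optim.SGD(model.parameters(), lr=1e-3)  # plain gradient descent
for step in range(5000):
    opt.zero_grad()
    x = x_int.clone().requires_grad_(True)
    u = model(x)
    # First and second derivatives of u_theta via automatic differentiation.
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    loss_pde = ((-d2u - f(x)) ** 2).mean()   # PDE residual on the interior
    loss_bc = (model(x_bdy) ** 2).mean()     # boundary residual (u = 0)
    (loss_pde + loss_bc).backward()
    opt.step()
```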
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Gao, Yihang | - |
dc.contributor.author | 高伊杭 | - |
dc.date.accessioned | 2024-08-26T08:59:35Z | - |
dc.date.available | 2024-08-26T08:59:35Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Gao, Y. [高伊杭]. (2024). Neural networks meet applied mathematics : GANs, PINNs, and transformers. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/345407 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Neural networks (Computer science) | - |
dc.title | Neural networks meet applied mathematics : GANs, PINNs, and transformers | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Mathematics | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2024 | - |
dc.identifier.mmsid | 991044843668003414 | - |