
Postgraduate thesis: Neural networks meet applied mathematics : GANs, PINNs, and transformers

Title: Neural networks meet applied mathematics : GANs, PINNs, and transformers
Authors: Gao, Yihang (高伊杭)
Issue Date: 2024
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Gao, Y. [高伊杭]. (2024). Neural networks meet applied mathematics : GANs, PINNs, and transformers. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Inspired by the appealing applications and remarkable performance of deep neural networks, specifically the high-quality image generation of generative adversarial networks (GANs), physics-informed neural networks (PINNs) as solvers of partial differential equations (PDEs), and the algorithmic expressiveness of transformers in in-context learning (ICL), there has been growing interest in mathematically understanding neural networks. This thesis explores topics at the intersection of neural networks and applied mathematics, contributing to the mathematical understanding of deep learning models through the following examples.

Chapter 2 studies the generalization of training Wasserstein generative adversarial networks (WGANs), a well-behaved variant of GANs. Besides providing convincing evidence for the success of WGAN training, the derived generalization error bound offers useful theoretical insights to guide the practical implementation of WGANs. Specifically, WGANs place a higher requirement on the capacity of discriminators than on that of generators. More importantly, overly deep and wide (high-capacity) generators may perform worse than low-capacity generators if the discriminators are insufficiently strong. Chapter 3 delves into improving the training of GANs by advancing the optimization methods: the thesis develops a Hessian-based Follow-the-Ridge algorithm, incorporating Hessian information for acceleration.

Chapters 4-6 focus on analysing PINNs, deep learning tools for solving PDE-related problems. The empirical loss of PINNs admits a highly non-convex and nonlinear landscape that is usually difficult to optimize from the viewpoint of classical optimization theory; nevertheless, (stochastic) gradient-descent-like optimizers are popular and widely adopted in practice. Our analysis explains this phenomenon from the overparameterization perspective and shows that vanilla gradient descent can find the global optima of two-layer PINNs. In addition, we design a transfer learning method for PINNs based on singular value decomposition, which solves a class of PDEs with lower training cost and storage. Moreover, the thesis proposes physics-informed WGANs for uncertainty quantification in solutions of PDEs: the method measures the uncertainty of the solutions on the boundary with WGANs and propagates it to the interior domain through PINNs, following the required physical laws.

Chapter 7 investigates the expressiveness of transformers in in-context learning and proposes an algorithm-structure-regularized transformer. To empower transformers with algorithmic capabilities, and motivated by the recently proposed looped transformer, we design a novel transformer block, dubbed the Algorithm Transformer (AlgoFormer). Compared with the standard transformer and the vanilla looped transformer, the proposed AlgoFormer can represent some algorithms more efficiently. Chapter 8 discusses the cross-interactions between deep learning and applied mathematics.

In summary, this thesis contributes to a comprehensive understanding of several neural networks (e.g., GANs, PINNs, and transformers) from mathematical perspectives (e.g., generalization, expressiveness, and optimization), and to adapting neural networks for solving challenging scientific computing problems (e.g., uncertainty quantification in solutions of PDEs).
Theoretical analysis in the thesis not only furnishes evidence supporting the efficacy of neural networks but also offers valuable practical guidance for their implementation. Furthermore, novel methods and algorithms developed within the scope of this thesis aim to enhance the overall performance of neural networks in specific domains.
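For reference, the WGAN formulation that recurs in the abstract (the generalization analysis of Chapter 2 and the physics-informed WGANs for uncertainty quantification) is, in its standard form, the minimax problem

\[
\min_{G}\; \max_{\|D\|_{\mathrm{Lip}} \le 1}\; \mathbb{E}_{x \sim \mathbb{P}_r}[D(x)] \;-\; \mathbb{E}_{z \sim \mathbb{P}_z}[D(G(z))],
\]

where the discriminator D ranges over 1-Lipschitz functions; the capacity requirements on discriminators versus generators discussed above concern how richly each side of this game is parameterized.

To make the PINN discussion concrete, the following is a minimal, illustrative sketch (not code from the thesis) of the physics-informed loss that such solvers minimize: a small network is trained by gradient descent so that the PDE residual vanishes at interior collocation points and the boundary conditions are satisfied. The 1D Poisson problem, the PyTorch implementation, the network width, and the optimizer settings are all assumptions made for illustration; they echo, but do not reproduce, the two-layer setting analysed in Chapters 4-6.

```python
# Illustrative PINN sketch (not from the thesis): solve u''(x) = f(x) on (0, 1)
# with u(0) = u(1) = 0 by minimizing a residual-plus-boundary loss.
import math
import torch

torch.manual_seed(0)

# A small network u_theta(x); one hidden layer, in the spirit of the two-layer
# PINNs analysed in the thesis (width and activation chosen arbitrarily here).
model = torch.nn.Sequential(
    torch.nn.Linear(1, 64),
    torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

def f(x):
    # Hypothetical right-hand side chosen so the exact solution is sin(pi * x).
    return -(math.pi ** 2) * torch.sin(math.pi * x)

def pinn_loss(model, n_interior=128):
    # PDE residual term: enforce u''(x) - f(x) = 0 at random collocation points.
    x = torch.rand(n_interior, 1, requires_grad=True)
    u = model(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    residual = ((d2u - f(x)) ** 2).mean()
    # Boundary term: enforce u(0) = u(1) = 0.
    xb = torch.tensor([[0.0], [1.0]])
    boundary = (model(xb) ** 2).mean()
    return residual + boundary

# Plain gradient descent on the non-convex empirical loss, mirroring the
# optimizers whose practical success the thesis studies via overparameterization.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
for step in range(5000):
    optimizer.zero_grad()
    loss = pinn_loss(model)
    loss.backward()
    optimizer.step()
```

Even in this toy setting the empirical loss is non-convex in the network parameters; the thesis's overparameterization analysis addresses why gradient descent can nonetheless reach global optima for sufficiently wide two-layer PINNs.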
Degree: Doctor of Philosophy
Subject: Neural networks (Computer science)
Dept/Program: Mathematics
Persistent Identifier: http://hdl.handle.net/10722/345407

 

dc.contributor.author: Gao, Yihang
dc.contributor.author: 高伊杭
dc.date.accessioned: 2024-08-26T08:59:35Z
dc.date.available: 2024-08-26T08:59:35Z
dc.date.issued: 2024
dc.identifier.citation: Gao, Y. [高伊杭]. (2024). Neural networks meet applied mathematics : GANs, PINNs, and transformers. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/345407
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Neural networks (Computer science)
dc.title: Neural networks meet applied mathematics : GANs, PINNs, and transformers
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Mathematics
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2024
dc.identifier.mmsid: 991044843668003414
