Postgraduate thesis: Some stochastic proximal Newton-type methods for solving large-scale nonsmooth convex optimization problems

Title: Some stochastic proximal Newton-type methods for solving large-scale nonsmooth convex optimization problems
Authors: Wang, Zimeng (王子濛)
Advisors: Yuan, X
Issue Date: 2025
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Wang, Z. [王子濛]. (2025). Some stochastic proximal Newton-type methods for solving large-scale nonsmooth convex optimization problems. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: This thesis addresses a class of large-scale nonsmooth optimization problems prevalent in machine learning, where the objective function comprises a convex and smooth function (expressed as the average of a large number of differentiable function components) and a convex but potentially nonsmooth regularizer. Traditional first-order and deterministic methods for solving such problems often suffer from slow convergence or high computational costs. To address these limitations, we propose stochastic proximal Newton-type methods that incorporate both gradient and Hessian information to generate more accurate search directions by (approximately) solving some subproblems. We first introduce a general algorithmic framework for approximate stochastic proximal Newton-type methods, independent of specific gradient and Hessian approximation schemes. Within this framework, we establish global and local convergence guarantees based on expected error bounds for gradient and Hessian approximations, as well as the accuracy of subproblem resolutions. Building on this framework, we develop three practical algorithms with tailored gradient and Hessian approximation schemes and stopping criteria. The first algorithm is an inexact subsampled proximal Newton method, where gradient and Hessian approximations are computed via data subsampling. To improve computational efficiency, we introduce an adaptive stopping criterion for subproblem resolutions that controls violations from first-order optimality conditions. We prove that this method achieves globally linear and locally superlinear convergence with appropriately increasing sample sizes. The second algorithm is a single-loop stochastic proximal L-BFGS method, which accelerates convergence by incorporating variance reduction (VR) techniques. Assuming exact subproblem resolutions, we employ a loopless SVRG method for stochastic gradient estimation and a stochastic limited-memory BFGS (L-BFGS) scheme for Hessian approximations, resulting in a simple single-loop structure. We extend this paradigm to accommodate a broader class of VR methods (e.g., SAGA and SEGA) and establish globally linear convergence under mild assumptions. The third algorithm is an inexact stochastic proximal Newton-type method that uses the stochastic variance-reduced gradient (SVRG) technique for efficient gradient estimation. This method allows flexible Hessian approximations, requiring only uniform boundedness from above and below. We develop two practical stopping criteria for subproblem resolutions, tailored for smooth and nonsmooth objectives respectively. In both smooth and nonsmooth settings, we show that this method converges linearly for strongly convex objectives and sublinearly for nonconvex objectives under mild assumptions. Last but not least, we meticulously analyze the nonsmooth subproblems arising in these stochastic proximal Newton-type methods and propose a highly efficient and easily implementable semismooth Newton (SSN) solver. By leveraging a compact representation of the L-BFGS matrix and storing auxiliary matrices, we show that the arithmetic operations per iteration scale merely linearly with the problem dimension. The effectiveness and numerical efficiency of the proposed algorithms, along with the fast SSN solver, are validated through extensive numerical experiments, demonstrating superior performance compared to several state-of-the-art methods.
Degree: Doctor of Philosophy
Subject: Nonsmooth optimization; Convex functions; Newton-Raphson method
Dept/Program: Mathematics
Persistent Identifier: http://hdl.handle.net/10722/367488
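
The abstract above describes a composite objective, the average of many smooth convex components plus a nonsmooth regularizer, minimized by stochastic proximal Newton-type steps that approximately solve a quadratic-model-plus-regularizer subproblem at each iteration. The following minimal Python sketch is illustrative only and is not taken from the thesis: it applies subsampled gradient and Hessian estimates to L1-regularized logistic regression and solves each subproblem approximately with a few proximal-gradient steps rather than the thesis's semismooth Newton (SSN) solver. All function names, the fixed batch size, the damping term, and the unit step length are assumptions made for this sketch.

```python
# Illustrative sketch only; not code from the thesis. It mimics the generic
# stochastic proximal Newton-type iteration described in the abstract:
#   minimize  (1/n) * sum_i f_i(x) + g(x),
# with f_i the logistic loss on one sample and g(x) = lam * ||x||_1.
# All names, the fixed batch size, the damping term, and the inner
# proximal-gradient subproblem solver are assumptions made for illustration.
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def subsampled_grad_hess(X, y, w):
    """Gradient and Hessian of the average logistic loss on a subsample."""
    p = 1.0 / (1.0 + np.exp(-X @ w))          # predicted probabilities
    grad = X.T @ (p - y) / len(y)
    weights = p * (1.0 - p)                    # diagonal Hessian weights
    hess = (X.T * weights) @ X / len(y)
    return grad, hess


def stochastic_prox_newton(X, y, lam=0.01, outer_iters=50, batch=256,
                           inner_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(outer_iters):
        # Subsampled gradient g_k and Hessian approximation B_k (the thesis
        # also considers growing sample sizes and L-BFGS approximations).
        idx = rng.choice(n, size=min(batch, n), replace=False)
        g_k, B_k = subsampled_grad_hess(X[idx], y[idx], w)
        B_k += 1e-4 * np.eye(d)                # keep the model positive definite
        # Approximately solve the subproblem
        #   min_z  g_k^T (z - w) + 0.5 (z - w)^T B_k (z - w) + lam * ||z||_1
        # with a few proximal-gradient steps (a stand-in for the SSN solver).
        step = 1.0 / np.linalg.norm(B_k, 2)    # 1 / spectral norm of B_k
        z = w.copy()
        for _ in range(inner_iters):
            model_grad = g_k + B_k @ (z - w)
            z = soft_threshold(z - step * model_grad, step * lam)
        w = z                                  # unit step length, for simplicity
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((2000, 50))
    w_true = np.zeros(50)
    w_true[:5] = 1.0
    y = (X @ w_true + 0.1 * rng.standard_normal(2000) > 0).astype(float)
    w_hat = stochastic_prox_newton(X, y)
    print("recovered nonzero coefficients:", int(np.sum(np.abs(w_hat) > 1e-3)))
```

The inner proximal-gradient loop above is only a convenient stand-in: per the abstract, the thesis instead solves these nonsmooth subproblems with a semismooth Newton solver that exploits a compact L-BFGS representation, so that the arithmetic cost per iteration scales linearly with the problem dimension.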

 

DC Field / Value
dc.contributor.advisor: Yuan, X
dc.contributor.author: Wang, Zimeng
dc.contributor.author: 王子濛
dc.date.accessioned: 2025-12-11T06:42:25Z
dc.date.available: 2025-12-11T06:42:25Z
dc.date.issued: 2025
dc.identifier.citation: Wang, Z. [王子濛]. (2025). Some stochastic proximal Newton-type methods for solving large-scale nonsmooth convex optimization problems. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/367488
dc.description.abstract: This thesis addresses a class of large-scale nonsmooth optimization problems prevalent in machine learning, where the objective function comprises a convex and smooth function (expressed as the average of a large number of differentiable function components) and a convex but potentially nonsmooth regularizer. Traditional first-order and deterministic methods for solving such problems often suffer from slow convergence or high computational costs. To address these limitations, we propose stochastic proximal Newton-type methods that incorporate both gradient and Hessian information to generate more accurate search directions by (approximately) solving some subproblems. We first introduce a general algorithmic framework for approximate stochastic proximal Newton-type methods, independent of specific gradient and Hessian approximation schemes. Within this framework, we establish global and local convergence guarantees based on expected error bounds for gradient and Hessian approximations, as well as the accuracy of subproblem resolutions. Building on this framework, we develop three practical algorithms with tailored gradient and Hessian approximation schemes and stopping criteria. The first algorithm is an inexact subsampled proximal Newton method, where gradient and Hessian approximations are computed via data subsampling. To improve computational efficiency, we introduce an adaptive stopping criterion for subproblem resolutions that controls violations from first-order optimality conditions. We prove that this method achieves globally linear and locally superlinear convergence with appropriately increasing sample sizes. The second algorithm is a single-loop stochastic proximal L-BFGS method, which accelerates convergence by incorporating variance reduction (VR) techniques. Assuming exact subproblem resolutions, we employ a loopless SVRG method for stochastic gradient estimation and a stochastic limited-memory BFGS (L-BFGS) scheme for Hessian approximations, resulting in a simple single-loop structure. We extend this paradigm to accommodate a broader class of VR methods (e.g., SAGA and SEGA) and establish globally linear convergence under mild assumptions. The third algorithm is an inexact stochastic proximal Newton-type method that uses the stochastic variance-reduced gradient (SVRG) technique for efficient gradient estimation. This method allows flexible Hessian approximations, requiring only uniform boundedness from above and below. We develop two practical stopping criteria for subproblem resolutions, tailored for smooth and nonsmooth objectives respectively. In both smooth and nonsmooth settings, we show that this method converges linearly for strongly convex objectives and sublinearly for nonconvex objectives under mild assumptions. Last but not least, we meticulously analyze the nonsmooth subproblems arising in these stochastic proximal Newton-type methods and propose a highly efficient and easily implementable semismooth Newton (SSN) solver. By leveraging a compact representation of the L-BFGS matrix and storing auxiliary matrices, we show that the arithmetic operations per iteration scale merely linearly with the problem dimension. The effectiveness and numerical efficiency of the proposed algorithms, along with the fast SSN solver, are validated through extensive numerical experiments, demonstrating superior performance compared to several state-of-the-art methods.
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights, (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Nonsmooth optimization
dc.subject.lcsh: Convex functions
dc.subject.lcsh: Newton-Raphson method
dc.title: Some stochastic proximal Newton-type methods for solving large-scale nonsmooth convex optimization problems
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Mathematics
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2025
dc.identifier.mmsid: 991045147153203414
