Appears in Collections: postgraduate thesis: Efficient quasi-Newton methods and the applications
Title | Efficient quasi-Newton methods and the applications |
---|---|
Authors | Chen, Huiming (陈辉铭) |
Advisors | Lam, WH; Chan, SC |
Issue Date | 2020 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Chen, H. [陈辉铭]. (2020). Efficient quasi-Newton methods and the applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Stochastic optimization has been widely applied in many fields. With the advancement of technology, nonlinear and nonconvex applications with sample uncertainty have become widespread in the big data era, and the resulting stochastic optimization problems are increasingly complicated. Moreover, since the underlying systems are time-varying in many applications, incremental optimization is also important. However, fast time variation may leave insufficient samples, which raises the possibility of ill-conditioning. Hence, this thesis is concerned with the efficient implementation of stochastic and incremental optimization with improved numerical stability for nonconvex problems in large-scale systems.
In quasi-Newton methods, the classical damped method effectively maintains the positive definiteness of the Hessian approximations, so that a descent direction can be obtained even for nonconvex problems. However, as mentioned above, ill-conditioning remains possible. Although regularization can avoid this problem, it cannot be applied directly to nonconvex problems, even with the classical damped method. We therefore propose a novel stochastic damped Broyden–Fletcher–Goldfarb–Shanno (BFGS) method that handles nonconvexity and ill-conditioning simultaneously in large-scale problems, and we incorporate the stochastic variance reduced gradient (SVRG) strategy to refine it further. Numerical results show that the proposed methods exhibit superior performance. Since distributed optimization has become popular in the big data era, we further extend the proposed Hessian approximation scheme to distributed optimization, for which the alternating direction method of multipliers (ADMM) is a natural fit. Although the proposed Hessian approximation scheme can be applied in general ADMM methods, those methods require every agent to update its variables at each iteration, which limits their use in asynchronous distributed optimization, where each agent works at its own pace. To address this issue, we propose a novel stochastic ADMM method as a framework for asynchronous distributed optimization. The main idea is that each agent updates its variables at least once in every period, which improves the flexibility of the classical stochastic ADMM method. Moreover, we incorporate the SVRG strategy to reduce the variance of the stochastic gradient and thereby accelerate the convergence of the proposed method. We further study its convergence properties and establish its effectiveness theoretically. Next, we consider incremental optimization. Specifically, we study the recently proposed cubic regularized symmetric rank-1 method (CuREG-SR1). The main advantage of incorporating the cubic regularization technique into SR1 is that it alleviates the problem of indefinite Hessian approximation matrices in SR1. However, its convergence properties under the line search framework have received limited study. Hence, we provide a comprehensive study of these convergence properties and, based on it, propose a novel incremental CuREG-SR1 (ICuREG-SR1) algorithm that adapts SR1 and CuREG-SR1 to solving large-scale problems efficiently. The basic idea is to update information progressively for an objective function involving a sum of individual functions. Numerical results show that the proposed method offers superior performance in terms of gradient magnitude. (Illustrative sketches of a damped BFGS/SVRG update and a cubic-regularized SR1 step are given after this table.) |
Degree | Doctor of Philosophy |
Subject | Mathematical optimization |
Dept/Program | Electrical and Electronic Engineering |
Persistent Identifier | http://hdl.handle.net/10722/286782 |
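The damped BFGS and SVRG ingredients mentioned in the abstract can be illustrated with a small sketch. The following is a minimal, hypothetical example and not the thesis's algorithm: it applies a Powell-style damped BFGS update of an inverse Hessian approximation to SVRG variance-reduced gradients on a toy finite-sum least-squares problem. The test problem (which, for simplicity, is convex rather than nonconvex), the damping threshold 0.2, the step size 0.1, and all names are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the thesis's exact method):
# Powell-damped BFGS update of an inverse Hessian approximation H, driven by
# SVRG variance-reduced gradients, on a toy finite-sum least-squares problem
#   min_x (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(x, i):                 # gradient of the i-th component function
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):                 # full gradient, used for the SVRG snapshot
    return A.T @ (A @ x - b) / n

x, H = np.zeros(d), np.eye(d)     # iterate and inverse Hessian approximation
for epoch in range(10):
    x_snap, g_snap = x.copy(), full_grad(x)          # SVRG snapshot point
    for _ in range(n):
        i = rng.integers(n)
        # SVRG variance-reduced stochastic gradient estimate
        g = grad_i(x, i) - grad_i(x_snap, i) + g_snap
        x_new = x - 0.1 * H @ g
        s = x_new - x                                 # curvature pair (s, y)
        y = grad_i(x_new, i) - grad_i(x, i)
        # Powell damping: replace y by r so that s^T r stays safely positive,
        # which keeps the updated H positive definite even if s^T y <= 0.
        Bs = np.linalg.solve(H, s)                    # B s with B = H^{-1}
        sBs, sy = s @ Bs, s @ y
        theta = 1.0 if sy >= 0.2 * sBs else 0.8 * sBs / (sBs - sy)
        r = theta * y + (1.0 - theta) * Bs
        rho = 1.0 / (s @ r)
        V = np.eye(d) - rho * np.outer(s, r)
        H = V @ H @ V.T + rho * np.outer(s, s)        # damped BFGS update
        x = x_new
print("final full-gradient norm:", np.linalg.norm(full_grad(x)))
```

The damping step is what allows the update to proceed on nonconvex objectives: whenever the raw curvature condition s^T y > 0 fails, y is blended toward B s so the secant pair remains usable.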
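The cubic regularized SR1 idea (CuREG-SR1) can be illustrated in the same hedged way. The sketch below is a simplified, hypothetical variant: an SR1 Hessian approximation with the standard skipping rule, a step obtained by minimizing a cubic-regularized quadratic model via eigendecomposition (ignoring the hard case), a simple accept/reject rule with a fixed rescaling of the regularization weight sigma, and the two-dimensional Rosenbrock function as a stand-in nonconvex test. It is not the line-search or incremental algorithm analyzed in the thesis.

```python
# Minimal sketch (illustrative assumptions, not the thesis's CuREG-SR1/ICuREG-SR1):
# an SR1 Hessian approximation B combined with a cubic-regularized model step
#   p = argmin_p  g^T p + 0.5 * p^T B p + (sigma/3) * ||p||^3,
# solved via the eigendecomposition of the symmetric matrix B.
import numpy as np

def cubic_step(g, B, sigma):
    """Minimizer of the cubic-regularized model (ignoring the hard case)."""
    lam, Q = np.linalg.eigh(B)                 # B = Q diag(lam) Q^T
    gt = Q.T @ g
    norm_p = lambda mu: np.linalg.norm(gt / (lam + mu))
    # Find mu >= max(0, -lam_min) with ||p(mu)|| = mu / sigma, where
    # p(mu) = -(B + mu I)^{-1} g; bracket the root, then bisect.
    lo = max(0.0, -lam.min()) + 1e-12
    hi = lo + 1.0
    while norm_p(hi) > hi / sigma:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if norm_p(mid) > mid / sigma else (lo, mid)
    mu = 0.5 * (lo + hi)
    return -Q @ (gt / (lam + mu))

def sr1_update(B, s, y, eps=1e-8):
    """SR1 update with the usual skipping rule against tiny denominators."""
    r = y - B @ s
    if abs(r @ s) > eps * np.linalg.norm(r) * np.linalg.norm(s):
        B = B + np.outer(r, r) / (r @ s)
    return B

# Tiny demo on the (nonconvex) two-dimensional Rosenbrock function.
def rosen(z):
    return (1 - z[0])**2 + 100 * (z[1] - z[0]**2)**2

def rosen_grad(z):
    return np.array([-2 * (1 - z[0]) - 400 * z[0] * (z[1] - z[0]**2),
                     200 * (z[1] - z[0]**2)])

x, B, sigma = np.array([-1.2, 1.0]), np.eye(2), 1.0
for _ in range(200):
    g = rosen_grad(x)
    p = cubic_step(g, B, sigma)
    if rosen(x + p) < rosen(x):                # accept: relax regularization
        B = sr1_update(B, p, rosen_grad(x + p) - g)
        x, sigma = x + p, max(sigma / 2, 1e-3)
    else:                                      # reject: regularize more strongly
        sigma *= 4.0
print("x:", x, " ||grad||:", np.linalg.norm(rosen_grad(x)))
```

The cubic term plays the role the abstract describes: because sigma * ||p|| is added to the spectrum of B before the step is taken, the model step remains well defined even when the SR1 matrix B is indefinite.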
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Lam, WH | - |
dc.contributor.advisor | Chan, SC | - |
dc.contributor.author | Chen, Huiming | - |
dc.contributor.author | 陈辉铭 | - |
dc.date.accessioned | 2020-09-05T01:20:55Z | - |
dc.date.available | 2020-09-05T01:20:55Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Chen, H. [陈辉铭]. (2020). Efficient quasi-Newton methods and the applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/286782 | - |
dc.description.abstract | Stochastic optimization has been widely applied in many fields. With the advancement of technology, nonlinear and nonconvex applications with sample uncertainty have become widespread in the big data era, and the resulting stochastic optimization problems are increasingly complicated. Moreover, since the underlying systems are time-varying in many applications, incremental optimization is also important. However, fast time variation may leave insufficient samples, which raises the possibility of ill-conditioning. Hence, this thesis is concerned with the efficient implementation of stochastic and incremental optimization with improved numerical stability for nonconvex problems in large-scale systems. In quasi-Newton methods, the classical damped method effectively maintains the positive definiteness of the Hessian approximations, so that a descent direction can be obtained even for nonconvex problems. However, as mentioned above, ill-conditioning remains possible. Although regularization can avoid this problem, it cannot be applied directly to nonconvex problems, even with the classical damped method. We therefore propose a novel stochastic damped Broyden–Fletcher–Goldfarb–Shanno (BFGS) method that handles nonconvexity and ill-conditioning simultaneously in large-scale problems, and we incorporate the stochastic variance reduced gradient (SVRG) strategy to refine it further. Numerical results show that the proposed methods exhibit superior performance. Since distributed optimization has become popular in the big data era, we further extend the proposed Hessian approximation scheme to distributed optimization, for which the alternating direction method of multipliers (ADMM) is a natural fit. Although the proposed Hessian approximation scheme can be applied in general ADMM methods, those methods require every agent to update its variables at each iteration, which limits their use in asynchronous distributed optimization, where each agent works at its own pace. To address this issue, we propose a novel stochastic ADMM method as a framework for asynchronous distributed optimization. The main idea is that each agent updates its variables at least once in every period, which improves the flexibility of the classical stochastic ADMM method. Moreover, we incorporate the SVRG strategy to reduce the variance of the stochastic gradient and thereby accelerate the convergence of the proposed method. We further study its convergence properties and establish its effectiveness theoretically. Next, we consider incremental optimization. Specifically, we study the recently proposed cubic regularized symmetric rank-1 method (CuREG-SR1). The main advantage of incorporating the cubic regularization technique into SR1 is that it alleviates the problem of indefinite Hessian approximation matrices in SR1. However, its convergence properties under the line search framework have received limited study. Hence, we provide a comprehensive study of these convergence properties and, based on it, propose a novel incremental CuREG-SR1 (ICuREG-SR1) algorithm that adapts SR1 and CuREG-SR1 to solving large-scale problems efficiently. The basic idea is to update information progressively for an objective function involving a sum of individual functions. Numerical results show that the proposed method offers superior performance in terms of gradient magnitude. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Mathematical optimization | - |
dc.title | Efficient quasi-Newton methods and the applications | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2020 | - |
dc.identifier.mmsid | 991044268207703414 | - |