Appears in Collections: postgraduate thesis: Efficient quasi-Newton methods and the applications
Title | Efficient quasi-Newton methods and the applications |
---|---|
Authors | Chen, Huiming (陈辉铭) |
Advisors | Lam, WH; Chan, SC |
Issue Date | 2020 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Chen, H. [陈辉铭]. (2020). Efficient quasi-Newton methods and the applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Stochastic optimization has been widely applied in many fields. With the advancement of technology, nonlinear and nonconvex applications with sample uncertainty have become widespread in the big data era, and the resulting stochastic optimization problems are increasingly complicated. Moreover, since the underlying systems are time-varying in many applications, incremental optimization is also important. However, fast time variation may leave insufficient samples, which raises the possibility of ill-conditioning. Hence, this thesis is concerned with the efficient implementation of stochastic and incremental optimization with improved numerical stability for nonconvex problems in large-scale systems.
In quasi-Newton methods, the classical damped method effectively maintains the positive definiteness of the Hessian approximations, so that a descent direction can be obtained even for nonconvex problems. However, as mentioned above, ill-conditioning remains possible. Although regularization can avoid this problem, it cannot be applied directly to nonconvex problems, even with the classical damped method. We therefore propose a novel stochastic damped Broyden–Fletcher–Goldfarb–Shanno (BFGS) method that handles nonconvexity and ill-conditioning simultaneously in large-scale problems, and we incorporate the stochastic variance reduced gradient (SVRG) strategy to refine it further. Numerical results show that the proposed methods exhibit superior performance. Since distributed optimization has become popular in the big data era, we further extend the proposed Hessian approximation scheme to distributed optimization, for which the alternating direction method of multipliers (ADMM) is a natural fit. Although the proposed Hessian approximation scheme can be applied in general ADMM methods, those methods require every agent to update its variables at each iteration, which limits their use in asynchronous distributed optimization, where each agent works at its own pace. To address this issue, we propose a novel stochastic ADMM method as a framework for asynchronous distributed optimization. The main idea is that each agent updates its variables at least once in every period, which improves the flexibility of the classical stochastic ADMM method. Moreover, we incorporate the SVRG strategy to reduce the variance of the stochastic gradient and thereby accelerate the convergence of the proposed method. We further study its convergence properties and establish its effectiveness theoretically. Next, we consider incremental optimization. Specifically, we study the recently proposed cubic regularized symmetric rank-1 method (CuREG-SR1). The main advantage of incorporating the cubic regularization technique into SR1 is that it alleviates the problem of indefinite Hessian approximation matrices in SR1. However, its convergence properties under the line search framework have received limited study. Hence, we provide a comprehensive study of these convergence properties and, based on it, propose a novel incremental CuREG-SR1 (ICuREG-SR1) algorithm that adapts SR1 and CuREG-SR1 to solving large-scale problems efficiently. The basic idea is to update information progressively for an objective function involving a sum of individual functions. Numerical results show that the proposed method offers superior performance in terms of gradient magnitude. (Illustrative sketches of a damped BFGS/SVRG update and a cubic-regularized SR1 step are given after this table.) |
Degree | Doctor of Philosophy |
Subject | Mathematical optimization |
Dept/Program | Electrical and Electronic Engineering |
Persistent Identifier | http://hdl.handle.net/10722/286782 |
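The damped BFGS and SVRG ingredients mentioned in the abstract can be illustrated with a small sketch. The following is a minimal, hypothetical example and not the thesis's algorithm: it applies a Powell-style damped BFGS update of an inverse Hessian approximation to SVRG variance-reduced gradients on a toy finite-sum least-squares problem. The test problem (which, for simplicity, is convex rather than nonconvex), the damping threshold 0.2, the step size 0.1, and all names are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the thesis's exact method):
# Powell-damped BFGS update of an inverse Hessian approximation H, driven by
# SVRG variance-reduced gradients, on a toy finite-sum least-squares problem
#   min_x (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(x, i):                 # gradient of the i-th component function
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):                 # full gradient, used for the SVRG snapshot
    return A.T @ (A @ x - b) / n

x, H = np.zeros(d), np.eye(d)     # iterate and inverse Hessian approximation
for epoch in range(10):
    x_snap, g_snap = x.copy(), full_grad(x)          # SVRG snapshot point
    for _ in range(n):
        i = rng.integers(n)
        # SVRG variance-reduced stochastic gradient estimate
        g = grad_i(x, i) - grad_i(x_snap, i) + g_snap
        x_new = x - 0.1 * H @ g
        s = x_new - x                                 # curvature pair (s, y)
        y = grad_i(x_new, i) - grad_i(x, i)
        # Powell damping: replace y by r so that s^T r stays safely positive,
        # which keeps the updated H positive definite even if s^T y <= 0.
        Bs = np.linalg.solve(H, s)                    # B s with B = H^{-1}
        sBs, sy = s @ Bs, s @ y
        theta = 1.0 if sy >= 0.2 * sBs else 0.8 * sBs / (sBs - sy)
        r = theta * y + (1.0 - theta) * Bs
        rho = 1.0 / (s @ r)
        V = np.eye(d) - rho * np.outer(s, r)
        H = V @ H @ V.T + rho * np.outer(s, s)        # damped BFGS update
        x = x_new
print("final full-gradient norm:", np.linalg.norm(full_grad(x)))
```

The damping step is what allows the update to proceed on nonconvex objectives: whenever the raw curvature condition s^T y > 0 fails, y is blended toward B s so the secant pair remains usable.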
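The cubic regularized SR1 idea (CuREG-SR1) can be illustrated in the same hedged way. The sketch below is a simplified, hypothetical variant: an SR1 Hessian approximation with the standard skipping rule, a step obtained by minimizing a cubic-regularized quadratic model via eigendecomposition (ignoring the hard case), a simple accept/reject rule with a fixed rescaling of the regularization weight sigma, and the two-dimensional Rosenbrock function as a stand-in nonconvex test. It is not the line-search or incremental algorithm analyzed in the thesis.

```python
# Minimal sketch (illustrative assumptions, not the thesis's CuREG-SR1/ICuREG-SR1):
# an SR1 Hessian approximation B combined with a cubic-regularized model step
#   p = argmin_p  g^T p + 0.5 * p^T B p + (sigma/3) * ||p||^3,
# solved via the eigendecomposition of the symmetric matrix B.
import numpy as np

def cubic_step(g, B, sigma):
    """Minimizer of the cubic-regularized model (ignoring the hard case)."""
    lam, Q = np.linalg.eigh(B)                 # B = Q diag(lam) Q^T
    gt = Q.T @ g
    norm_p = lambda mu: np.linalg.norm(gt / (lam + mu))
    # Find mu >= max(0, -lam_min) with ||p(mu)|| = mu / sigma, where
    # p(mu) = -(B + mu I)^{-1} g; bracket the root, then bisect.
    lo = max(0.0, -lam.min()) + 1e-12
    hi = lo + 1.0
    while norm_p(hi) > hi / sigma:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if norm_p(mid) > mid / sigma else (lo, mid)
    mu = 0.5 * (lo + hi)
    return -Q @ (gt / (lam + mu))

def sr1_update(B, s, y, eps=1e-8):
    """SR1 update with the usual skipping rule against tiny denominators."""
    r = y - B @ s
    if abs(r @ s) > eps * np.linalg.norm(r) * np.linalg.norm(s):
        B = B + np.outer(r, r) / (r @ s)
    return B

# Tiny demo on the (nonconvex) two-dimensional Rosenbrock function.
def rosen(z):
    return (1 - z[0])**2 + 100 * (z[1] - z[0]**2)**2

def rosen_grad(z):
    return np.array([-2 * (1 - z[0]) - 400 * z[0] * (z[1] - z[0]**2),
                     200 * (z[1] - z[0]**2)])

x, B, sigma = np.array([-1.2, 1.0]), np.eye(2), 1.0
for _ in range(200):
    g = rosen_grad(x)
    p = cubic_step(g, B, sigma)
    if rosen(x + p) < rosen(x):                # accept: relax regularization
        B = sr1_update(B, p, rosen_grad(x + p) - g)
        x, sigma = x + p, max(sigma / 2, 1e-3)
    else:                                      # reject: regularize more strongly
        sigma *= 4.0
print("x:", x, " ||grad||:", np.linalg.norm(rosen_grad(x)))
```

The cubic term plays the role the abstract describes: because sigma * ||p|| is added to the spectrum of B before the step is taken, the model step remains well defined even when the SR1 matrix B is indefinite.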
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Lam, WH | - |
dc.contributor.advisor | Chan, SC | - |
dc.contributor.author | Chen, Huiming | - |
dc.contributor.author | 陈辉铭 | - |
dc.date.accessioned | 2020-09-05T01:20:55Z | - |
dc.date.available | 2020-09-05T01:20:55Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Chen, H. [陈辉铭]. (2020). Efficient quasi-Newton methods and the applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/286782 | - |
dc.description.abstract | Stochastic optimization has been widely applied in many fields. With the advancement of technology, nonlinear and nonconvex applications with sample uncertainty have become widespread in the big data era, and the resulting stochastic optimization problems are increasingly complicated. Moreover, since the underlying systems are time-varying in many applications, incremental optimization is also important. However, fast time variation may leave insufficient samples, which raises the possibility of ill-conditioning. Hence, this thesis is concerned with the efficient implementation of stochastic and incremental optimization with improved numerical stability for nonconvex problems in large-scale systems. In quasi-Newton methods, the classical damped method effectively maintains the positive definiteness of the Hessian approximations, so that a descent direction can be obtained even for nonconvex problems. However, as mentioned above, ill-conditioning remains possible. Although regularization can avoid this problem, it cannot be applied directly to nonconvex problems, even with the classical damped method. We therefore propose a novel stochastic damped Broyden–Fletcher–Goldfarb–Shanno (BFGS) method that handles nonconvexity and ill-conditioning simultaneously in large-scale problems, and we incorporate the stochastic variance reduced gradient (SVRG) strategy to refine it further. Numerical results show that the proposed methods exhibit superior performance. Since distributed optimization has become popular in the big data era, we further extend the proposed Hessian approximation scheme to distributed optimization, for which the alternating direction method of multipliers (ADMM) is a natural fit. Although the proposed Hessian approximation scheme can be applied in general ADMM methods, those methods require every agent to update its variables at each iteration, which limits their use in asynchronous distributed optimization, where each agent works at its own pace. To address this issue, we propose a novel stochastic ADMM method as a framework for asynchronous distributed optimization. The main idea is that each agent updates its variables at least once in every period, which improves the flexibility of the classical stochastic ADMM method. Moreover, we incorporate the SVRG strategy to reduce the variance of the stochastic gradient and thereby accelerate the convergence of the proposed method. We further study its convergence properties and establish its effectiveness theoretically. Next, we consider incremental optimization. Specifically, we study the recently proposed cubic regularized symmetric rank-1 method (CuREG-SR1). The main advantage of incorporating the cubic regularization technique into SR1 is that it alleviates the problem of indefinite Hessian approximation matrices in SR1. However, its convergence properties under the line search framework have received limited study. Hence, we provide a comprehensive study of these convergence properties and, based on it, propose a novel incremental CuREG-SR1 (ICuREG-SR1) algorithm that adapts SR1 and CuREG-SR1 to solving large-scale problems efficiently. The basic idea is to update information progressively for an objective function involving a sum of individual functions. Numerical results show that the proposed method offers superior performance in terms of gradient magnitude. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Mathematical optimization | - |
dc.title | Efficient quasi-Newton methods and the applications | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2020 | - |
dc.identifier.mmsid | 991044268207703414 | - |