Conference Paper: Policy-Based Primal-Dual Methods for Convex Constrained Markov Decision Processes
Title | Policy-Based Primal-Dual Methods for Convex Constrained Markov Decision Processes
---|---
Authors | Ying, D; Guo, MA; Ding, Y; Lavaei, J; Shen, ZJ
Issue Date | 7-Feb-2023
Publisher | AAAI Press
Abstract | We study convex Constrained Markov Decision Processes (CMDPs) in which the objective is concave and the constraints are convex in the state-action occupancy measure. We propose a policy-based primal-dual algorithm that updates the primal variable via policy gradient ascent and updates the dual variable via projected sub-gradient descent. Despite the loss of additivity structure and the nonconvex nature, we establish the global convergence of the proposed algorithm by leveraging a hidden convexity in the problem, and prove the O(T^(-1/3)) convergence rate in terms of both optimality gap and constraint violation. When the objective is strongly concave in the occupancy measure, we prove an improved convergence rate of O(T^(-1/2)). By introducing a pessimistic term to the constraint, we further show that zero constraint violation can be achieved while preserving the same convergence rate for the optimality gap. This work is the first in the literature to establish non-asymptotic convergence guarantees for policy-based primal-dual methods for solving infinite-horizon discounted convex CMDPs.
Persistent Identifier | http://hdl.handle.net/10722/336533
ISBN | 9781577358800
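The abstract above describes a primal-dual scheme: the primal (policy) variable is updated by gradient ascent on the Lagrangian, while the dual variable follows projected sub-gradient descent, clipped to stay nonnegative. The toy sketch below illustrates that update loop on a hand-made 2-state, 2-action CMDP with a softmax policy; all numbers are illustrative, and a finite-difference gradient stands in for the paper's policy-gradient estimator. It is not the authors' implementation.

```python
import numpy as np

def occupancy(theta, P, rho, gamma):
    """Discounted state-action occupancy measure of the softmax policy pi_theta."""
    S, A = theta.shape
    pi = np.exp(theta - theta.max(axis=1, keepdims=True))
    pi /= pi.sum(axis=1, keepdims=True)
    P_pi = np.einsum('sa,sat->st', pi, P)      # state transition matrix under pi
    # d solves d = (1 - gamma) * rho + gamma * P_pi^T d
    d = np.linalg.solve(np.eye(S) - gamma * P_pi.T, (1 - gamma) * rho)
    return d[:, None] * pi                     # mu(s, a) = d(s) * pi(a | s)

# Hand-made 2-state, 2-action CMDP (all values illustrative).
gamma = 0.9
P = np.array([[[0.9, 0.1], [0.2, 0.8]],       # P[s, a, s'] transition probabilities
              [[0.7, 0.3], [0.05, 0.95]]])
r = np.array([[1.0, 0.2], [0.5, 0.8]])        # objective is <mu, r> (linear, hence concave)
c = np.array([[0.9, 0.1], [0.7, 0.2]])        # constraint is <mu, c> <= b (convex)
b = 0.5                                        # constraint budget
rho = np.array([0.5, 0.5])                     # initial state distribution

def lagrangian(theta, lam):
    mu = occupancy(theta, P, rho, gamma)
    return (mu * r).sum() - lam * ((mu * c).sum() - b)

theta, lam = np.zeros((2, 2)), 0.0
eta_theta, eta_lam, eps = 0.5, 0.1, 1e-5
for _ in range(300):
    # Primal step: ascent on the Lagrangian (finite-difference gradient for brevity).
    grad = np.zeros_like(theta)
    for idx in np.ndindex(*theta.shape):
        e = np.zeros_like(theta)
        e[idx] = eps
        grad[idx] = (lagrangian(theta + e, lam) - lagrangian(theta - e, lam)) / (2 * eps)
    theta += eta_theta * grad
    # Dual step: projected sub-gradient descent, lam <- max(0, lam + eta * (<mu, c> - b)).
    mu = occupancy(theta, P, rho, gamma)
    lam = max(0.0, lam + eta_lam * ((mu * c).sum() - b))

mu = occupancy(theta, P, rho, gamma)
print("objective:", round((mu * r).sum(), 4),
      "constraint value:", round((mu * c).sum(), 4),
      "lambda:", round(lam, 4))
```

The dual projection onto the nonnegative orthant (`max(0, ...)`) is what makes the multiplier a valid Lagrange multiplier for the inequality constraint; the paper's pessimistic variant would tighten `b` slightly to drive the constraint violation to zero.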
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ying, D | - |
dc.contributor.author | Guo, MA | - |
dc.contributor.author | Ding, Y | - |
dc.contributor.author | Lavaei, J | - |
dc.contributor.author | Shen, ZJ | - |
dc.date.accessioned | 2024-02-16T03:57:32Z | - |
dc.date.available | 2024-02-16T03:57:32Z | - |
dc.date.issued | 2023-02-07 | - |
dc.identifier.isbn | 9781577358800 | - |
dc.identifier.uri | http://hdl.handle.net/10722/336533 | - |
dc.description.abstract | We study convex Constrained Markov Decision Processes (CMDPs) in which the objective is concave and the constraints are convex in the state-action occupancy measure. We propose a policy-based primal-dual algorithm that updates the primal variable via policy gradient ascent and updates the dual variable via projected sub-gradient descent. Despite the loss of additivity structure and the nonconvex nature, we establish the global convergence of the proposed algorithm by leveraging a hidden convexity in the problem, and prove the O(T^(-1/3)) convergence rate in terms of both optimality gap and constraint violation. When the objective is strongly concave in the occupancy measure, we prove an improved convergence rate of O(T^(-1/2)). By introducing a pessimistic term to the constraint, we further show that zero constraint violation can be achieved while preserving the same convergence rate for the optimality gap. This work is the first in the literature to establish non-asymptotic convergence guarantees for policy-based primal-dual methods for solving infinite-horizon discounted convex CMDPs. | -
dc.language | eng | - |
dc.publisher | AAAI Press | - |
dc.relation.ispartof | The 37th AAAI Conference on Artificial Intelligence, AAAI 2023 (07/02/2023-14/02/2023, Washington, USA) | -
dc.title | Policy-Based Primal-Dual Methods for Convex Constrained Markov Decision Processes | - |
dc.type | Conference_Paper | - |
dc.identifier.scopus | eid_2-s2.0-85168250561 | - |
dc.identifier.volume | 37 | - |
dc.identifier.issue | 37 | - |
dc.identifier.spage | 10963 | - |
dc.identifier.epage | 10971 | - |