Conference Paper: Solving ergodic Markov decision processes and perfect information zero-sum stochastic games by variance reduced deflated value iteration
Title | Solving ergodic Markov decision processes and perfect information zero-sum stochastic games by variance reduced deflated value iteration |
---|---|
Authors | Akian, M; Gaubert, S; Qu, Z; Saadi, O |
Keywords | Games; Game theory; Complexity theory; Markov processes; Heuristic algorithms |
Issue Date | 2019 |
Publisher | IEEE. |
Citation | The 58th IEEE Conference on Decision and Control (CDC), Nice, France, 11-13 December 2019. In 2019 IEEE 58th Conference on Decision and Control (CDC): 11-13 December 2019, p. 5963-5970 |
Abstract | Recently, Sidford, Wang, Wu and Ye (2018) developed an algorithm combining variance reduction techniques with value iteration to solve discounted Markov decision processes. This algorithm has sublinear complexity when the discount factor is fixed. Here, we extend this approach to mean-payoff problems, including both Markov decision processes and perfect information zero-sum stochastic games. We obtain sublinear complexity bounds, assuming there is a distinguished state that is accessible from all initial states under all policies. Our method is based on a reduction from the mean-payoff problem to the discounted problem by a Doob h-transform, combined with a deflation technique. The complexity analysis combines the techniques developed by Sidford et al. for the discounted case with non-linear spectral theory techniques (the Collatz-Wielandt characterization of the eigenvalue). |
Persistent Identifier | http://hdl.handle.net/10722/316993 |
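
For orientation, the mean-payoff problem named in the abstract can be illustrated with classical relative value iteration, which normalizes the value vector at a distinguished reference state. This is a minimal sketch of the textbook baseline only, not the variance reduced deflated algorithm of the paper: it uses exact expectations rather than sampling, and it applies no Doob h-transform or deflation. The function name, the toy two-state MDP and the tolerance are all assumptions made for illustration. The toy MDP reaches state 0 with positive probability from every state under every action, in the spirit of the accessibility assumption stated in the abstract.

```python
def relative_value_iteration(P, r, ref=0, tol=1e-10, max_iter=10_000):
    """Classical relative value iteration for a mean-payoff MDP.

    P[i][a][j] is the probability of moving from state i to state j
    under action a, and r[i][a] is the one-step reward (hypothetical
    data layout chosen for this sketch). Returns (rho, h), where rho
    approximates the optimal mean payoff and h the bias vector,
    normalized so that h[ref] == 0.
    """
    n = len(P)
    h = [0.0] * n
    rho = 0.0
    for _ in range(max_iter):
        # Apply the Bellman (Shapley) operator T to the current bias.
        Th = [
            max(
                r[i][a] + sum(P[i][a][j] * h[j] for j in range(n))
                for a in range(len(P[i]))
            )
            for i in range(n)
        ]
        # Normalize at the reference state; the subtracted constant
        # converges to the mean payoff.
        rho = Th[ref]
        new_h = [x - rho for x in Th]
        diff = max(abs(x - y) for x, y in zip(new_h, h))
        h = new_h
        if diff < tol:
            break
    return rho, h


# Toy MDP (assumed data): two states, two actions per state.
P = [
    [[0.5, 0.5], [0.9, 0.1]],  # transitions from state 0
    [[0.2, 0.8], [0.6, 0.4]],  # transitions from state 1
]
r = [
    [1.0, 0.0],  # rewards in state 0
    [2.0, 0.5],  # rewards in state 1
]

rho, h = relative_value_iteration(P, r)
print(rho, h)  # rho is close to 12/7, h is close to [0, 10/7]
```

At convergence the pair satisfies the ergodic optimality equation rho + h(i) = max over a of [ r(i, a) + sum over j of P(j | i, a) h(j) ], which is the fixed-point relation that the mean-payoff setting replaces the discounted Bellman equation with.
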
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Akian, M | - |
dc.contributor.author | Gaubert, S | - |
dc.contributor.author | Qu, Z | - |
dc.contributor.author | Saadi, O | - |
dc.date.accessioned | 2022-09-16T07:26:55Z | - |
dc.date.available | 2022-09-16T07:26:55Z | - |
dc.date.issued | 2019 | - |
dc.identifier.citation | The 58th IEEE Conference on Decision and Control (CDC), Nice, France, 11-13 December 2019. In 2019 IEEE 58th Conference on Decision and Control (CDC): 11-13 December 2019, p. 5963-5970 | - |
dc.identifier.uri | http://hdl.handle.net/10722/316993 | - |
dc.language | eng | - |
dc.publisher | IEEE. | - |
dc.relation.ispartof | 2019 IEEE 58th Conference on Decision and Control (CDC): 11-13 December 2019 | - |
dc.rights | 2019 IEEE 58th Conference on Decision and Control (CDC): 11-13 December 2019. Copyright © IEEE. | - |
dc.rights | ©20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | - |
dc.subject | Games | - |
dc.subject | Game theory | - |
dc.subject | Complexity theory | - |
dc.subject | Markov processes | - |
dc.subject | Heuristic algorithms | - |
dc.title | Solving ergodic Markov decision processes and perfect information zero-sum stochastic games by variance reduced deflated value iteration | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Qu, Z: zhengqu@hku.hk | - |
dc.identifier.authority | Qu, Z=rp02096 | - |
dc.identifier.doi | 10.1109/CDC40024.2019.9029885 | - |
dc.identifier.hkuros | 336428 | - |
dc.identifier.spage | 5963 | - |
dc.identifier.epage | 5970 | - |
dc.publisher.place | United States | - |