Solving ergodic Markov decision processes and perfect information zero-sum stochastic games by variance reduced deflated value iteration

Akian, M; Gaubert, S; Qu, Z; Saadi, O


Publisher Website: 10.1109/CDC40024.2019.9029885

Appears in Collections:
- Mathematics: Conference papers

Conference Paper: Solving ergodic Markov decision processes and perfect information zero-sum stochastic games by variance reduced deflated value iteration

Title	Solving ergodic Markov decision processes and perfect information zero-sum stochastic games by variance reduced deflated value iteration
Authors	Akian, M; Gaubert, S; Qu, Z; Saadi, O
Keywords	Games; Game theory; Complexity theory; Markov processes; Heuristic algorithms
Issue Date	2019
Publisher	IEEE.
Citation	The 58th IEEE Conference on Decision and Control (CDC), Nice, France, 11-13 December 2019. In 2019 IEEE 58th Conference on Decision and Control (CDC): 11-13 December 2019, p. 5963-5970. DOI: http://dx.doi.org/10.1109/CDC40024.2019.9029885
Abstract	Recently, Sidford, Wang, Wu and Ye (2018) developed an algorithm combining variance reduction techniques with value iteration to solve discounted Markov decision processes. This algorithm has a sublinear complexity when the discount factor is fixed. Here, we extend this approach to mean-payoff problems, including both Markov decision processes and perfect information zero-sum stochastic games. We obtain sublinear complexity bounds, assuming there is a distinguished state which is accessible from all initial states and for all policies. Our method is based on a reduction from the mean payoff problem to the discounted problem by a Doob h-transform, combined with a deflation technique. The complexity analysis of this algorithm uses at the same time the techniques developed by Sidford et al. in the discounted case and non-linear spectral theory techniques (Collatz-Wielandt characterization of the eigenvalue).
Persistent Identifier	http://hdl.handle.net/10722/316993
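
The abstract's setting is a mean-payoff problem with a distinguished state that is reachable under every policy. To make that setting concrete, here is a minimal sketch of classical relative value iteration. It is not the paper's variance-reduced deflated algorithm. It only shows how pinning the value at a reference state to zero turns the mean-payoff optimality equation g + h(i) = max_a [ r(i,a) + sum_j P(j|i,a) h(j) ] into a fixed-point iteration. The function names, the two-state toy MDP and its numbers are all hypothetical and chosen for illustration. A perfect information zero-sum game would replace the single max by a min or max depending on which player controls the state.

```python
def bellman(v, rewards, trans):
    """Apply T(v)(i) = max_a [ r(i,a) + sum_j P(j|i,a) * v(j) ]."""
    return [
        max(
            r + sum(p * vj for p, vj in zip(row, v))
            for r, row in zip(rewards[i], trans[i])
        )
        for i in range(len(v))
    ]


def relative_value_iteration(rewards, trans, ref_state=0, tol=1e-10, max_iter=10_000):
    """Classical relative value iteration (illustrative, not the paper's method).

    rewards[i][a]  : reward for action a in state i
    trans[i][a][j] : probability of moving from i to j under action a
    ref_state      : distinguished state whose bias is pinned to 0
    Returns (gain, bias) with bias[ref_state] == 0.
    """
    h = [0.0] * len(rewards)
    gain = 0.0
    for _ in range(max_iter):
        tv = bellman(h, rewards, trans)
        gain = tv[ref_state]                # current mean-payoff estimate
        new_h = [x - gain for x in tv]      # renormalise so new_h[ref_state] == 0
        if max(abs(a - b) for a, b in zip(new_h, h)) < tol:
            return gain, new_h
        h = new_h
    return gain, h


# Hypothetical toy MDP: 2 states, 2 actions per state.
rewards = [[1.0, 0.0], [2.0, 0.0]]
trans = [
    [[0.5, 0.5], [0.9, 0.1]],  # state 0: action 0, action 1
    [[0.5, 0.5], [0.5, 0.5]],  # state 1: action 0, action 1
]
gain, bias = relative_value_iteration(rewards, trans)
print(gain, bias)
```

On this toy instance the best policy plays action 0 in both states, so the chain is uniform over the two states and the mean payoff is the average of the rewards 1 and 2. Each sweep of this sketch reads every transition probability. The abstract's sublinear bounds come from the variance reduction techniques it cites, which this sketch does not implement.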

DC Field	Value	Language
dc.contributor.author	Akian, M	-
dc.contributor.author	Gaubert, S	-
dc.contributor.author	Qu, Z	-
dc.contributor.author	Saadi, O	-
dc.date.accessioned	2022-09-16T07:26:55Z	-
dc.date.available	2022-09-16T07:26:55Z	-
dc.date.issued	2019	-
dc.identifier.citation	The 58th IEEE Conference on Decision and Control (CDC), Nice, France, 11-13 December 2019. In 2019 IEEE 58th Conference on Decision and Control (CDC): 11-13 December 2019, p. 5963-5970	-
dc.identifier.uri	http://hdl.handle.net/10722/316993	-
dc.description.abstract	Recently, Sidford, Wang, Wu and Ye (2018) developed an algorithm combining variance reduction techniques with value iteration to solve discounted Markov decision processes. This algorithm has a sublinear complexity when the discount factor is fixed. Here, we extend this approach to mean-payoff problems, including both Markov decision processes and perfect information zero-sum stochastic games. We obtain sublinear complexity bounds, assuming there is a distinguished state which is accessible from all initial states and for all policies. Our method is based on a reduction from the mean payoff problem to the discounted problem by a Doob h-transform, combined with a deflation technique. The complexity analysis of this algorithm uses at the same time the techniques developed by Sidford et al. in the discounted case and non-linear spectral theory techniques (Collatz-Wielandt characterization of the eigenvalue).	-
dc.language	eng	-
dc.publisher	IEEE.	-
dc.relation.ispartof	2019 IEEE 58th Conference on Decision and Control (CDC): 11-13 December 2019	-
dc.rights	2019 IEEE 58th Conference on Decision and Control (CDC): 11-13 December 2019. Copyright © IEEE.	-
dc.rights	©20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	-
dc.subject	Games	-
dc.subject	Game theory	-
dc.subject	Complexity theory	-
dc.subject	Markov processes	-
dc.subject	Heuristic algorithms	-
dc.title	Solving ergodic Markov decision processes and perfect information zero-sum stochastic games by variance reduced deflated value iteration	-
dc.type	Conference_Paper	-
dc.identifier.email	Qu, Z: zhengqu@hku.hk	-
dc.identifier.authority	Qu, Z=rp02096	-
dc.identifier.doi	10.1109/CDC40024.2019.9029885	-
dc.identifier.hkuros	336428	-
dc.identifier.spage	5963	-
dc.identifier.epage	5970	-
dc.publisher.place	United States	-
