Article: Stochastic Approximation for Risk-Aware Markov Decision Processes
Title | Stochastic Approximation for Risk-Aware Markov Decision Processes |
---|---|
Authors | Huang, W; Haskell, WB |
Keywords | Markov decision processes (MDPs); risk measure; saddle point; stochastic approximation; Q-learning |
Issue Date | 2021 |
Publisher | Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=9 |
Citation | IEEE Transactions on Automatic Control, 2021, v. 66, n. 3, p. 1314-1320 |
Abstract | We develop a stochastic approximation-type algorithm to solve finite state/action, infinite-horizon, risk-aware Markov decision processes. Our algorithm has two loops. The inner loop computes the risk by solving a stochastic saddle-point problem. The outer loop performs Q-learning to compute an optimal risk-aware policy. Several widely investigated risk measures (e.g., conditional value-at-risk, optimized certainty equivalent, and absolute semideviation) are covered by our algorithm. Almost sure convergence and the convergence rate of the algorithm are established. For an error tolerance ε > 0 for the optimal Q-value estimation gap and learning rate k ∈ (1/2, 1], the overall convergence rate of our algorithm is Ω((ln(1/δε)/ε²)^{1/k} + (ln(1/ε))^{1/(1-k)}) with probability at least 1 - δ. (A minimal illustrative sketch of this two-loop scheme follows the table below.) |
Persistent Identifier | http://hdl.handle.net/10722/305821 |
ISSN | 0018-9286 (2023 Impact Factor: 6.2; 2023 SCImago Journal Rankings: 4.501) |
ISI Accession Number ID | WOS:000623420100033 |
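The abstract describes a two-loop scheme: an inner stochastic-approximation loop that estimates a risk measure by solving a stochastic saddle-point problem, and an outer Q-learning loop over the risk-adjusted targets. Below is a minimal sketch of that structure, assuming CVaR as the risk measure via its Rockafellar-Uryasev formulation; the toy MDP, step sizes, and iteration counts are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only (assumptions: CVaR risk, toy random MDP,
# fixed iteration budgets). It mirrors the two-loop structure from the
# abstract; it is not the paper's algorithm or step-size schedule.
import numpy as np

rng = np.random.default_rng(0)

# Toy cost-minimization MDP (assumption): nS states, nA actions.
nS, nA, gamma, alpha = 5, 2, 0.9, 0.25     # alpha: CVaR tail level
P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a] is a distribution over s'
C = rng.uniform(0.0, 1.0, size=(nS, nA, nS))    # cost c(s, a, s')

def inner_cvar(sample_fn, n_inner=500, lr=0.05):
    """Inner loop: stochastic approximation on eta for
    CVaR_alpha(Z) = min_eta { eta + E[(Z - eta)^+] / alpha }."""
    eta, est = 0.0, 0.0
    for t in range(1, n_inner + 1):
        z = sample_fn()
        # Stochastic subgradient of the objective in eta: 1 - 1{z > eta}/alpha.
        eta -= lr * (1.0 - (z > eta) / alpha)
        # Running average of the objective value serves as the CVaR estimate.
        est += (eta + max(z - eta, 0.0) / alpha - est) / t
    return est

k = 0.7                                    # learning-rate exponent in (1/2, 1]
Q = np.zeros((nS, nA))
for n in range(1, 2001):                   # outer Q-learning loop
    beta = 1.0 / n**k                      # step size n^{-k}
    s, a = rng.integers(nS), rng.integers(nA)

    def sample_target(s=s, a=a):
        # One sample of the random one-step cost-to-go under (s, a).
        s_next = rng.choice(nS, p=P[s, a])
        return C[s, a, s_next] + gamma * Q[s_next].min()

    rho = inner_cvar(sample_target)        # risk of the random target
    Q[s, a] += beta * (rho - Q[s, a])      # risk-aware Q-learning update

print("Risk-aware Q-values:\n", Q.round(3))
```

The sketch runs a fresh inner loop to convergence before each outer update; the paper's actual coupling of the two loops and its step-size conditions are given in the article itself, and this sketch only mirrors the high-level structure stated in the abstract.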
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Huang, W | - |
dc.contributor.author | Haskell, WB | - |
dc.date.accessioned | 2021-10-20T10:14:48Z | - |
dc.date.available | 2021-10-20T10:14:48Z | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | IEEE Transactions on Automatic Control, 2021, v. 66 n. 3, p. 1314-1320 | - |
dc.identifier.issn | 0018-9286 | - |
dc.identifier.uri | http://hdl.handle.net/10722/305821 | - |
dc.description.abstract | We develop a stochastic approximation-type algorithm to solve finite state/action, infinite-horizon, risk-aware Markov decision processes. Our algorithm has two loops. The inner loop computes the risk by solving a stochastic saddle-point problem. The outer loop performs Q-learning to compute an optimal risk-aware policy. Several widely investigated risk measures (e.g., conditional value-at-risk, optimized certainty equivalent, and absolute semideviation) are covered by our algorithm. Almost sure convergence and the convergence rate of the algorithm are established. For an error tolerance ε > 0 for the optimal Q-value estimation gap and learning rate k ∈ (1/2, 1], the overall convergence rate of our algorithm is Ω((ln(1/δε)/ε²)^{1/k} + (ln(1/ε))^{1/(1-k)}) with probability at least 1 - δ. | -
dc.language | eng | - |
dc.publisher | Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=9 | - |
dc.relation.ispartof | IEEE Transactions on Automatic Control | - |
dc.subject | Markov decision processes (MDPs) | - |
dc.subject | risk measure | - |
dc.subject | saddle point | - |
dc.subject | stochastic approximation | - |
dc.subject | Q-learning | - |
dc.title | Stochastic Approximation for Risk-Aware Markov Decision Processes | - |
dc.type | Article | - |
dc.identifier.email | Huang, W: huangwj@hku.hk | - |
dc.identifier.authority | Huang, W=rp02898 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TAC.2020.2989702 | - |
dc.identifier.scopus | eid_2-s2.0-85102065067 | - |
dc.identifier.hkuros | 327215 | - |
dc.identifier.volume | 66 | - |
dc.identifier.issue | 3 | - |
dc.identifier.spage | 1314 | - |
dc.identifier.epage | 1320 | - |
dc.identifier.isi | WOS:000623420100033 | - |
dc.publisher.place | United States | - |