Conference Paper: Adaptive Model Design for Markov Decision Process

Title: Adaptive Model Design for Markov Decision Process
Authors: Chen, Siyu; Yang, Donglin; Li, Jiayang; Wang, Senmiao; Yang, Zhuoran; Wang, Zhaoran
Issue Date: 2022
Citation: Proceedings of Machine Learning Research, 2022, v. 162, p. 3679-3700
Abstract: In a Markov decision process (MDP), an agent interacts with the environment via perceptions and actions. During this process, the agent aims to maximize its own gain. Hence, appropriate regulations are often required if we hope to take the external costs/benefits of its actions into consideration. In this paper, we study how to regulate such an agent by redesigning model parameters that can affect the rewards and/or the transition kernels. We formulate this problem as a bilevel program, in which the lower-level MDP is regulated by the upper-level model designer. To solve the resulting problem, we develop a scheme that allows the designer to iteratively predict the agent's reaction by solving the MDP and then adaptively update model parameters based on the predicted reaction. The algorithm is first theoretically analyzed and then empirically tested on several MDP models arising in economics and robotics.
Persistent Identifier: http://hdl.handle.net/10722/351469
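The scheme described in the abstract — iteratively predict the agent's reaction by solving the lower-level MDP, then adaptively update the model parameters — can be illustrated with a minimal sketch. This is not the paper's actual algorithm: it assumes a small tabular MDP solved by value iteration, and a single scalar "tax" parameter on one action as the redesigned reward, raised until the agent's greedy policy matches the designer's target.

```python
import numpy as np

def solve_mdp(R, P, gamma=0.9, tol=1e-8):
    """Lower level: value iteration on a tabular MDP.
    R[s, a] = reward, P[a, s, t] = transition probability s -> t under a.
    Returns the greedy policy and its state values."""
    V = np.zeros(R.shape[0])
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1), V_new
        V = V_new

def adaptive_design(R_base, P, target_policy, taxable_action,
                    step=0.1, max_iters=200):
    """Upper level: predict the agent's reaction (solve the MDP under the
    current design), then adapt the design parameter theta -- here a flat
    tax on one action -- until the predicted policy matches the target."""
    theta = 0.0
    policy = None
    for _ in range(max_iters):
        R = R_base.copy()
        R[:, taxable_action] -= theta      # redesigned reward model
        policy, _ = solve_mdp(R, P)        # predicted agent reaction
        if np.array_equal(policy, target_policy):
            return theta, policy
        theta += step                      # update design, re-predict
    return theta, policy

# Hypothetical 2-state, 2-action example: action 1 is privately better
# (reward 1.0 vs 0.5), but the designer wants action 0 everywhere.
R_base = np.array([[0.5, 1.0], [0.5, 1.0]])
P = np.full((2, 2, 2), 0.5)               # identical uniform transitions
target = np.array([0, 0])
theta, policy = adaptive_design(R_base, P, target, taxable_action=1)
```

Here the tax must offset the private reward gap of 0.5 before the agent switches, so the loop stops once theta passes that threshold. The paper's actual method handles general reward/transition redesigns via a bilevel program; this sketch only mirrors its predict-then-update loop.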

 

DC Field | Value | Language
dc.contributor.author | Chen, Siyu | -
dc.contributor.author | Yang, Donglin | -
dc.contributor.author | Li, Jiayang | -
dc.contributor.author | Wang, Senmiao | -
dc.contributor.author | Yang, Zhuoran | -
dc.contributor.author | Wang, Zhaoran | -
dc.date.accessioned | 2024-11-20T03:56:28Z | -
dc.date.available | 2024-11-20T03:56:28Z | -
dc.date.issued | 2022 | -
dc.identifier.citation | Proceedings of Machine Learning Research, 2022, v. 162, p. 3679-3700 | -
dc.identifier.uri | http://hdl.handle.net/10722/351469 | -
dc.description.abstract | In a Markov decision process (MDP), an agent interacts with the environment via perceptions and actions. During this process, the agent aims to maximize its own gain. Hence, appropriate regulations are often required, if we hope to take the external costs/benefits of its actions into consideration. In this paper, we study how to regulate such an agent by redesigning model parameters that can affect the rewards and/or the transition kernels. We formulate this problem as a bilevel program, in which the lower-level MDP is regulated by the upper-level model designer. To solve the resulting problem, we develop a scheme that allows the designer to iteratively predict the agent's reaction by solving the MDP and then adaptively update model parameters based on the predicted reaction. The algorithm is first theoretically analyzed and then empirically tested on several MDP models arising in economics and robotics. | -
dc.language | eng | -
dc.relation.ispartof | Proceedings of Machine Learning Research | -
dc.title | Adaptive Model Design for Markov Decision Process | -
dc.type | Conference_Paper | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.scopus | eid_2-s2.0-85163067834 | -
dc.identifier.volume | 162 | -
dc.identifier.spage | 3679 | -
dc.identifier.epage | 3700 | -
dc.identifier.eissn | 2640-3498 | -
