Conference Paper: Adaptive Model Design for Markov Decision Process

Title: Adaptive Model Design for Markov Decision Process
Authors: Chen, Siyu; Yang, Donglin; Li, Jiayang; Wang, Senmiao; Yang, Zhuoran; Wang, Zhaoran
Issue Date: 2022
Citation: Proceedings of Machine Learning Research, 2022, v. 162, p. 3679-3700
Abstract: In a Markov decision process (MDP), an agent interacts with the environment via perceptions and actions. During this process, the agent aims to maximize its own gain. Hence, appropriate regulations are often required if we hope to take the external costs/benefits of its actions into consideration. In this paper, we study how to regulate such an agent by redesigning model parameters that can affect the rewards and/or the transition kernels. We formulate this problem as a bilevel program, in which the lower-level MDP is regulated by the upper-level model designer. To solve the resulting problem, we develop a scheme that allows the designer to iteratively predict the agent's reaction by solving the MDP and then adaptively update model parameters based on the predicted reaction. The algorithm is first theoretically analyzed and then empirically tested on several MDP models arising in economics and robotics.
Persistent Identifier: http://hdl.handle.net/10722/351469
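The scheme described in the abstract — iteratively predict the agent's reaction by solving the lower-level MDP, then adaptively update the model parameters — can be illustrated with a minimal sketch. This is not the paper's actual algorithm: it assumes a small tabular MDP solved by value iteration, and a single scalar "tax" parameter on one action as the redesigned reward, raised until the agent's greedy policy matches the designer's target.

```python
import numpy as np

def solve_mdp(R, P, gamma=0.9, tol=1e-8):
    """Lower level: value iteration on a tabular MDP.
    R[s, a] = reward, P[a, s, t] = transition probability s -> t under a.
    Returns the greedy policy and its state values."""
    V = np.zeros(R.shape[0])
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1), V_new
        V = V_new

def adaptive_design(R_base, P, target_policy, taxable_action,
                    step=0.1, max_iters=200):
    """Upper level: predict the agent's reaction (solve the MDP under the
    current design), then adapt the design parameter theta -- here a flat
    tax on one action -- until the predicted policy matches the target."""
    theta = 0.0
    policy = None
    for _ in range(max_iters):
        R = R_base.copy()
        R[:, taxable_action] -= theta      # redesigned reward model
        policy, _ = solve_mdp(R, P)        # predicted agent reaction
        if np.array_equal(policy, target_policy):
            return theta, policy
        theta += step                      # update design, re-predict
    return theta, policy

# Hypothetical 2-state, 2-action example: action 1 is privately better
# (reward 1.0 vs 0.5), but the designer wants action 0 everywhere.
R_base = np.array([[0.5, 1.0], [0.5, 1.0]])
P = np.full((2, 2, 2), 0.5)               # identical uniform transitions
target = np.array([0, 0])
theta, policy = adaptive_design(R_base, P, target, taxable_action=1)
```

Here the tax must offset the private reward gap of 0.5 before the agent switches, so the loop stops once theta passes that threshold. The paper's actual method handles general reward/transition redesigns via a bilevel program; this sketch only mirrors its predict-then-update loop.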

 

DC Field | Value | Language
dc.contributor.author | Chen, Siyu | -
dc.contributor.author | Yang, Donglin | -
dc.contributor.author | Li, Jiayang | -
dc.contributor.author | Wang, Senmiao | -
dc.contributor.author | Yang, Zhuoran | -
dc.contributor.author | Wang, Zhaoran | -
dc.date.accessioned | 2024-11-20T03:56:28Z | -
dc.date.available | 2024-11-20T03:56:28Z | -
dc.date.issued | 2022 | -
dc.identifier.citation | Proceedings of Machine Learning Research, 2022, v. 162, p. 3679-3700 | -
dc.identifier.uri | http://hdl.handle.net/10722/351469 | -
dc.description.abstract | In a Markov decision process (MDP), an agent interacts with the environment via perceptions and actions. During this process, the agent aims to maximize its own gain. Hence, appropriate regulations are often required, if we hope to take the external costs/benefits of its actions into consideration. In this paper, we study how to regulate such an agent by redesigning model parameters that can affect the rewards and/or the transition kernels. We formulate this problem as a bilevel program, in which the lower-level MDP is regulated by the upper-level model designer. To solve the resulting problem, we develop a scheme that allows the designer to iteratively predict the agent's reaction by solving the MDP and then adaptively update model parameters based on the predicted reaction. The algorithm is first theoretically analyzed and then empirically tested on several MDP models arising in economics and robotics. | -
dc.language | eng | -
dc.relation.ispartof | Proceedings of Machine Learning Research | -
dc.title | Adaptive Model Design for Markov Decision Process | -
dc.type | Conference_Paper | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.scopus | eid_2-s2.0-85163067834 | -
dc.identifier.volume | 162 | -
dc.identifier.spage | 3679 | -
dc.identifier.epage | 3700 | -
dc.identifier.eissn | 2640-3498 | -
