Article: Hierarchical Adversarial Inverse Reinforcement Learning

Title: Hierarchical Adversarial Inverse Reinforcement Learning
Authors: Chen, Jiayu; Lan, Tian; Aggarwal, Vaneet
Keywords: hierarchical imitation learning (HIL); inverse reinforcement learning (IRL); robotic learning
Issue Date: 2024
Citation: IEEE Transactions on Neural Networks and Learning Systems, 2024, v. 35, n. 12, p. 17549-17558
Abstract: Imitation learning (IL) has been proposed to recover the expert policy from demonstrations. However, it is difficult to learn a single monolithic policy for highly complex long-horizon tasks, for which the expert policy usually contains subtask hierarchies. Therefore, hierarchical IL (HIL) has been developed to learn a hierarchical policy from expert demonstrations by explicitly modeling the activity structure of a task with the option framework. Existing HIL methods either overlook the causal relationship between the subtask structure and the learned policy, or fail to learn the high-level and low-level policies of the hierarchical framework jointly, which leads to suboptimality. In this work, we propose a novel HIL algorithm, hierarchical adversarial inverse reinforcement learning (H-AIRL), which extends a state-of-the-art (SOTA) IL algorithm, AIRL, with the one-step option framework. Specifically, we redefine the AIRL objectives on the extended state and action spaces, and further introduce a directed information term into the objective function to enhance the causality between the low-level policy and its corresponding subtask. Moreover, we propose an expectation-maximization (EM) adaptation of our algorithm so that it can be applied to expert demonstrations without subtask annotations, which are more accessible in practice. Theoretical justifications of our algorithm design and evaluations on challenging robotic control tasks are provided to show the superiority of our algorithm over SOTA HIL baselines. The code is available at https://github.com/LucasCJYSDL/HierAIRL.
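
As a rough illustration of the objective described in the abstract, the following is a minimal sketch assuming the standard AIRL discriminator and a one-step option framework in which the state is augmented with the previous option and the action with the current option. The symbols (extended state \tilde{s}_t, extended action \tilde{a}_t, discriminator D_\theta, potential f_\theta, hierarchical policy \tilde{\pi}, regularizer weight \lambda, and directed-information term \mathcal{L}_{\mathrm{DI}}) are illustrative notation, not taken from the paper.

% Extended state/action under the one-step option framework (assumed notation):
% the state carries the previous option z_{t-1}, the action carries the current option z_t.
\[
  \tilde{s}_t = (s_t,\, z_{t-1}), \qquad \tilde{a}_t = (z_t,\, a_t)
\]
% Standard AIRL-style discriminator, redefined on the extended spaces:
\[
  D_{\theta}(\tilde{s}, \tilde{a}, \tilde{s}') =
    \frac{\exp f_{\theta}(\tilde{s}, \tilde{a}, \tilde{s}')}
         {\exp f_{\theta}(\tilde{s}, \tilde{a}, \tilde{s}') + \tilde{\pi}(\tilde{a} \mid \tilde{s})}
\]
% Hierarchical policy objective: the usual adversarial IL reward plus a weighted
% directed-information regularizer tying each subtask (option) to its low-level behavior.
\[
  \max_{\tilde{\pi}} \;
    \mathbb{E}_{\tilde{\pi}}\!\Big[\sum_t \log D_{\theta}(\tilde{s}_t, \tilde{a}_t, \tilde{s}_{t+1})
      - \log\big(1 - D_{\theta}(\tilde{s}_t, \tilde{a}_t, \tilde{s}_{t+1})\big)\Big]
    \;+\; \lambda\, \mathcal{L}_{\mathrm{DI}}
\]

The EM adaptation mentioned in the abstract would presumably treat the missing subtask annotations as latent variables, alternating between inferring them from the demonstrations (E-step) and updating the hierarchical policy and discriminator (M-step).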
Persistent Identifier: http://hdl.handle.net/10722/361757
ISSN: 2162-237X
2023 Impact Factor: 10.2
2023 SCImago Journal Rankings: 4.170

 

DC Field | Value | Language
dc.contributor.author | Chen, Jiayu | -
dc.contributor.author | Lan, Tian | -
dc.contributor.author | Aggarwal, Vaneet | -
dc.date.accessioned | 2025-09-16T04:19:44Z | -
dc.date.available | 2025-09-16T04:19:44Z | -
dc.date.issued | 2024 | -
dc.identifier.citation | IEEE Transactions on Neural Networks and Learning Systems, 2024, v. 35, n. 12, p. 17549-17558 | -
dc.identifier.issn | 2162-237X | -
dc.identifier.uri | http://hdl.handle.net/10722/361757 | -
dc.description.abstract | Imitation learning (IL) has been proposed to recover the expert policy from demonstrations. However, it is difficult to learn a single monolithic policy for highly complex long-horizon tasks, for which the expert policy usually contains subtask hierarchies. Therefore, hierarchical IL (HIL) has been developed to learn a hierarchical policy from expert demonstrations by explicitly modeling the activity structure of a task with the option framework. Existing HIL methods either overlook the causal relationship between the subtask structure and the learned policy, or fail to learn the high-level and low-level policies of the hierarchical framework jointly, which leads to suboptimality. In this work, we propose a novel HIL algorithm, hierarchical adversarial inverse reinforcement learning (H-AIRL), which extends a state-of-the-art (SOTA) IL algorithm, AIRL, with the one-step option framework. Specifically, we redefine the AIRL objectives on the extended state and action spaces, and further introduce a directed information term into the objective function to enhance the causality between the low-level policy and its corresponding subtask. Moreover, we propose an expectation-maximization (EM) adaptation of our algorithm so that it can be applied to expert demonstrations without subtask annotations, which are more accessible in practice. Theoretical justifications of our algorithm design and evaluations on challenging robotic control tasks are provided to show the superiority of our algorithm over SOTA HIL baselines. The code is available at https://github.com/LucasCJYSDL/HierAIRL. | -
dc.language | eng | -
dc.relation.ispartof | IEEE Transactions on Neural Networks and Learning Systems | -
dc.subject | hierarchical imitation learning (HIL) | -
dc.subject | Inverse reinforcement learning (IRL) | -
dc.subject | robotic learning | -
dc.title | Hierarchical Adversarial Inverse Reinforcement Learning | -
dc.type | Article | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1109/TNNLS.2023.3305983 | -
dc.identifier.pmid | 37703157 | -
dc.identifier.scopus | eid_2-s2.0-85171752659 | -
dc.identifier.volume | 35 | -
dc.identifier.issue | 12 | -
dc.identifier.spage | 17549 | -
dc.identifier.epage | 17558 | -
dc.identifier.eissn | 2162-2388 | -
