SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution

Liang, Zhixuan; Mu, Yao; Ma, Hengbo; Tomizuka, Masayoshi; Ding, Mingyu; Luo, Ping

Appears in Collections:
- Computer Science: Conference papers
- HKU Musketeers Foundation Institute of Data Science: Conference papers

Conference Paper: SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution

Title	SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
Authors	Liang, Zhixuan; Mu, Yao; Ma, Hengbo; Tomizuka, Masayoshi; Ding, Mingyu; Luo, Ping
Issue Date	17-Jun-2024
Abstract	Diffusion models have demonstrated strong potential for robotic trajectory planning. However, generating coherent trajectories from high-level instructions remains challenging, especially for long-range composition tasks requiring multiple sequential skills. We propose SkillDiffuser, an end-to-end hierarchical planning framework integrating interpretable skill learning with conditional diffusion planning to address this problem. At the higher level, the skill abstraction module learns discrete, human-understandable skill representations from visual observations and language instructions. These learned skill embeddings are then used to condition the diffusion model to generate customized latent trajectories aligned with the skills. This allows generating diverse state trajectories that adhere to the learnable skills. By integrating skill learning with conditional trajectory generation, SkillDiffuser produces coherent behavior following abstract instructions across diverse tasks. Experiments on multi-task robotic manipulation benchmarks such as Meta-World and LOReL demonstrate state-of-the-art performance and human-interpretable skill representations from SkillDiffuser. More visualization results and information can be found on our website.
Persistent Identifier	http://hdl.handle.net/10722/347564

DC Field	Value	Language
dc.contributor.author	Liang, Zhixuan	-
dc.contributor.author	Mu, Yao	-
dc.contributor.author	Ma, Hengbo	-
dc.contributor.author	Tomizuka, Masayoshi	-
dc.contributor.author	Ding, Mingyu	-
dc.contributor.author	Luo, Ping	-
dc.date.accessioned	2024-09-25T00:30:46Z	-
dc.date.available	2024-09-25T00:30:46Z	-
dc.date.issued	2024-06-17	-
dc.identifier.uri	http://hdl.handle.net/10722/347564	-
dc.description.abstract	Diffusion models have demonstrated strong potential for robotic trajectory planning. However, generating coherent trajectories from high-level instructions remains challenging, especially for long-range composition tasks requiring multiple sequential skills. We propose SkillDiffuser, an end-to-end hierarchical planning framework integrating interpretable skill learning with conditional diffusion planning to address this problem. At the higher level, the skill abstraction module learns discrete, human-understandable skill representations from visual observations and language instructions. These learned skill embeddings are then used to condition the diffusion model to generate customized latent trajectories aligned with the skills. This allows generating diverse state trajectories that adhere to the learnable skills. By integrating skill learning with conditional trajectory generation, SkillDiffuser produces coherent behavior following abstract instructions across diverse tasks. Experiments on multi-task robotic manipulation benchmarks such as Meta-World and LOReL demonstrate state-of-the-art performance and human-interpretable skill representations from SkillDiffuser. More visualization results and information can be found on our website.	-
dc.language	eng	-
dc.relation.ispartof	2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (17/06/2024-21/06/2024, Seattle)	-
dc.title	SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution	-
dc.type	Conference_Paper	-
