
Conference Paper: SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation.

Title: SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation.
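Authors: Wang, Wenjia; Pan, Liang; Dou, Zhiyang; Mei, Jidong; Liao, Zhouyingcheng; Lou, Yuke; Wu, Yifan; Lei, Yang; Wang, Jingbo; Komura, Taku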
Issue Date: 19-Oct-2025
Abstract

Simulating stylized human-scene interactions (HSI) in physical environments is a challenging yet fascinating task. Prior works emphasize long-term execution but fall short in achieving both diverse style and physical plausibility. To tackle this challenge, we introduce a novel hierarchical framework named SIMS that seamlessly bridges high-level script-driven intent with a low-level control policy, enabling more expressive and diverse human-scene interactions. Specifically, we employ Large Language Models with Retrieval-Augmented Generation (RAG) to generate coherent and diverse long-form scripts, providing a rich foundation for motion planning. A versatile multi-condition physics-based control policy is also developed, which leverages text embeddings from the generated scripts to encode stylistic cues while simultaneously perceiving environmental geometries and accomplishing task goals. By integrating the retrieval-augmented script generation with the multi-condition controller, our approach provides a unified solution for generating stylized HSI motions. We further introduce a comprehensive planning dataset produced by RAG and a stylized motion dataset featuring diverse locomotions and interactions. Extensive experiments demonstrate SIMS’s effectiveness in executing various tasks and generalizing across different scenarios, significantly outperforming previous methods. Project page: https://wenjiawang0312.github.io/projects/sims/.


Persistent Identifier: http://hdl.handle.net/10722/362730

 

DC Field | Value | Language
dc.contributor.author | Wang, Wenjia | -
dc.contributor.author | Pan, Liang | -
dc.contributor.author | Dou, Zhiyang | -
dc.contributor.author | Mei, Jidong | -
dc.contributor.author | Liao, Zhouyingcheng | -
dc.contributor.author | Lou, Yuke | -
dc.contributor.author | Wu, Yifan | -
dc.contributor.author | Lei, Yang | -
dc.contributor.author | Wang, Jingbo | -
dc.contributor.author | Komura, Taku | -
dc.date.accessioned | 2025-09-27T00:35:27Z | -
dc.date.available | 2025-09-27T00:35:27Z | -
dc.date.issued | 2025-10-19 | -
dc.identifier.uri | http://hdl.handle.net/10722/362730 | -
dc.description.abstract | Simulating stylized human-scene interactions (HSI) in physical environments is a challenging yet fascinating task. Prior works emphasize long-term execution but fall short in achieving both diverse style and physical plausibility. To tackle this challenge, we introduce a novel hierarchical framework named SIMS that seamlessly bridges high-level script-driven intent with a low-level control policy, enabling more expressive and diverse human-scene interactions. Specifically, we employ Large Language Models with Retrieval-Augmented Generation (RAG) to generate coherent and diverse long-form scripts, providing a rich foundation for motion planning. A versatile multi-condition physics-based control policy is also developed, which leverages text embeddings from the generated scripts to encode stylistic cues while simultaneously perceiving environmental geometries and accomplishing task goals. By integrating the retrieval-augmented script generation with the multi-condition controller, our approach provides a unified solution for generating stylized HSI motions. We further introduce a comprehensive planning dataset produced by RAG and a stylized motion dataset featuring diverse locomotions and interactions. Extensive experiments demonstrate SIMS’s effectiveness in executing various tasks and generalizing across different scenarios, significantly outperforming previous methods. Project page: https://wenjiawang0312.github.io/projects/sims/. | -
dc.language | eng | -
dc.relation.ispartof | International Conference on Computer Vision (ICCV) (19/10/2025-23/10/2025, Honolulu, Hawai'i) | -
dc.title | SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation. | -
dc.type | Conference_Paper | -
dc.description.nature | preprint | -
