RoomDesigner: Encoding Anchor-latents for Style-consistent and Shape-compatible Indoor Scene Generation

Zhao, Yiqun; Zhao, Zibo; Li, Jing; Dong, Sixun; Gao, Shenghua

Publisher Website: 10.1109/3DV62453.2024.00132
Scopus: eid_2-s2.0-85196763619

Supplementary

Citations:
- Scopus: 0
Appears in Collections:
- Computer Science: Conference papers

Conference Paper: RoomDesigner: Encoding Anchor-latents for Style-consistent and Shape-compatible Indoor Scene Generation

Title	RoomDesigner: Encoding Anchor-latents for Style-consistent and Shape-compatible Indoor Scene Generation
Authors	Zhao, Yiqun; Zhao, Zibo; Li, Jing; Dong, Sixun; Gao, Shenghua
Keywords	3D Scene Generation
Issue Date	2024
Citation	Proceedings - 2024 International Conference on 3D Vision, 3DV 2024, 2024, p. 1413-1423. DOI: http://dx.doi.org/10.1109/3DV62453.2024.00132
Abstract	Indoor scene generation aims at creating shape-compatible, style-consistent furniture arrangements within a spatially reasonable layout. However, most existing approaches primarily focus on generating plausible furniture layouts without incorporating specific details related to individual furniture. To address this limitation, we propose a two-stage model that integrates shape priors into indoor scene generation by encoding furniture as anchor-latent representations. In the first stage, we employ discrete vector quantization to encode each piece of furniture as anchor-latents. Based on the anchor-latent representation, the shape and location information of each piece of furniture is characterized by a concatenation of location, size, orientation, class, and our anchor latent. In the second stage, we leverage a transformer model to predict indoor scene configurations autoregressively. Thanks to the proposed anchor-latent representations, our generative model can synthesize furniture in diverse shapes and produce physically plausible arrangements with shape-compatible and style-consistent furniture. Furthermore, our method facilitates various human interaction applications, such as style-consistent scene completion, object mismatch correction, and controllable object-level editing. Experimental results on the 3D-Front dataset demonstrate that our approach can generate more consistent and compatible indoor scenes compared to existing methods, even without shape retrieval. Additionally, extensive ablation studies confirm the effectiveness of our design choices in the indoor scene generation model.
Persistent Identifier	http://hdl.handle.net/10722/345390
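
The per-object representation described in the abstract (a concatenation of location, size, orientation, class, and anchor latent) can be illustrated with a minimal sketch. This is not the authors' implementation: the class list, the feature and codebook dimensions, the scalar orientation angle, and the names `quantize`, `Furniture`, and `encode_object` are all hypothetical, and a single nearest codebook vector stands in for the anchor latent. The second-stage autoregressive transformer is not shown.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical label set; the record does not list the paper's classes.
CLASSES = ["bed", "nightstand", "wardrobe"]


def quantize(feature: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry nearest to `feature`
    (squared Euclidean distance), i.e. a basic vector-quantization lookup."""
    def dist2(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: dist2(feature, codebook[i]))


@dataclass
class Furniture:
    location: Sequence[float]       # assumed (x, y, z) position
    size: Sequence[float]           # assumed bounding-box extents
    orientation: float              # assumed single rotation angle, radians
    class_name: str                 # must be one of CLASSES
    shape_feature: Sequence[float]  # assumed continuous shape embedding


def encode_object(obj: Furniture, codebook: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate location, size, orientation, one-hot class, and the
    quantized shape vector, in the order the abstract lists them."""
    one_hot = [1.0 if c == obj.class_name else 0.0 for c in CLASSES]
    anchor = codebook[quantize(obj.shape_feature, codebook)]
    return [*obj.location, *obj.size, obj.orientation, *one_hot, *anchor]


# Toy usage with a two-entry, two-dimensional codebook.
codebook = [[0.0, 0.0], [1.0, 1.0]]
bed = Furniture((1.0, 0.0, 2.0), (2.0, 0.5, 1.8), 0.0, "bed", [0.9, 0.8])
vector = encode_object(bed, codebook)
```

In this toy setup each object becomes one fixed-length vector, so a scene is simply a sequence of such vectors that a sequence model could consume one object at a time.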

DC Field	Value	Language
dc.contributor.author	Zhao, Yiqun	-
dc.contributor.author	Zhao, Zibo	-
dc.contributor.author	Li, Jing	-
dc.contributor.author	Dong, Sixun	-
dc.contributor.author	Gao, Shenghua	-
dc.date.accessioned	2024-08-15T09:27:02Z	-
dc.date.available	2024-08-15T09:27:02Z	-
dc.date.issued	2024	-
dc.identifier.citation	Proceedings - 2024 International Conference on 3D Vision, 3DV 2024, 2024, p. 1413-1423	-
dc.identifier.uri	http://hdl.handle.net/10722/345390	-
dc.description.abstract	Indoor scene generation aims at creating shape-compatible, style-consistent furniture arrangements within a spatially reasonable layout. However, most existing approaches primarily focus on generating plausible furniture layouts without incorporating specific details related to individual furniture. To address this limitation, we propose a two-stage model that integrates shape priors into indoor scene generation by encoding furniture as anchor-latent representations. In the first stage, we employ discrete vector quantization to encode each piece of furniture as anchor-latents. Based on the anchor-latent representation, the shape and location information of each piece of furniture is characterized by a concatenation of location, size, orientation, class, and our anchor latent. In the second stage, we leverage a transformer model to predict indoor scene configurations autoregressively. Thanks to the proposed anchor-latent representations, our generative model can synthesize furniture in diverse shapes and produce physically plausible arrangements with shape-compatible and style-consistent furniture. Furthermore, our method facilitates various human interaction applications, such as style-consistent scene completion, object mismatch correction, and controllable object-level editing. Experimental results on the 3D-Front dataset demonstrate that our approach can generate more consistent and compatible indoor scenes compared to existing methods, even without shape retrieval. Additionally, extensive ablation studies confirm the effectiveness of our design choices in the indoor scene generation model.	-
dc.language	eng	-
dc.relation.ispartof	Proceedings - 2024 International Conference on 3D Vision, 3DV 2024	-
dc.subject	3D Scene Generation	-
dc.title	RoomDesigner: Encoding Anchor-latents for Style-consistent and Shape-compatible Indoor Scene Generation	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/3DV62453.2024.00132	-
dc.identifier.scopus	eid_2-s2.0-85196763619	-
dc.identifier.spage	1413	-
dc.identifier.epage	1423	-
