Anisotropic Convolutional Neural Networks for RGB-D based Semantic Scene Completion

Li, Jie; Wang, Peng; Han, Kai; Liu, Yu

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/TPAMI.2021.3081499
Scopus: eid_2-s2.0-85106746781
WOS: WOS:000864325900061

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Statistics & Actuarial Science: Journal/Magazine Articles

Article: Anisotropic Convolutional Neural Networks for RGB-D based Semantic Scene Completion

Title	Anisotropic Convolutional Neural Networks for RGB-D based Semantic Scene Completion
Authors	Li, Jie; Wang, Peng; Han, Kai; Liu, Yu
Keywords	3D scene understanding; anisotropic convolution; Context modeling; Convolution; dimensional decomposition convolution; Kernel; semantic scene completion; Semantics; Solid modeling; Task analysis; Three-dimensional displays
Issue Date	2021
Citation	IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021. DOI: http://dx.doi.org/10.1109/TPAMI.2021.3081499
Abstract	Semantic Scene Completion (SSC) is a computer vision task that aims to simultaneously infer the occupancy and semantic label of each voxel in a scene from partial information consisting of a depth image and/or an RGB image. As SSC is a voxel-wise labeling task, the key is to effectively model the visual and geometrical variations needed to complete the scene. To this end, we propose the Anisotropic Network, with novel convolutional modules that can model varying anisotropic receptive fields voxel by voxel in a computationally efficient manner. The basic idea for achieving such anisotropy is to decompose 3D convolution into consecutive dimensional convolutions and to determine the dimension-wise kernels on the fly. One module, termed kernel-selection anisotropic convolution, adaptively selects the optimal kernel for each dimensional convolution from a set of candidate kernels; the other module, termed kernel-modulation anisotropic convolution, modulates a single kernel for each dimension to derive a more flexible receptive field. By stacking multiple such modules, the 3D context modeling capability and flexibility can be further enhanced. Moreover, we present a new end-to-end trainable framework for the SSC task that avoids the expensive TSDF pre-processing used in existing methods. Extensive experiments on SSC benchmarks show the advantage of the proposed methods.
Persistent Identifier	http://hdl.handle.net/10722/311518
ISSN	0162-8828
2023 Impact Factor	20.8
2023 SCImago Journal Rankings	6.158
ISI Accession Number ID	WOS:000864325900061
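
The dimensional decomposition described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the softmax-weighted ("soft") selection among candidate kernels, and the externally supplied per-voxel `logits` are all assumptions made for illustration, and a single-channel volume is used for brevity. The sketch applies one 1D convolution per spatial axis in sequence and, at each voxel, blends the responses of several candidate 1D kernels.

```python
import numpy as np


def conv1d_along_axis(volume, kernel, axis):
    """Zero-padded 'same' 1D correlation of a 3D volume along one axis.

    Assumes an odd kernel length so the output keeps the input shape.
    """
    pad = len(kernel) // 2
    pad_width = [(0, 0)] * volume.ndim
    pad_width[axis] = (pad, pad)
    padded = np.pad(volume, pad_width)  # constant zero padding by default
    n = volume.shape[axis]
    out = np.zeros(volume.shape, dtype=float)
    for offset, weight in enumerate(kernel):
        # Shifted view of the padded volume, scaled by one kernel tap.
        out += weight * np.take(padded, np.arange(offset, offset + n), axis=axis)
    return out


def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def kernel_selection_conv(volume, candidates, logits):
    """Hypothetical soft kernel selection over three consecutive 1D convolutions.

    candidates: three lists (one per axis) of K 1D kernels each.
    logits:     array of shape (3, K, D, H, W); in a real network these would
                be predicted from the features, here they are simply given.
    """
    out = np.asarray(volume, dtype=float)
    for axis in range(3):
        weights = softmax(logits[axis], axis=0)  # (K, D, H, W), sums to 1 per voxel
        responses = np.stack(
            [conv1d_along_axis(out, k, axis) for k in candidates[axis]]
        )
        out = (weights * responses).sum(axis=0)  # per-voxel blend of candidates
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vol = rng.normal(size=(8, 8, 8))
    # Two candidate kernels per axis with different extents (sizes 3 and 5).
    cands = [[np.ones(3) / 3, np.ones(5) / 5] for _ in range(3)]
    logits = rng.normal(size=(3, 2, 8, 8, 8))
    print(kernel_selection_conv(vol, cands, logits).shape)
```

Because the selection weights differ per voxel and per axis, the effective receptive field can be elongated along one dimension at one location and along another elsewhere, while each step costs only a 1D convolution rather than a full 3D one.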

DC Field	Value	Language
dc.contributor.author	Li, Jie	-
dc.contributor.author	Wang, Peng	-
dc.contributor.author	Han, Kai	-
dc.contributor.author	Liu, Yu	-
dc.date.accessioned	2022-03-22T11:54:08Z	-
dc.date.available	2022-03-22T11:54:08Z	-
dc.date.issued	2021	-
dc.identifier.citation	IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021	-
dc.identifier.issn	0162-8828	-
dc.identifier.uri	http://hdl.handle.net/10722/311518	-
dc.description.abstract	Semantic Scene Completion (SSC) is a computer vision task that aims to simultaneously infer the occupancy and semantic label of each voxel in a scene from partial information consisting of a depth image and/or an RGB image. As SSC is a voxel-wise labeling task, the key is to effectively model the visual and geometrical variations needed to complete the scene. To this end, we propose the Anisotropic Network, with novel convolutional modules that can model varying anisotropic receptive fields voxel by voxel in a computationally efficient manner. The basic idea for achieving such anisotropy is to decompose 3D convolution into consecutive dimensional convolutions and to determine the dimension-wise kernels on the fly. One module, termed kernel-selection anisotropic convolution, adaptively selects the optimal kernel for each dimensional convolution from a set of candidate kernels; the other module, termed kernel-modulation anisotropic convolution, modulates a single kernel for each dimension to derive a more flexible receptive field. By stacking multiple such modules, the 3D context modeling capability and flexibility can be further enhanced. Moreover, we present a new end-to-end trainable framework for the SSC task that avoids the expensive TSDF pre-processing used in existing methods. Extensive experiments on SSC benchmarks show the advantage of the proposed methods.	-
dc.language	eng	-
dc.relation.ispartof	IEEE Transactions on Pattern Analysis and Machine Intelligence	-
dc.subject	3D scene understanding	-
dc.subject	anisotropic convolution	-
dc.subject	Context modeling	-
dc.subject	Convolution	-
dc.subject	dimensional decomposition convolution	-
dc.subject	Kernel	-
dc.subject	semantic scene completion	-
dc.subject	Semantics	-
dc.subject	Solid modeling	-
dc.subject	Task analysis	-
dc.subject	Three-dimensional displays	-
dc.title	Anisotropic Convolutional Neural Networks for RGB-D based Semantic Scene Completion	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/TPAMI.2021.3081499	-
dc.identifier.scopus	eid_2-s2.0-85106746781	-
dc.identifier.eissn	1939-3539	-
dc.identifier.isi	WOS:000864325900061	-
