Appears in Collections: postgraduate thesis: Neural 3D object reconstruction
Title | Neural 3D object reconstruction |
---|---|
Authors | Liu, Yuan (劉緣) |
Issue Date | 2024 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Liu, Y. [劉緣]. (2024). Neural 3D object reconstruction. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Degree | Doctor of Philosophy |
Subject | Image reconstruction; Three-dimensional imaging |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/345422 |
Abstract

Object reconstruction and recognition are important tasks in various 3D applications such as AR/VR, autonomous driving, and robotics, and have been active research fields for decades. While this research has achieved great progress in recent years, it remains challenging to accurately and holistically reconstruct arbitrary objects from casually captured images. Moreover, downstream 3D tasks require not only reconstructing the geometry of an object but also localizing, rendering, or relighting it. In this thesis, we present a framework to holistically reconstruct objects from casually captured images. The framework starts from pose estimation of the captured images and enables rendering novel views of the object, reconstructing both its geometry and its material, and localizing its position and orientation in new images.
For accurate estimation of the relative camera poses between casually captured images, we exploit the motion coherence property between two images to filter out erroneous correspondences. We analyze motion coherence on a graph and integrate this graph-based analysis into a learning-based framework for correspondence pruning. Finally, we estimate relative camera poses from the retained reliable correspondences for various downstream tasks.
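As a rough sketch of the motion-coherence idea (not the thesis's actual learning-based pruner), the following hypothetical Python snippet keeps a correspondence only if its motion vector agrees with those of its neighbors on a k-nearest-neighbor graph; the function name and the `k` and `tol` parameters are illustrative assumptions.

```python
import numpy as np

def prune_by_motion_coherence(kps1, kps2, k=8, tol=0.3):
    """Keep correspondences whose motion agrees with their graph neighbors.

    kps1, kps2: (N, 2) arrays of matched keypoint coordinates in images 1/2.
    k: number of nearest neighbors in the coherence graph.
    tol: maximum allowed relative deviation from the neighborhood motion.
    """
    motion = kps2 - kps1                        # per-match motion vectors
    # Build a k-NN graph over keypoint locations in image 1.
    d = np.linalg.norm(kps1[:, None, :] - kps1[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nbrs = np.argsort(d, axis=1)[:, :k]         # (N, k) neighbor indices
    # Motion coherence: a correct match should move like its neighbors.
    nbr_motion = motion[nbrs].mean(axis=1)      # (N, 2) neighborhood average
    dev = np.linalg.norm(motion - nbr_motion, axis=1)
    scale = np.linalg.norm(nbr_motion, axis=1) + 1e-6
    return dev / scale < tol                    # boolean inlier mask

# Usage: mask = prune_by_motion_coherence(kps1, kps2); inliers = kps1[mask]
```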
Given images with known poses, we study the problem of synthesizing novel-view images of objects. We propose a new ray-based representation, which constructs a neural radiance field on the fly for novel-view synthesis. To handle artifacts caused by self-occlusions of the object, we propose a visibility-aware rendering scheme that greatly improves the rendering quality of objects.
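For context, radiance-field methods of this kind typically composite color along each camera ray by volume rendering, and a visibility-aware scheme can further down-weight source views that cannot see a sample point. A generic form (illustrative notation, not the thesis's exact formulation):

```latex
% Volume rendering of N samples along a ray r (generic NeRF-style form)
\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i,
\qquad T_i = \exp\Bigl(-\sum_{j<i} \sigma_j \delta_j\Bigr),

% with sample colors aggregated from source views v, weighted by an
% estimated visibility V_v of the sample point x_i in view v:
\mathbf{c}_i = \frac{\sum_{v} V_v(\mathbf{x}_i)\, \mathbf{c}_v(\mathbf{x}_i)}
                    {\sum_{v} V_v(\mathbf{x}_i)}.
```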
With the posed images of an object, we further estimate the position and orientation of this object in a new image of the same object in a new environment, a task known as six-degree-of-freedom (6DoF) object pose estimation. In contrast to previous 6DoF pose estimators, which require tedious retraining for different objects or categories, our estimator can be applied to arbitrary objects without retraining and only requires the posed images of the object.
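Concretely, a 6DoF pose comprises a 3D rotation and a 3D translation; recovering it means finding the rigid transform that maps object coordinates into the new image's camera frame:

```latex
% 6DoF pose: rotation R and translation t (3 DoF each) mapping object
% points X into the camera frame, observed through intrinsics K.
(R, \mathbf{t}) \in \mathrm{SE}(3), \qquad
\mathbf{x} \simeq K \bigl( R\,\mathbf{X} + \mathbf{t} \bigr).
```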
To reconstruct high-quality 3D meshes from posed images, we propose a neural representation-based method to reconstruct objects. This neural representation enables accurate reconstruction of highly reflective objects by building a neural Signed Distance Field (SDF) from the multiview images. To achieve this, our method combines the evaluation of the rendering equation with volume rendering of the neural SDF, which also enables us to estimate the materials of the objects.
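In outline, this couples two standard ingredients: the rendering equation for reflected radiance at a surface point, and a conversion from signed distance to opacity so the result can be composited by volume rendering (illustrative forms; the exact parameterization is method-specific):

```latex
% Rendering equation: outgoing radiance at point x in direction w_o,
% with the BRDF f_r carrying the (learned) material parameters.
L_o(\mathbf{x}, \omega_o) = \int_{\Omega}
  f_r(\mathbf{x}, \omega_i, \omega_o)\, L_i(\mathbf{x}, \omega_i)\,
  (\mathbf{n} \cdot \omega_i)\, \mathrm{d}\omega_i,

% NeuS-style mapping of the signed distance f(x) to an opacity function,
% which lets L_o be composited by volume rendering along camera rays:
\Phi_s\bigl(f(\mathbf{x})\bigr) = \bigl(1 + e^{-s\, f(\mathbf{x})}\bigr)^{-1}.
```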
Multiview geometry reconstruction often requires dense images of the object, and these reconstruction methods fail when only one image is available. We further extend a diffusion model to generate multiview-consistent images at novel viewpoints, which enables us to reconstruct the geometry of an object even from a single image.
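For reference, such viewpoint-conditioned diffusion models are typically trained with a denoising objective; an illustrative form, conditioned on the input image y and a target camera pose π (notation assumed, not the thesis's exact loss):

```latex
% Denoising objective: x_t is the noised target view at timestep t,
% y the single input image, \pi the target camera pose.
\mathcal{L} = \mathbb{E}_{\mathbf{x}_0,\, \boldsymbol{\epsilon},\, t}
  \bigl\| \boldsymbol{\epsilon} -
  \boldsymbol{\epsilon}_\theta(\mathbf{x}_t,\, t,\, \mathbf{y},\, \pi) \bigr\|_2^2 .
```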
By integrating all the above methods, we achieve a comprehensive 3D reconstruction framework built on learning-based approaches and neural representations. This framework enables novel view synthesis, 6DoF pose estimation, and the estimation of object shape and material from casually captured or single-view images. It significantly expands the applications of image-based object reconstruction and holds promise for downstream tasks such as robotics, AR/VR, and autonomous driving. These advancements demonstrate the capacity of learning-based techniques and neural representations to advance the frontier of 3D object reconstruction.
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Liu, Yuan | - |
dc.contributor.author | 劉緣 | - |
dc.date.accessioned | 2024-08-26T08:59:41Z | - |
dc.date.available | 2024-08-26T08:59:41Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Liu, Y. [劉緣]. (2024). Neural 3D object reconstruction. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/345422 | - |
dc.description.abstract | (see Abstract above) | -
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Image reconstruction | - |
dc.subject.lcsh | Three-dimensional imaging | - |
dc.title | Neural 3D object reconstruction | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2024 | - |
dc.identifier.mmsid | 991044843667103414 | - |