
Postgraduate thesis: Neural 3D object reconstruction

Title: Neural 3D object reconstruction
Authors: Liu, Yuan [劉緣]
Issue Date: 2024
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Liu, Y. [劉緣]. (2024). Neural 3D object reconstruction. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Object reconstruction and recognition are important tasks in 3D applications such as AR/VR, autonomous driving, and robotics, and have been an active research field for decades. While this research has made great progress in recent years, it is still challenging to accurately and holistically reconstruct arbitrary objects from casually captured images. Moreover, downstream 3D tasks not only require reconstructing the geometry of the object but also need to localize, render, or relight it. In this thesis, we present a framework to holistically reconstruct objects from casually captured images. The framework starts from pose estimation of the captured images and enables rendering novel views of the object, reconstructing both its geometry and its material, and localizing the position and orientation of the object in new images.

For accurate estimation of the relative camera poses between casually captured images, we exploit the motion coherence between two images to filter out erroneous correspondences. We analyze motion coherence on a graph and integrate this graph-based analysis into a learning-based framework for correspondence pruning. We then estimate relative camera poses from the retained reliable correspondences for various downstream tasks.

Given images with known poses, we study the problem of synthesizing novel-view images of objects. We propose a new ray-based representation that constructs a neural radiance field on the fly for novel-view synthesis. To handle the artifacts caused by self-occlusions of the object, we propose a visibility-aware rendering scheme that greatly improves rendering quality.

With the posed images of an object, we further estimate the position and orientation of the same object in a new image taken in a new environment, which is called six-degree-of-freedom (6DoF) pose estimation. In contrast to previous 6DoF pose estimators, which require tedious retraining for different objects or categories, our estimator can be applied to arbitrary objects without retraining and only requires the posed images of the object.

To reconstruct high-quality 3D meshes from posed images, we propose a neural-representation-based reconstruction method. This representation enables accurate reconstruction of highly reflective objects by building a neural Signed Distance Field (SDF) from the multiview images. To achieve this, our method combines the evaluation of the rendering equation with the volume rendering of the neural SDF, which also enables us to estimate the materials of the objects.

Multiview geometry reconstruction often requires dense images captured around the object, and such methods fail when only one image is available. We therefore extend a diffusion model to generate multiview-consistent images at novel viewpoints, which enables us to reconstruct the geometry of the object even from a single image.

By integrating all the above methods, we obtain a comprehensive 3D reconstruction framework built on learning-based approaches and neural representations. The framework supports novel view synthesis, 6DoF pose estimation, and the estimation of object shape and material from casually captured images or a single image.
It significantly expands the applications of image-based object reconstruction and holds promise for various downstream tasks like robotics, AR/VR, and autonomous driving. These advancements demonstrate the capacity of learning-based techniques and neural representations to advance the frontier of 3D object reconstruction.
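Since the abstract describes the SDF-based reconstruction only at a high level, a short sketch may help make the volume-rendering step concrete. Everything below is illustrative rather than code from the thesis: the analytic sphere stands in for the learned neural SDF, the constant white color stands in for a learned radiance/material model, and the logistic-CDF mapping from signed distance to opacity is one common (NeuS-style) choice that the thesis does not necessarily use verbatim.

```python
import numpy as np

def sdf_sphere(points, center=np.zeros(3), radius=0.5):
    """Toy signed distance field standing in for the learned neural SDF."""
    return np.linalg.norm(points - center, axis=-1) - radius

def render_ray(origin, direction, n_samples=128, t_near=0.0, t_far=2.5, s=64.0):
    """Sketch of volume rendering a neural SDF along one ray.

    The signed distance at each sample is mapped through a logistic CDF with
    sharpness s; where the ray crosses the zero level set, the CDF falls from
    ~1 to ~0, which concentrates opacity at the surface.
    """
    t = np.linspace(t_near, t_far, n_samples)
    pts = origin + t[:, None] * direction      # (n_samples, 3) points on the ray
    sdf = sdf_sphere(pts)                      # signed distances at the samples
    cdf = 1.0 / (1.0 + np.exp(-s * sdf))       # logistic CDF of the signed distance
    # Discrete opacity between consecutive samples, clamped to [0, 1].
    alpha = np.clip((cdf[:-1] - cdf[1:]) / (cdf[:-1] + 1e-8), 0.0, 1.0)
    # Transmittance: probability that the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alpha)))[:-1]
    weights = trans * alpha                    # per-sample compositing weights
    color = np.ones(3)                         # stand-in radiance; a real model predicts this
    return weights.sum() * color               # composited pixel color

# A ray shot at the sphere yields a nearly opaque white pixel.
pixel = render_ray(np.array([0.0, 0.0, -1.5]), np.array([0.0, 0.0, 1.0]))
print(pixel)  # ~[1. 1. 1.]
```

In a full pipeline, networks would replace both stand-ins, the per-sample colors would come from evaluating the rendering equation as the abstract describes, and the same compositing weights could support the occlusion reasoning behind the visibility-aware rendering scheme.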
Degree: Doctor of Philosophy
Subject: Image reconstruction
Subject: Three-dimensional imaging
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/345422


DC Field: Value
dc.contributor.author: Liu, Yuan
dc.contributor.author: 劉緣
dc.date.accessioned: 2024-08-26T08:59:41Z
dc.date.available: 2024-08-26T08:59:41Z
dc.date.issued: 2024
dc.identifier.citation: Liu, Y. [劉緣]. (2024). Neural 3D object reconstruction. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/345422
dc.description.abstract: (same as Abstract above)
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Image reconstruction
dc.subject.lcsh: Three-dimensional imaging
dc.title: Neural 3D object reconstruction
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Computer Science
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2024
dc.identifier.mmsid: 991044843667103414
