
Postgraduate thesis: Learning-based 3D depth estimation and surface reconstruction from 2D images

Title: Learning-based 3D depth estimation and surface reconstruction from 2D images
Authors: Long, Xiaoxiao (龙霄潇)
Advisors: Komura, T; Wang, WP
Issue Date: 2023
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Long, X. [龙霄潇]. (2023). Learning-based 3D depth estimation and surface reconstruction from 2D images. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: The tremendous advancements in 3D applications, such as AR/VR, video games, autonomous driving, and robotics, have fueled extensive research in the field of 3D reconstruction from 2D images. While existing reconstruction methods have achieved impressive results, there is still room for improvement in challenging scenarios: reconstructing surfaces with weak textures, using a limited set of images as input, and modeling objects with open boundaries. This thesis focuses on harnessing learning-based techniques to address these challenges.

We observe that surfaces with weak textures are often planar and human-made, such as walls, floors, and tables. To improve depth estimation for such surfaces, we propose enforcing a surface normal constraint in learning-based methods, which significantly enhances the geometric accuracy of the estimated depth. Additionally, we introduce a geometry-aware surface normal calculation that adaptively determines reliable local geometry to approximate the surface normal, especially in regions with geometric variations.

When the input images are temporally coherent, such as frames of a video, we exploit the temporal information among the frames to enhance depth estimation. We propose an epipolar spatio-temporal transformer that explicitly incorporates this information based on multi-view epipolar geometry, yielding more accurate and temporally consistent depth maps.

In scenarios where only a limited set of images is available, existing reconstruction approaches often yield incomplete or distorted results. To overcome this limitation, we propose a neural rendering-based method that learns generalizable priors from the input images for generic geometry reasoning. These learned priors enable our method to reconstruct high-quality results with limited images. We also introduce a consistency-aware fine-tuning scheme, which enhances reconstruction details at low computational and time cost.

Furthermore, while recent neural rendering-based reconstruction methods have achieved impressive outcomes, they are typically limited to objects with closed surfaces because they use Signed Distance Functions (SDF) as the surface representation. To reconstruct surfaces with arbitrary topologies from 2D images, we propose representing surfaces as Unsigned Distance Functions (UDF) and develop a novel volume rendering scheme to learn the neural UDF representation. Our method enables high-quality reconstruction of non-closed shapes with complex topologies while achieving performance comparable to SDF-based methods on closed surfaces.

Through these techniques, we aim to advance 3D reconstruction by leveraging learning-based approaches to overcome challenges related to weak textures, limited image sets, and open boundaries. The outcomes of this research have the potential to enhance the quality and fidelity of 3D reconstructions, contributing to the development of various 3D applications and pushing the boundaries of what is achievable in reconstructing the 3D world from 2D images.
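The surface normal constraint above relies on recovering normals from depth. As a rough illustration only (this is a fixed-neighbourhood estimate, not the thesis's geometry-aware method, which adaptively selects the local geometry; all names and parameters are illustrative), per-pixel normals can be approximated by back-projecting pixels into camera space with the intrinsics and crossing the two local tangent vectors:

```python
import numpy as np

def normals_from_depth(depth, fx, fy, cx, cy):
    """Approximate per-pixel surface normals from a depth map by
    back-projecting neighbouring pixels to 3D and taking the cross
    product of the two local tangent vectors (forward differences)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project pixels to camera-space 3D points.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    pts = np.stack([x, y, depth], axis=-1)
    # Tangent vectors along image columns (du) and rows (dv).
    du = pts[:, 1:, :] - pts[:, :-1, :]
    dv = pts[1:, :, :] - pts[:-1, :, :]
    n = np.cross(du[:-1, :, :], dv[:, :-1, :])
    # Normalise to unit length.
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + 1e-8
    return n  # shape (h-1, w-1, 3)

# A fronto-parallel plane at constant depth should yield normals
# aligned with the optical (z) axis.
plane = np.full((8, 8), 2.0)
n = normals_from_depth(plane, fx=100.0, fy=100.0, cx=4.0, cy=4.0)
```

Near depth discontinuities this fixed-neighbourhood estimate degrades, which is exactly the failure mode an adaptive, geometry-aware normal calculation is designed to mitigate.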
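The SDF/UDF distinction underlying the final contribution can be made concrete: a signed distance function needs a well-defined inside and outside, which only closed surfaces provide, whereas an unsigned distance function is non-negative everywhere and therefore extends to open surfaces. A minimal sketch with illustrative (not thesis-specific) functions:

```python
import numpy as np

def sdf_sphere(p, radius=1.0):
    # Signed distance: negative inside, positive outside —
    # only meaningful for a closed surface.
    return np.linalg.norm(p, axis=-1) - radius

def udf_disk(p, radius=1.0):
    # Unsigned distance to an open disk in the z = 0 plane:
    # always >= 0, so no inside/outside decision is needed.
    xy = p[..., :2]
    r = np.linalg.norm(xy, axis=-1)
    dr = np.maximum(r - radius, 0.0)  # radial overshoot beyond the rim
    dz = np.abs(p[..., 2])            # height above/below the plane
    return np.hypot(dr, dz)

pts = np.array([[0.0, 0.0, 0.0],   # on the disk
                [0.0, 0.0, 0.5],   # above its interior
                [2.0, 0.0, 0.0]])  # beyond the rim, in-plane
```

Because a UDF never changes sign, standard zero-crossing surface extraction (e.g. marching cubes on an SDF) does not apply directly, which is one reason UDF-based pipelines need specialised rendering and meshing schemes.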
Degree: Doctor of Philosophy
Subject: Image processing -- Digital techniques; Three-dimensional imaging
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/330274


DC Field: Value
dc.contributor.advisor: Komura, T
dc.contributor.advisor: Wang, WP
dc.contributor.author: Long, Xiaoxiao
dc.contributor.author: 龙霄潇
dc.date.accessioned: 2023-08-31T09:18:24Z
dc.date.available: 2023-08-31T09:18:24Z
dc.date.issued: 2023
dc.identifier.citation: Long, X. [龙霄潇]. (2023). Learning-based 3D depth estimation and surface reconstruction from 2D images. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/330274
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Image processing -- Digital techniques
dc.subject.lcsh: Three-dimensional imaging
dc.title: Learning-based 3D depth estimation and surface reconstruction from 2D images
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Computer Science
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2023
dc.identifier.mmsid: 991044717470703414
