
postgraduate thesis: Studies on neural radiance field and its application

Title: Studies on neural radiance field and its application
Authors: Liu, Anran (劉安然)
Issue Date: 2024
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Liu, A. [劉安然]. (2024). Studies on neural radiance field and its application. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: The rapid emergence of implicit neural representations has opened up new possibilities for research in 3D vision, and the neural radiance field (NeRF) has led this shift. NeRF, along with its follow-up works, has greatly improved performance in novel view synthesis and 3D reconstruction. By compactly encoding the geometry and appearance of 3D models in lightweight neural networks, it avoids the large model sizes and limited resolutions of traditional explicit representations (e.g., meshes, voxel grids). Such a framework allows more flexible, high-quality modeling with a continuous representation. Moreover, the 3D field embedded in NeRF-based models is connected to the 2D image space through volume rendering. However, the NeRF framework faces challenges that greatly hinder its use in real-world applications. This thesis targets the essential problem of NeRF rendering speed, as well as the application of NeRF to two demanding tasks: camera pose estimation and 3D reconstruction with awareness of object structure. We have conducted three research works to tackle these issues and improve the performance of NeRF-based frameworks from different aspects. The first work addresses slow rendering, which is mainly caused by the feature queries for the numerous point samples used to render an image. Our key observation is that a large portion of pixels overlap between adjacent frames on a smooth rendering trajectory, which is common in interactive scenarios. Hence we propose to reuse the features of previous frames to render subsequent frames by exploiting geometric and temporal coherence, which saves much of the time spent on feature queries. A multiple-plane buffer is developed to record features for efficient reuse, along with a buffer updating mechanism to maintain high rendering quality. Our approach accelerates rendering of complex real-world scenes while maintaining competitive quality. The second work aims to enhance the robustness of camera pose estimation built upon the inverse process of NeRF rendering. We introduce an additional constraint on semantic feature consistency for pose optimization, which is more robust to viewpoint-dependent appearance changes than the commonly used photometric consistency. A set of strategies is proposed to calculate the corresponding feature loss for individual pixels efficiently. Our method greatly improves pose estimation accuracy for both single objects and real scenes. The third work focuses on single-view 3D reconstruction, particularly on obtaining 3D models with part information, since part-awareness can benefit many shape processing tasks. We propose a novel framework for part-aware reconstruction from a single-view image, in which multiview images are generated from the single-view image by multiview diffusion. 2D segmentation masks of these multiview images are then produced with a generalizable image segmentation model. These 2D segmentations serve as guidance for part-aware 3D reconstruction via a NeRF-based method, in which a part-aware feature space is built with contrastive learning. Finally, an automatic algorithm is applied to extract the 3D model with high-quality part segments.
Degree: Doctor of Philosophy
Subject: Neural networks (Computer science)
Image reconstruction - Digital techniques
Three-dimensional imaging
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/352689

 

DC Field | Value | Language
dc.contributor.author | Liu, Anran | -
dc.contributor.author | 劉安然 | -
dc.date.accessioned | 2024-12-19T09:27:21Z | -
dc.date.available | 2024-12-19T09:27:21Z | -
dc.date.issued | 2024 | -
dc.identifier.citation | Liu, A. [劉安然]. (2024). Studies on neural radiance field and its application. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/352689 | -
dc.description.abstract | (identical to the Abstract above) | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Neural networks (Computer science) | -
dc.subject.lcsh | Image reconstruction - Digital techniques | -
dc.subject.lcsh | Three-dimensional imaging | -
dc.title | Studies on neural radiance field and its application | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Computer Science | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2024 | -
dc.identifier.mmsid | 991044891408403414 | -
