Studies on neural radiance field and its application

Liu, Anran; 劉安然

Appears in Collections:
- HKU Theses Online
- Computer Science: Theses

postgraduate thesis: Studies on neural radiance field and its application

Title	Studies on neural radiance field and its application
Authors	Liu, Anran 劉安然
Issue Date	2024
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Liu, A. [劉安然]. (2024). Studies on neural radiance field and its application. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	The rapid emergence of implicit neural representations has opened up new possibilities for research in 3D vision, and the neural radiance field (NeRF) has led this shift. NeRF, along with its follow-up works, has greatly improved performance in novel view synthesis and 3D reconstruction. By compactly encoding the geometry and appearance of 3D models in lightweight neural networks, it avoids the large model sizes and limited resolutions of traditional explicit representations (e.g., meshes, voxel grids). Such a framework allows more flexible, high-quality modeling with a continuous representation. Moreover, the 3D field embedded in NeRF-based models is connected to the 2D image space through volume rendering. However, the NeRF framework faces several challenges that greatly hinder its real-world application. This thesis targets NeRF's rendering speed, as well as its application to two demanding tasks: camera pose estimation and 3D reconstruction with awareness of object structure. We present three research works that tackle these issues and improve NeRF-based frameworks from different aspects. The first work addresses slow rendering, which is mainly caused by the feature queries for the numerous point samples used to render an image. Our key observation is that adjacent frames on a smooth rendering trajectory, which is common in interactive scenarios, share a large portion of overlapping pixels. Hence we propose to reuse the features of previous frames to render subsequent frames by exploiting geometric and temporal coherence, saving much of the time spent on feature queries. A multiple-plane buffer is developed to record features for efficient reuse, along with a buffer-updating mechanism to maintain high rendering quality. Our approach accelerates the rendering of complex real-world scenes with competitive quality. The second work aims to enhance the robustness of camera pose estimation built upon the inverse process of NeRF rendering. We introduce an additional constraint on semantic feature consistency for pose optimization, which is more robust to viewpoint-induced appearance changes than the commonly used photometric consistency. A set of strategies is proposed to calculate the corresponding feature loss for individual pixels efficiently. Our method greatly improves pose estimation accuracy for both single objects and real scenes. The third work focuses on single-view 3D reconstruction, particularly on obtaining 3D models with part information, since part-awareness can benefit many shape processing tasks. We propose a novel framework for part-aware reconstruction from a single-view image, in which multiview images are first generated from the single-view image by multiview diffusion. 2D segmentation masks of these multiview images are then produced with a generalizable image segmentation model. These 2D segmentations guide part-aware 3D reconstruction via a NeRF-based method, in which a part-aware feature space is built with contrastive learning. Finally, an automatic algorithm is applied to extract the 3D model with high-quality part segments.
Degree	Doctor of Philosophy
Subject	Neural networks (Computer science); Image reconstruction - Digital techniques; Three-dimensional imaging
Dept/Program	Computer Science
Persistent Identifier	http://hdl.handle.net/10722/352689
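
Note on volume rendering: the abstract states that the 3D field in NeRF-based models is connected to 2D image space through volume rendering. As a rough illustration only, the sketch below shows the standard discrete compositing rule commonly used by NeRF-style renderers for a single camera ray. It is not taken from the thesis; the function name `composite_ray` and its parameters are hypothetical and chosen for this example.

```python
import math

def composite_ray(densities, colors, deltas):
    """Composite samples along one ray (hypothetical helper, not from the thesis).

    densities: non-negative volume density sigma_i at each sample
    colors:    (r, g, b) tuple at each sample
    deltas:    length of the ray segment represented by each sample
    Returns the rendered pixel color and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light that survives up to the current sample
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for sigma, rgb, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution of this sample
        weights.append(weight)
        for channel in range(3):
            pixel[channel] += weight * rgb[channel]
        transmittance *= 1.0 - alpha            # light left for later samples
    return tuple(pixel), weights

# Usage: an empty sample, then an effectively opaque green one that hides the rest.
pixel, weights = composite_ray(
    [0.0, 1e9, 5.0],
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    [0.1, 0.1, 0.1],
)
```

Each sample along the ray requires one query of the field, which is why the number of point samples per image dominates rendering cost, as the abstract points out.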

DC Field	Value	Language
dc.contributor.author	Liu, Anran	-
dc.contributor.author	劉安然	-
dc.date.accessioned	2024-12-19T09:27:21Z	-
dc.date.available	2024-12-19T09:27:21Z	-
dc.date.issued	2024	-
dc.identifier.citation	Liu, A. [劉安然]. (2024). Studies on neural radiance field and its application. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/352689	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Neural networks (Computer science)	-
dc.subject.lcsh	Image reconstruction - Digital techniques	-
dc.subject.lcsh	Three-dimensional imaging	-
dc.title	Studies on neural radiance field and its application	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Computer Science	-
dc.description.nature	published_or_final_version	-
dc.date.hkucongregation	2024	-
dc.identifier.mmsid	991044891408403414	-
