postgraduate thesis: Studies on neural radiance field and its application
Title | Studies on neural radiance field and its application |
---|---|
Authors | Liu, Anran (劉安然) |
Issue Date | 2024 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Liu, A. [劉安然]. (2024). Studies on neural radiance field and its application. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | The rapid emergence of implicit neural representations has opened up new possibilities for research in 3D vision, and the neural radiance field (NeRF) has led this shift. NeRF and its follow-up works have greatly improved performance in novel view synthesis and 3D reconstruction. By compactly encoding the geometry and appearance of 3D models into lightweight neural networks, NeRF overcomes the large model sizes and limited resolutions of traditional explicit representations (e.g., meshes, voxel grids). Its continuous representation allows more flexible, high-quality modeling. Moreover, the 3D field embedded in NeRF-based models is connected to the 2D image space through volume rendering.
However, the NeRF framework faces several challenges that greatly hinder its use in real-world applications.
This thesis targets the rendering speed of NeRF, as well as its application to two demanding tasks: camera pose estimation and 3D reconstruction that is aware of object structure. We present three research works that tackle these issues and improve the performance of NeRF-based frameworks from different aspects.
The first work addresses slow rendering speed, which is mainly caused by the feature queries for the numerous point samples needed to render an image. Our key observation is that adjacent frames on a smooth rendering trajectory, which is common in interactive scenarios, share a large portion of overlapping pixels. Hence, we propose to reuse the features of previous frames to render subsequent frames by exploiting geometric and temporal coherence, which saves much of the time spent on feature queries. A multiple-plane buffer is developed to record features for efficient reuse, along with a buffer updating mechanism that maintains high rendering quality. Our approach accelerates the rendering of complex real-world scenes while retaining competitive quality.
The second work aims to enhance the robustness of camera pose estimation that is built upon inverting the NeRF rendering process. We introduce an additional constraint on semantic feature consistency for pose optimization, which is more robust to viewpoint-dependent appearance changes than the commonly used photometric consistency. A set of strategies is proposed to efficiently calculate the corresponding feature loss for individual pixels. Our method greatly improves pose estimation accuracy for both single objects and real scenes.
The third work focuses on single-view 3D reconstruction, particularly on obtaining 3D models with part information, since part-awareness can benefit many shape processing tasks. We propose a novel framework for part-aware reconstruction from a single-view image. Multiview images are first generated from the single-view image by multiview diffusion. Then, 2D segmentation masks of these multiview images are produced by a generalizable image segmentation model. These 2D segmentations guide part-aware 3D reconstruction via a NeRF-based method, in which a part-aware feature space is built with contrastive learning. Finally, an automatic algorithm extracts the 3D model with high-quality part segments. |
Degree | Doctor of Philosophy |
Subject | Neural networks (Computer science); Image reconstruction - Digital techniques; Three-dimensional imaging |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/352689 |
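
The abstract notes that the 3D field in NeRF-based models is tied to 2D image space through volume rendering. As a rough illustration only, the sketch below composites density and color samples along a single ray using the standard discrete volume rendering rule from the NeRF literature. It is not taken from the thesis: the function name `composite_ray` and its parameters are hypothetical, and the samples are assumed to be ordered from near to far.

```python
import math
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def composite_ray(
    sigmas: Sequence[float],
    colors: Sequence[Color],
    deltas: Sequence[float],
) -> Tuple[Color, List[float]]:
    """Alpha-composite samples along one ray (hypothetical helper).

    sigmas: volume density at each sample (assumed non-negative)
    colors: RGB color at each sample
    deltas: distance between consecutive samples
    Returns the pixel color and the per-sample blending weights.
    """
    transmittance = 1.0  # fraction of light that has not been absorbed so far
    rgb = [0.0, 0.0, 0.0]
    weights: List[float] = []
    for sigma, color, delta in zip(sigmas, colors, deltas):
        # Opacity contributed by this ray segment.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        weights.append(weight)
        for k in range(3):
            rgb[k] += weight * color[k]
        # Light remaining after passing through this segment.
        transmittance *= 1.0 - alpha
    return (rgb[0], rgb[1], rgb[2]), weights
```

Every pixel needs such a loop over many samples, each backed by a network or feature query, which is consistent with the abstract's point that feature queries dominate rendering time.
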
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Liu, Anran | - |
dc.contributor.author | 劉安然 | - |
dc.date.accessioned | 2024-12-19T09:27:21Z | - |
dc.date.available | 2024-12-19T09:27:21Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Liu, A. [劉安然]. (2024). Studies on neural radiance field and its application. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/352689 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Neural networks (Computer science) | - |
dc.subject.lcsh | Image reconstruction - Digital techniques | - |
dc.subject.lcsh | Three-dimensional imaging | - |
dc.title | Studies on neural radiance field and its application | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2024 | - |
dc.identifier.mmsid | 991044891408403414 | - |