postgraduate thesis: Geometric representation learning for light field restoration
Title | Geometric representation learning for light field restoration |
---|---|
Authors | Meng, Nan (孟楠) |
Advisors | Lam, EYM; So, HKH |
Issue Date | 2020 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Meng, N. [孟楠]. (2020). Geometric representation learning for light field restoration. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Light field imaging is emerging as a promising paradigm that captures both the intensity and the direction of incoming light rays. The redundant directional information gives a wide range of devices a sense of depth for the world around them. This high-dimensional (HD) representation of visual data therefore offers great potential for vision applications such as scene understanding, depth sensing, segmentation, and post-capture refocusing. In return, however, the high dimensionality also makes data analysis and processing more difficult. For years, constructing a suitable system has required considerable expertise and elaborate engineering to transform the raw data into an internal representation.
This dissertation pursues a learning method that automatically extracts efficient geometric representations from a light field with little hand engineering. By exploiting large amounts of data together with increasingly available computation, the learning method can automatically extract more representative features for a wide range of applications and thereby saves considerable manual effort.
The challenges of designing an efficient learning framework for light fields fall mainly into two aspects. First, given the inherent high dimensionality, existing 2D or 3D convolutional neural networks (CNNs) can hardly be applied to light field data. Second, a desirable framework for light field feature extraction should account for data coherence and preserve the structural properties of the light field.
To address these problems, this dissertation starts by establishing the cornerstone for light field processing, i.e. the HD convolutional (HConv) layer. By analogy with the 2D CNN layer, an HConv layer can be considered the basic processing unit for light field data, owing to its ability to learn the complete spatio-angular structure and redundancy. With this building block, we further design and implement efficient structures for 4D learning frameworks based on the HConv layer. The high-order residual block (HRB) is proposed to extract geometric features while maintaining angular coherence. The HRB not only learns the representations but also propagates geometric information. Consequently, by simply stacking multiple HRBs, the deep framework is able to extract diverse spatial features with the structural properties encoded.
Along with the model design, we systematically analyze the effects of different hyper-parameters, the convergence property, and the visual impact of different loss terms. To ensure convergence and ease training, we propose the aperture group batch normalization algorithm, which normalizes the values over a group of aperture images in the feature maps. To ensure robustness to varying parallax in light field images, we design a multi-range training strategy for our HD learning framework. Extensive experimental results and analysis show that the proposed model generates high-quality reconstructions even in challenging regions with complex occlusions and non-Lambertian surfaces. Finally, we explore depth-free techniques to learn continuous representations of the light field and achieve state-of-the-art results. |
Degree | Doctor of Philosophy |
Subject | Image processing |
Dept/Program | Electrical and Electronic Engineering |
Persistent Identifier | http://hdl.handle.net/10722/295569 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Lam, EYM | - |
dc.contributor.advisor | So, HKH | - |
dc.contributor.author | Meng, Nan | - |
dc.contributor.author | 孟楠 | - |
dc.date.accessioned | 2021-01-29T05:10:37Z | - |
dc.date.available | 2021-01-29T05:10:37Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Meng, N. [孟楠]. (2020). Geometric representation learning for light field restoration. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/295569 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Image processing | - |
dc.title | Geometric representation learning for light field restoration | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2020 | - |
dc.identifier.mmsid | 991044306519503414 | - |