Single view analysis of non-Lambertian objects based on deep learning

Chen, Guanying; 陈冠英

Appears in Collections:
- HKU Theses Online
- Computer Science: Theses

postgraduate thesis: Single view analysis of non-Lambertian objects based on deep learning

Title	Single view analysis of non-Lambertian objects based on deep learning
Authors	Chen, Guanying 陈冠英
Advisors	Wong, KKY
Issue Date	2020
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Chen, G. [陈冠英]. (2020). Single view analysis of non-Lambertian objects based on deep learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	Non-Lambertian objects (e.g., transparent objects and specular objects) are very common in the real world. However, existing computer vision algorithms developed for scene analysis often assume a Lambertian reflectance model and treat non-Lambertian objects as outliers. It is important to develop robust methods for analysing non-Lambertian objects, as they enable a more complete and accurate understanding of the captured scene. This thesis tackles three vision problems of non-Lambertian objects under a single viewpoint, namely transparent object matting, calibrated photometric stereo, and uncalibrated photometric stereo for non-Lambertian objects. The first part of this thesis addresses the problem of transparent object matting. Existing approaches often require tedious capturing procedures and long processing time, which limit their practical use. We formulate transparent object matting as a refractive flow estimation problem, and propose a deep learning framework, named TOM-Net, for learning the refractive flow. At test time, TOM-Net takes a single image as input, and outputs a matte (consisting of an object mask, an attenuation mask and a refractive flow field) in a fast feed-forward pass. As no off-the-shelf dataset is available for transparent object matting, we create a large-scale synthetic dataset for training and capture a real dataset for testing. We also show that our method can be easily extended to handle cases where a trimap or a background image is available. The second part of this thesis addresses the problem of calibrated photometric stereo for non-Lambertian surfaces. Existing approaches often adopt simplified reflectance models to make the problem more tractable, but this greatly hinders their application to real-world objects. We propose a deep fully convolutional network, named PS-FCN, that takes as input an arbitrary number of images of a static object captured under different light directions with a fixed camera, and predicts a normal map of the object in a fast feed-forward pass. Our method does not depend on a pre-defined set of light directions during training and testing. The third part of this thesis considers the problem of uncalibrated photometric stereo, where light directions are unknown at test time. Specifically, we focus on estimating light directions from the input images, through which we cast the problem of uncalibrated photometric stereo into a calibrated one. We first introduce a novel convolutional network, named LCNet, to estimate light directions from input images. Unlike previous approaches that rely heavily on assumptions of specific reflectances and light source distributions, our method is able to determine the light directions of a scene with unknown arbitrary reflectances observed under unknown varying light directions. We then analyse what LCNet has learned in order to resolve the ambiguity in lighting estimation. Inspired by our observations, we further introduce a guided calibration network (GCNet) to estimate lighting more accurately.
Degree	Doctor of Philosophy
Subject	Pattern perception Machine learning
Dept/Program	Computer Science
Persistent Identifier	http://hdl.handle.net/10722/297551
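
For orientation, the "simplified reflectance models" mentioned in the abstract can be illustrated with the classical Lambertian formulation of calibrated photometric stereo, in which each pixel intensity is modelled as albedo times the dot product of the surface normal and a known light direction. The sketch below is not PS-FCN or any code from the thesis; it is a minimal per-pixel least-squares baseline under the assumptions of no shadows and no specular highlights, and the function names solve3 and lambertian_normal are hypothetical.

```python
import math


def solve3(a, b):
    """Solve the 3x3 linear system a x = b with Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("light directions are degenerate (e.g. coplanar)")
    x = []
    for k in range(3):
        # Replace column k of a with b, then take the determinant ratio.
        m = [[b[i] if j == k else a[i][j] for j in range(3)] for i in range(3)]
        x.append(det(m) / d)
    return x


def lambertian_normal(lights, intensities):
    """Least-squares estimate of (albedo, unit normal) for one pixel.

    lights: unit light directions (lx, ly, lz), one per input image.
    intensities: the pixel's observed intensity under each light.
    Assumes a Lambertian surface with no shadowed observations.
    """
    # Normal equations (L^T L) g = L^T I, where g = albedo * normal.
    ltl = [[sum(l[i] * l[j] for l in lights) for j in range(3)]
           for i in range(3)]
    lti = [sum(l[i] * v for l, v in zip(lights, intensities))
           for i in range(3)]
    g = solve3(ltl, lti)
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0.0:
        raise ValueError("pixel is black under every light; normal undefined")
    return albedo, [c / albedo for c in g]
```

Because the model is linear in albedo times normal, three or more non-coplanar light directions determine the normal at each pixel. Specular highlights and transparency violate this linearity, which is the limitation the learning-based methods described in the abstract are meant to address.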

DC Field	Value	Language
dc.contributor.advisor	Wong, KKY	-
dc.contributor.author	Chen, Guanying	-
dc.contributor.author	陈冠英	-
dc.date.accessioned	2021-03-21T11:38:05Z	-
dc.date.available	2021-03-21T11:38:05Z	-
dc.date.issued	2020	-
dc.identifier.citation	Chen, G. [陈冠英]. (2020). Single view analysis of non-Lambertian objects based on deep learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/297551	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Pattern perception	-
dc.subject.lcsh	Machine learning	-
dc.title	Single view analysis of non-Lambertian objects based on deep learning	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Computer Science	-
dc.description.nature	published_or_final_version	-
dc.date.hkucongregation	2021	-
dc.identifier.mmsid	991044351383503414	-
