
postgraduate thesis: Single view analysis of non-Lambertian objects based on deep learning

Title: Single view analysis of non-Lambertian objects based on deep learning
Authors: Chen, Guanying [陈冠英]
Advisor(s): Wong, KKY
Issue Date: 2020
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Chen, G. [陈冠英]. (2020). Single view analysis of non-Lambertian objects based on deep learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Non-Lambertian objects (e.g., transparent and specular objects) are very common in the real world. However, existing computer vision algorithms developed for scene analysis often assume a Lambertian reflectance model and treat non-Lambertian objects as outliers. Developing robust methods for analysing non-Lambertian objects is important because it enables a more complete and accurate understanding of the captured scene. This thesis tackles three vision problems for non-Lambertian objects under a single viewpoint, namely transparent object matting, calibrated photometric stereo, and uncalibrated photometric stereo.
The first part of this thesis addresses transparent object matting. Existing approaches often require tedious capturing procedures and long processing times, which limit their practical use. We formulate transparent object matting as a refractive flow estimation problem and propose a deep learning framework, named TOM-Net, for learning the refractive flow. At test time, TOM-Net takes a single image as input and outputs a matte (consisting of an object mask, an attenuation mask, and a refractive flow field) in a fast feed-forward pass. As no off-the-shelf dataset is available for transparent object matting, we create a large-scale synthetic dataset for training and capture a real dataset for testing. We also show that our method can be easily extended to handle cases where a trimap or a background image is available.
The second part of this thesis addresses calibrated photometric stereo for non-Lambertian surfaces. Existing approaches often adopt simplified reflectance models to make the problem more tractable, but this greatly limits their applicability to real-world objects. We propose a deep fully convolutional network, named PS-FCN, that takes as input an arbitrary number of images of a static object captured under different light directions with a fixed camera, and predicts a normal map of the object in a fast feed-forward pass. Our method does not depend on a pre-defined set of light directions during training and testing.
The third part of this thesis considers uncalibrated photometric stereo, where light directions are unknown at test time. Specifically, we focus on estimating light directions from the input images, which turns the uncalibrated problem into a calibrated one. We first introduce a novel convolutional network, named LCNet, to estimate light directions from input images. Unlike previous approaches that rely heavily on assumptions about specific reflectances and light source distributions, our method can determine the light directions of a scene with unknown arbitrary reflectances observed under unknown varying light directions. We then analyse what LCNet has learned to resolve the ambiguity in lighting estimation. Inspired by our observations, we further introduce a guided calibration network (GCNet) to estimate lighting more accurately.
Degree: Doctor of Philosophy
Subject: Pattern perception; Machine learning
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/297551

 

DC Field | Value | Language
dc.contributor.advisor | Wong, KKY | -
dc.contributor.author | Chen, Guanying | -
dc.contributor.author | 陈冠英 | -
dc.date.accessioned | 2021-03-21T11:38:05Z | -
dc.date.available | 2021-03-21T11:38:05Z | -
dc.date.issued | 2020 | -
dc.identifier.citation | Chen, G. [陈冠英]. (2020). Single view analysis of non-Lambertian objects based on deep learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/297551 | -
dc.description.abstract | Non-Lambertian objects (e.g., transparent and specular objects) are very common in the real world. However, existing computer vision algorithms developed for scene analysis often assume a Lambertian reflectance model and treat non-Lambertian objects as outliers. Developing robust methods for analysing non-Lambertian objects is important because it enables a more complete and accurate understanding of the captured scene. This thesis tackles three vision problems for non-Lambertian objects under a single viewpoint, namely transparent object matting, calibrated photometric stereo, and uncalibrated photometric stereo. The first part of this thesis addresses transparent object matting. Existing approaches often require tedious capturing procedures and long processing times, which limit their practical use. We formulate transparent object matting as a refractive flow estimation problem and propose a deep learning framework, named TOM-Net, for learning the refractive flow. At test time, TOM-Net takes a single image as input and outputs a matte (consisting of an object mask, an attenuation mask, and a refractive flow field) in a fast feed-forward pass. As no off-the-shelf dataset is available for transparent object matting, we create a large-scale synthetic dataset for training and capture a real dataset for testing. We also show that our method can be easily extended to handle cases where a trimap or a background image is available. The second part of this thesis addresses calibrated photometric stereo for non-Lambertian surfaces. Existing approaches often adopt simplified reflectance models to make the problem more tractable, but this greatly limits their applicability to real-world objects. We propose a deep fully convolutional network, named PS-FCN, that takes as input an arbitrary number of images of a static object captured under different light directions with a fixed camera, and predicts a normal map of the object in a fast feed-forward pass. Our method does not depend on a pre-defined set of light directions during training and testing. The third part of this thesis considers uncalibrated photometric stereo, where light directions are unknown at test time. Specifically, we focus on estimating light directions from the input images, which turns the uncalibrated problem into a calibrated one. We first introduce a novel convolutional network, named LCNet, to estimate light directions from input images. Unlike previous approaches that rely heavily on assumptions about specific reflectances and light source distributions, our method can determine the light directions of a scene with unknown arbitrary reflectances observed under unknown varying light directions. We then analyse what LCNet has learned to resolve the ambiguity in lighting estimation. Inspired by our observations, we further introduce a guided calibration network (GCNet) to estimate lighting more accurately. | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Pattern perception | -
dc.subject.lcsh | Machine learning | -
dc.title | Single view analysis of non-Lambertian objects based on deep learning | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Computer Science | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2021 | -
dc.identifier.mmsid | 991044351383503414 | -
