Face sketch synthesis and face super resolution in the wild with deep learning

Chen Chaofeng; 陳超鋒

Appears in Collections:
- HKU Theses Online
- Computer Science: Theses

postgraduate thesis: Face sketch synthesis and face super resolution in the wild with deep learning

Title	Face sketch synthesis and face super resolution in the wild with deep learning
Authors	Chen Chaofeng 陳超鋒
Advisors	Wong, KKY
Issue Date	2020
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Chen Chaofeng, [陳超鋒]. (2020). Face sketch synthesis and face super resolution in the wild with deep learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	Face image processing is one of the most important problems in computer vision because it is closely related to our daily lives. However, many challenges remain in handling natural face images in the wild. In this thesis, we explore deep learning approaches to two face image processing tasks, namely face sketch synthesis and face super resolution. With the powerful representation ability of deep neural networks, we are able to narrow the gap between research and real-world applications in these two tasks. Face sketch synthesis aims to generate a sketch from an input face photo. We first introduce a cascaded framework for face sketch synthesis that imitates how artists draw sketches. A content image that outlines the shape of the face is first generated by a fully convolutional neural network (FCNN), and textures and shadings are then added with a style transfer approach to enrich the details of the sketch. This cascaded framework not only preserves more sketch details than the common style transfer method but also surpasses traditional patch-based methods. To process natural face images, we then present a semi-supervised deep learning architecture that extends face sketch synthesis to face photos in the wild by exploiting additional face photos in training. Instead of supervising the network with ground truth sketches, we first perform patch matching in feature space between the input photo and the photos in a small reference set of photo-sketch pairs. We then compose a pseudo sketch feature representation from the corresponding sketch feature patches to supervise our photo-to-sketch network. With the proposed approach, we can train our networks using a small reference set of photo-sketch pairs together with a large face photo dataset without ground truth sketches. In addition, we jointly learn a sketch-to-photo network based on self-supervision of input photos.
We then turn to face super resolution, which refers to generating high resolution face images from the corresponding low resolution inputs. We introduce a novel SPatial Attention Residual Network (SPARNet) built on the proposed Face Attention Units (FAUs) for face super resolution. FAUs improve vanilla residual blocks with a spatial attention mechanism, enabling the convolutional layers to adaptively bootstrap features related to the key face structures and pay less attention to less feature-rich regions. We further extend SPARNet with multi-scale discriminators, naming the result SPARNetHD, to produce high resolution results (i.e., 512x512). We show that SPARNetHD trained with synthetic data not only produces high quality, high resolution outputs for synthetically degraded face images but also generalizes well to real-world low quality face images.
Degree	Doctor of Philosophy
Subject	Machine learning; Image processing - Digital techniques
Dept/Program	Computer Science
Persistent Identifier	http://hdl.handle.net/10722/297490
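
The patch-matching step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes feature patches have already been extracted and flattened into plain vectors, uses cosine similarity as the matching criterion, and the names `cosine` and `compose_pseudo_sketch` are hypothetical. For each feature patch of the input photo, it finds the most similar photo patch in the reference set and copies the paired sketch feature patch, giving a pseudo sketch feature representation that could serve as a training target.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def compose_pseudo_sketch(photo_patches, ref_photo_patches, ref_sketch_patches):
    """Build a pseudo sketch feature from a paired reference set.

    Assumption: ref_photo_patches[i] and ref_sketch_patches[i] come from the
    same location of the same photo-sketch pair.
    """
    if len(ref_photo_patches) != len(ref_sketch_patches):
        raise ValueError("reference photo and sketch patches must be paired")
    pseudo = []
    for patch in photo_patches:
        # Index of the reference photo patch most similar to this input patch.
        best = max(
            range(len(ref_photo_patches)),
            key=lambda i: cosine(patch, ref_photo_patches[i]),
        )
        # Copy the sketch feature patch paired with the best photo match.
        pseudo.append(list(ref_sketch_patches[best]))
    return pseudo


# Toy data: two reference pairs and two input patches (2-D features).
ref_photo = [[1.0, 0.0], [0.0, 1.0]]
ref_sketch = [[0.9, 0.1], [0.2, 0.8]]
photo = [[0.1, 2.0], [3.0, 0.2]]

pseudo = compose_pseudo_sketch(photo, ref_photo, ref_sketch)
print(pseudo)  # [[0.2, 0.8], [0.9, 0.1]]
```

In this toy case the first input patch points mostly along the second axis, so it matches the second reference photo patch and receives that pair's sketch patch; the second input patch matches the first pair. A real system would run this matching on feature maps from a deep network rather than on hand-written vectors.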

DC Field	Value	Language
dc.contributor.advisor	Wong, KKY	-
dc.contributor.author	Chen Chaofeng	-
dc.contributor.author	陳超鋒	-
dc.date.accessioned	2021-03-21T11:37:57Z	-
dc.date.available	2021-03-21T11:37:57Z	-
dc.date.issued	2020	-
dc.identifier.citation	Chen Chaofeng, [陳超鋒]. (2020). Face sketch synthesis and face super resolution in the wild with deep learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/297490	-
dc.description.abstract	Face image processing is one of the most important problems in computer vision because it is closely related to our daily lives. However, many challenges remain in handling natural face images in the wild. In this thesis, we explore deep learning approaches to two face image processing tasks, namely face sketch synthesis and face super resolution. With the powerful representation ability of deep neural networks, we are able to narrow the gap between research and real-world applications in these two tasks. Face sketch synthesis aims to generate a sketch from an input face photo. We first introduce a cascaded framework for face sketch synthesis that imitates how artists draw sketches. A content image that outlines the shape of the face is first generated by a fully convolutional neural network (FCNN), and textures and shadings are then added with a style transfer approach to enrich the details of the sketch. This cascaded framework not only preserves more sketch details than the common style transfer method but also surpasses traditional patch-based methods. To process natural face images, we then present a semi-supervised deep learning architecture that extends face sketch synthesis to face photos in the wild by exploiting additional face photos in training. Instead of supervising the network with ground truth sketches, we first perform patch matching in feature space between the input photo and the photos in a small reference set of photo-sketch pairs. We then compose a pseudo sketch feature representation from the corresponding sketch feature patches to supervise our photo-to-sketch network. With the proposed approach, we can train our networks using a small reference set of photo-sketch pairs together with a large face photo dataset without ground truth sketches. In addition, we jointly learn a sketch-to-photo network based on self-supervision of input photos.
We then turn to face super resolution, which refers to generating high resolution face images from the corresponding low resolution inputs. We introduce a novel SPatial Attention Residual Network (SPARNet) built on the proposed Face Attention Units (FAUs) for face super resolution. FAUs improve vanilla residual blocks with a spatial attention mechanism, enabling the convolutional layers to adaptively bootstrap features related to the key face structures and pay less attention to less feature-rich regions. We further extend SPARNet with multi-scale discriminators, naming the result SPARNetHD, to produce high resolution results (i.e., 512x512). We show that SPARNetHD trained with synthetic data not only produces high quality, high resolution outputs for synthetically degraded face images but also generalizes well to real-world low quality face images.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Machine learning	-
dc.subject.lcsh	Image processing - Digital techniques	-
dc.title	Face sketch synthesis and face super resolution in the wild with deep learning	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Computer Science	-
dc.description.nature	published_or_final_version	-
dc.date.hkucongregation	2021	-
dc.identifier.mmsid	991044351383603414	-
