postgraduate thesis: Face sketch synthesis and face super resolution in the wild with deep learning
Title | Face sketch synthesis and face super resolution in the wild with deep learning |
---|---|
Authors | Chen Chaofeng, 陳超鋒 |
Advisors | Wong, KKY |
Issue Date | 2020 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Chen Chaofeng, [陳超鋒]. (2020). Face sketch synthesis and face super resolution in the wild with deep learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Face image processing has been one of the most important problems in computer vision since it is closely related to our daily lives. However, there still exist many challenges in handling natural face images in the wild. In this thesis, we will explore deep learning approaches on two face image processing tasks, namely face sketch synthesis and face super resolution. With the powerful representation ability of deep neural networks, we are able to narrow the gap between research and real world applications in these two tasks.
Face sketch synthesis aims to generate a sketch from an input face photo. We first introduce a cascaded framework for synthesizing face sketches that imitates how artists draw. A content image that outlines the shape of the face is first generated by a fully convolutional neural network (FCNN), and textures and shading are then added with a style transfer approach to enrich the details of the sketch. This cascaded framework not only preserves more sketch details than the common style transfer method but also surpasses traditional patch-based methods. To process natural face images, we then present a semi-supervised deep learning architecture that extends face sketch synthesis to face photos in the wild by exploiting additional face photos in training. Instead of supervising the network with ground truth sketches, we first perform patch matching in feature space between the input photo and the photos in a small reference set of photo-sketch pairs. We then compose a pseudo sketch feature representation from the corresponding sketch feature patches to supervise our photo-to-sketch network. With the proposed approach, we can train our networks using a small reference set of photo-sketch pairs together with a large face photo dataset without ground truth sketches. In addition, we jointly learn a sketch-to-photo network based on self-supervision of input photos.
We then focus on face super resolution, which refers to generating high resolution face images from the corresponding low resolution inputs. We introduce a novel SPatial Attention Residual Network (SPARNet) built on the proposed Face Attention Units (FAUs) for face super resolution. FAUs improve vanilla residual blocks with a spatial attention mechanism, enabling the convolutional layers to adaptively bootstrap features related to the key face structures and pay less attention to less feature-rich regions. We further extend SPARNet with multi-scale discriminators, yielding SPARNetHD, to produce high resolution results (i.e., 512x512). We show that SPARNetHD trained with synthetic data not only produces high quality, high resolution outputs for synthetically degraded face images, but also generalizes well to real world low quality face images. |
Degree | Doctor of Philosophy |
Subject | Machine learning; Image processing - Digital techniques |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/297490 |
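
The abstract describes supervising the photo-to-sketch network with a pseudo sketch feature composed by patch matching in feature space. The following is a minimal sketch of that idea only, not the thesis implementation: feature patches are represented as flat lists of floats, cosine similarity is an assumed matching criterion, and the names `pseudo_sketch_features` and `feature_loss` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature patches (flat lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)


def pseudo_sketch_features(input_patches, ref_photo_patches, ref_sketch_patches):
    """For each input photo patch, find the most similar reference photo patch
    and take the sketch patch paired with it (assumed matching rule)."""
    pseudo = []
    for patch in input_patches:
        best = max(
            range(len(ref_photo_patches)),
            key=lambda i: cosine(patch, ref_photo_patches[i]),
        )
        pseudo.append(ref_sketch_patches[best])
    return pseudo


def feature_loss(predicted, target):
    """Mean squared error between predicted and pseudo sketch feature patches."""
    count = sum(len(t) for t in target)
    total = sum(
        (p - t) ** 2
        for p_patch, t_patch in zip(predicted, target)
        for p, t in zip(p_patch, t_patch)
    )
    return total / count


# Toy data: two input patches, two reference photo-sketch patch pairs.
input_patches = [[1.0, 0.0], [0.0, 2.0]]
ref_photo_patches = [[0.0, 1.0], [3.0, 0.1]]
ref_sketch_patches = [[5.0, 5.0], [7.0, 7.0]]

pseudo = pseudo_sketch_features(input_patches, ref_photo_patches, ref_sketch_patches)
```

In this toy setup the pseudo target needs no ground truth sketch for the input photo, which is what lets a large unpaired photo set contribute to training.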
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Wong, KKY | - |
dc.contributor.author | Chen Chaofeng | - |
dc.contributor.author | 陳超鋒 | - |
dc.date.accessioned | 2021-03-21T11:37:57Z | - |
dc.date.available | 2021-03-21T11:37:57Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Chen Chaofeng, [陳超鋒]. (2020). Face sketch synthesis and face super resolution in the wild with deep learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/297490 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Machine learning | - |
dc.subject.lcsh | Image processing - Digital techniques | - |
dc.title | Face sketch synthesis and face super resolution in the wild with deep learning | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2021 | - |
dc.identifier.mmsid | 991044351383603414 | - |
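
The abstract states that Face Attention Units add a spatial attention mechanism to vanilla residual blocks. As a rough illustration under stated assumptions (a single-channel feature map stored as nested lists, a sigmoid attention map, and hypothetical callables `feature_fn` and `score_fn` standing in for learned layers), a residual unit with spatial attention could look like this; the actual FAU design in the thesis may differ.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def attention_residual_unit(x, feature_fn, score_fn):
    """Residual update gated per pixel: out = x + sigmoid(score) * features.

    x          -- 2D list (H x W), single-channel feature map
    feature_fn -- placeholder for the residual branch (assumption)
    score_fn   -- placeholder producing per-pixel attention logits (assumption)
    """
    features = feature_fn(x)
    scores = score_fn(features)
    out = []
    for x_row, f_row, s_row in zip(x, features, scores):
        out.append(
            [xv + sigmoid(sv) * fv for xv, fv, sv in zip(x_row, f_row, s_row)]
        )
    return out


# Toy stand-ins for learned layers.
def double(feature_map):
    return [[2.0 * v for v in row] for row in feature_map]


def zero_logits(feature_map):
    return [[0.0 for _ in row] for row in feature_map]


def low_logits(feature_map):
    return [[-50.0 for _ in row] for row in feature_map]


x = [[1.0, 2.0], [3.0, 4.0]]
half_attended = attention_residual_unit(x, double, zero_logits)  # weight 0.5 everywhere
suppressed = attention_residual_unit(x, double, low_logits)      # weight close to 0
```

With attention weights near zero the unit reduces to the identity mapping, so regions judged uninformative pass through largely unchanged while attended regions receive the full residual update.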