Appears in Collections: postgraduate thesis: Deep learning based stylization and smoothing for images and videos
Title | Deep learning based stylization and smoothing for images and videos |
---|---|
Authors | Zhu, Feida [朱飞达] |
Advisors | Yu, Y |
Issue Date | 2018 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Zhu, F. [朱飞达]. (2018). Deep learning based stylization and smoothing for images and videos. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | In recent years, with the proliferation of digital imaging devices and social networks, a tremendous number of photographs have been recorded, and sharing photos through social media has become very popular. Intelligent techniques for image analysis and enhancement have become increasingly important. Many efforts from the research and industry communities have been made to advance state-of-the-art algorithms for automatic image and video stylization and edge-preserving image smoothing. Image and video stylization strives to enhance unique themes with artistic color and tone adjustments. Edge-preserving image smoothing aims to preserve major image structures, such as salient edges and contours, while eliminating insignificant details. This thesis consists of novel deep learning based image and video stylization algorithms and a benchmark for edge-preserving image smoothing.
For image and video stylization, mainstream photo enhancement software, such as Adobe Photoshop and Instagram, provides users with predefined styles. However, such photo adjustment tools lack a semantic understanding of image contents, and the resulting global color transforms limit the range of artistic styles they can represent. More advanced stylistic enhancement needs to apply distinct adjustments to different semantic regions. We propose a novel deep learning architecture for exemplar-based image stylization, which learns local enhancement styles from image pairs. Our deep learning architecture consists of fully convolutional networks (FCNs) for automatic semantics-aware feature extraction and fully connected neural layers for adjustment prediction. Image stylization can be efficiently accomplished with a single forward pass through our deep network. To extend our deep network from image stylization to video stylization, we exploit temporal superpixels (TSPs) to facilitate the transfer of artistic styles from image exemplars to videos. Experiments on a number of datasets for image stylization, as well as a diverse set of video clips, demonstrate the effectiveness of our deep learning architecture.
Edge-preserving image smoothing is a fundamental problem in image processing and low-level computer vision. At present, several issues seriously hinder its further development. First, its performance evaluation remains subjective. Second, there are no widely accepted datasets for evaluation and research. Third, most existing algorithms cannot perform well on a wide range of image contents using a single parameter setting. To remove these hurdles in performance evaluation and further advance the state of the art, we propose a benchmark for edge-preserving image smoothing. This benchmark includes an image dataset with "ground truth" image smoothing results, as well as baseline learning algorithms that produce models capable of generating reasonable edge-preserving smoothing results for a wide range of image contents. Our image dataset contains 500 training and testing images covering a large number of representative visual object categories. The baseline methods in our benchmark are existing representative deep convolutional network architectures, on top of which we design novel loss functions well suited for edge-preserving image smoothing. The trained deep networks run faster than most state-of-the-art smoothing algorithms, while the smoothing performance of our ResNet-based model surpasses such algorithms both qualitatively and quantitatively. |
Degree | Doctor of Philosophy |
Subject | Image processing - Digital techniques; Video recordings - Editing - Digital techniques |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/267754 |
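The abstract above describes a two-stage stylization architecture: fully convolutional feature extraction followed by fully connected layers that predict local color adjustments, applied in a single forward pass. The following NumPy sketch illustrates that overall shape only; the hand-crafted feature extractor, the layer sizes, and the per-pixel affine color transform are illustrative assumptions, not the thesis's actual design:

```python
import numpy as np

def extract_features(image, n_features=8):
    """Stand-in for the FCN feature extractor: a per-pixel feature vector.
    Here we simply use raw colors plus local intensity statistics."""
    h, w, _ = image.shape
    gray = image.mean(axis=2, keepdims=True)
    feats = np.concatenate([image, gray, gray ** 2], axis=2)
    # zero-pad to a fixed-width descriptor of n_features channels
    pad = np.zeros((h, w, max(0, n_features - feats.shape[2])))
    return np.concatenate([feats, pad], axis=2)[:, :, :n_features]

def predict_adjustments(features, W, b):
    """Fully connected layer applied per pixel: maps each feature vector
    to a 3x4 affine color transform (12 numbers)."""
    h, w, f = features.shape
    out = features.reshape(-1, f) @ W + b          # (h*w, 12)
    return out.reshape(h, w, 3, 4)

def stylize(image, W, b):
    """Single forward pass: features -> per-pixel transforms -> apply."""
    feats = extract_features(image)
    A = predict_adjustments(feats, W, b)
    # homogeneous color [r, g, b, 1] at every pixel
    homo = np.concatenate([image, np.ones(image.shape[:2] + (1,))], axis=2)
    # apply each pixel's own 3x4 transform to its homogeneous color
    return np.einsum('hwij,hwj->hwi', A, homo)

rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))
W = rng.normal(scale=0.1, size=(8, 12))
b = np.eye(3, 4).ravel()   # bias near the identity color transform
out = stylize(img, W, b)
```

Because the adjustment is predicted per pixel from semantics-aware features, different regions of the image can receive different transforms, unlike a single global color transform.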
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yu, Y | - |
dc.contributor.author | Zhu, Feida | - |
dc.contributor.author | 朱飞达 | - |
dc.date.accessioned | 2019-03-01T03:44:44Z | - |
dc.date.available | 2019-03-01T03:44:44Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Zhu, F. [朱飞达]. (2018). Deep learning based stylization and smoothing for images and videos. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/267754 | - |
dc.description.abstract | In recent years, with the proliferation of digital imaging devices and social networks, a tremendous number of photographs have been recorded, and sharing photos through social media has become very popular. Intelligent techniques for image analysis and enhancement have become increasingly important. Many efforts from the research and industry communities have been made to advance state-of-the-art algorithms for automatic image and video stylization and edge-preserving image smoothing. Image and video stylization strives to enhance unique themes with artistic color and tone adjustments. Edge-preserving image smoothing aims to preserve major image structures, such as salient edges and contours, while eliminating insignificant details. This thesis consists of novel deep learning based image and video stylization algorithms and a benchmark for edge-preserving image smoothing. For image and video stylization, mainstream photo enhancement software, such as Adobe Photoshop and Instagram, provides users with predefined styles. However, such photo adjustment tools lack a semantic understanding of image contents, and the resulting global color transforms limit the range of artistic styles they can represent. More advanced stylistic enhancement needs to apply distinct adjustments to different semantic regions. We propose a novel deep learning architecture for exemplar-based image stylization, which learns local enhancement styles from image pairs. Our deep learning architecture consists of fully convolutional networks (FCNs) for automatic semantics-aware feature extraction and fully connected neural layers for adjustment prediction. Image stylization can be efficiently accomplished with a single forward pass through our deep network. To extend our deep network from image stylization to video stylization, we exploit temporal superpixels (TSPs) to facilitate the transfer of artistic styles from image exemplars to videos. Experiments on a number of datasets for image stylization, as well as a diverse set of video clips, demonstrate the effectiveness of our deep learning architecture. Edge-preserving image smoothing is a fundamental problem in image processing and low-level computer vision. At present, several issues seriously hinder its further development. First, its performance evaluation remains subjective. Second, there are no widely accepted datasets for evaluation and research. Third, most existing algorithms cannot perform well on a wide range of image contents using a single parameter setting. To remove these hurdles in performance evaluation and further advance the state of the art, we propose a benchmark for edge-preserving image smoothing. This benchmark includes an image dataset with "ground truth" image smoothing results, as well as baseline learning algorithms that produce models capable of generating reasonable edge-preserving smoothing results for a wide range of image contents. Our image dataset contains 500 training and testing images covering a large number of representative visual object categories. The baseline methods in our benchmark are existing representative deep convolutional network architectures, on top of which we design novel loss functions well suited for edge-preserving image smoothing. The trained deep networks run faster than most state-of-the-art smoothing algorithms, while the smoothing performance of our ResNet-based model surpasses such algorithms both qualitatively and quantitatively. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Image processing - Digital techniques | - |
dc.subject.lcsh | Video recordings - Editing - Digital techniques | - |
dc.title | Deep learning based stylization and smoothing for images and videos | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_991044081524003414 | - |
dc.date.hkucongregation | 2019 | - |
dc.identifier.mmsid | 991044081524003414 | - |
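The benchmark portion of the abstract mentions novel loss functions for edge-preserving smoothing built on standard convolutional backbones. One common form such a loss can take (a hedged sketch of the general idea, not the thesis's actual formulation) combines a fidelity term against the ground-truth smoothed image with a gradient penalty that is relaxed near strong edges of the input, so the model flattens textures while keeping salient contours:

```python
import numpy as np

def gradients(img):
    """Forward differences along x and y for an H x W (grayscale) image."""
    gx = np.diff(img, axis=1, append=img[:, -1:])
    gy = np.diff(img, axis=0, append=img[-1:, :])
    return gx, gy

def edge_preserving_loss(pred, target, guide, lam=0.1, sigma=0.1):
    """Fidelity to the ground-truth smoothing result plus a gradient
    penalty that is down-weighted at strong edges of the guide image."""
    fidelity = np.mean((pred - target) ** 2)
    gx, gy = gradients(pred)
    ggx, ggy = gradients(guide)
    wx = np.exp(-(ggx ** 2) / (2 * sigma ** 2))  # ~0 at strong input edges
    wy = np.exp(-(ggy ** 2) / (2 * sigma ** 2))
    smoothness = np.mean(wx * gx ** 2 + wy * gy ** 2)
    return fidelity + lam * smoothness
```

With weights `wx`/`wy` near zero across strong input edges, the penalty suppresses gradients only in low-contrast regions, which is what "preserve salient edges while eliminating insignificant details" asks of the network.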