Appears in Collections: postgraduate thesis: Deep learning based stylization and smoothing for images and videos
Title | Deep learning based stylization and smoothing for images and videos |
---|---|
Authors | Zhu, Feida [朱飞达] |
Advisors | Yu, Y |
Issue Date | 2018 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Zhu, F. [朱飞达]. (2018). Deep learning based stylization and smoothing for images and videos. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | In recent years, with the proliferation of digital imaging devices and social networks, a tremendous number of photographs have been recorded, and sharing photos through social media has become very popular. Intelligent techniques for image analysis and enhancement have become increasingly important. Many efforts from the research and industry communities have been made to advance state-of-the-art algorithms for automatic image and video stylization and edge-preserving image smoothing. Image and video stylization strives to enhance unique themes with artistic color and tone adjustments. Edge-preserving image smoothing aims to preserve major image structures, such as salient edges and contours, while eliminating insignificant details. This thesis consists of novel deep learning based image and video stylization algorithms and a benchmark for edge-preserving image smoothing.
For image and video stylization, mainstream photo enhancement software, such as Adobe Photoshop and Instagram, provides users with predefined styles. However, such photo adjustment tools lack a semantic understanding of image contents, and the resulting global color transforms limit the range of artistic styles they can represent. More advanced stylistic enhancement needs to apply distinct adjustments to different semantic regions. We propose a novel deep learning architecture for exemplar-based image stylization, which learns local enhancement styles from image pairs. Our deep learning architecture consists of fully convolutional networks (FCNs) for automatic semantics-aware feature extraction and fully connected neural layers for adjustment prediction. Image stylization can be efficiently accomplished with a single forward pass through our deep network. To extend our deep network from image stylization to video stylization, we exploit temporal superpixels (TSPs) to facilitate the transfer of artistic styles from image exemplars to videos. Experiments on a number of datasets for image stylization, as well as a diverse set of video clips, demonstrate the effectiveness of our deep learning architecture.
Edge-preserving image smoothing is a fundamental problem in image processing and low-level computer vision. At present, several issues seriously hinder its further development. First, its performance evaluation remains subjective. Second, there are no widely accepted datasets for evaluation and research. Third, most existing algorithms cannot perform well on a wide range of image contents using a single parameter setting. To remove these hurdles in performance evaluation and further advance the state of the art, we propose a benchmark for edge-preserving image smoothing. This benchmark includes an image dataset with "ground truth" image smoothing results, as well as baseline learning algorithms that produce models capable of generating reasonable edge-preserving smoothing results for a wide range of image contents. Our image dataset contains 500 training and testing images covering a large number of representative visual object categories. The baseline methods in our benchmark are existing representative deep convolutional network architectures, on top of which we design novel loss functions well suited for edge-preserving image smoothing. The trained deep networks run faster than most state-of-the-art smoothing algorithms, while the smoothing performance of our ResNet-based model surpasses such algorithms both qualitatively and quantitatively. |
Degree | Doctor of Philosophy |
Subject | Image processing - Digital techniques; Video recordings - Editing - Digital techniques |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/267754 |
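The abstract above describes a two-stage stylization architecture: fully convolutional feature extraction followed by fully connected layers that predict local color adjustments, applied in a single forward pass. The following NumPy sketch illustrates that overall shape only; the hand-crafted feature extractor, the layer sizes, and the per-pixel affine color transform are illustrative assumptions, not the thesis's actual design:

```python
import numpy as np

def extract_features(image, n_features=8):
    """Stand-in for the FCN feature extractor: a per-pixel feature vector.
    Here we simply use raw colors plus local intensity statistics."""
    h, w, _ = image.shape
    gray = image.mean(axis=2, keepdims=True)
    feats = np.concatenate([image, gray, gray ** 2], axis=2)
    # zero-pad to a fixed-width descriptor of n_features channels
    pad = np.zeros((h, w, max(0, n_features - feats.shape[2])))
    return np.concatenate([feats, pad], axis=2)[:, :, :n_features]

def predict_adjustments(features, W, b):
    """Fully connected layer applied per pixel: maps each feature vector
    to a 3x4 affine color transform (12 numbers)."""
    h, w, f = features.shape
    out = features.reshape(-1, f) @ W + b          # (h*w, 12)
    return out.reshape(h, w, 3, 4)

def stylize(image, W, b):
    """Single forward pass: features -> per-pixel transforms -> apply."""
    feats = extract_features(image)
    A = predict_adjustments(feats, W, b)
    # homogeneous color [r, g, b, 1] at every pixel
    homo = np.concatenate([image, np.ones(image.shape[:2] + (1,))], axis=2)
    # apply each pixel's own 3x4 transform to its homogeneous color
    return np.einsum('hwij,hwj->hwi', A, homo)

rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))
W = rng.normal(scale=0.1, size=(8, 12))
b = np.eye(3, 4).ravel()   # bias near the identity color transform
out = stylize(img, W, b)
```

Because the adjustment is predicted per pixel from semantics-aware features, different regions of the image can receive different transforms, unlike a single global color transform.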
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yu, Y | - |
dc.contributor.author | Zhu, Feida | - |
dc.contributor.author | 朱飞达 | - |
dc.date.accessioned | 2019-03-01T03:44:44Z | - |
dc.date.available | 2019-03-01T03:44:44Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Zhu, F. [朱飞达]. (2018). Deep learning based stylization and smoothing for images and videos. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/267754 | - |
dc.description.abstract | In recent years, with the proliferation of digital imaging devices and social networks, a tremendous number of photographs have been recorded, and sharing photos through social media has become very popular. Intelligent techniques for image analysis and enhancement have become increasingly important. Many efforts from the research and industry communities have been made to advance state-of-the-art algorithms for automatic image and video stylization and edge-preserving image smoothing. Image and video stylization strives to enhance unique themes with artistic color and tone adjustments. Edge-preserving image smoothing aims to preserve major image structures, such as salient edges and contours, while eliminating insignificant details. This thesis consists of novel deep learning based image and video stylization algorithms and a benchmark for edge-preserving image smoothing. For image and video stylization, mainstream photo enhancement software, such as Adobe Photoshop and Instagram, provides users with predefined styles. However, such photo adjustment tools lack a semantic understanding of image contents, and the resulting global color transforms limit the range of artistic styles they can represent. More advanced stylistic enhancement needs to apply distinct adjustments to different semantic regions. We propose a novel deep learning architecture for exemplar-based image stylization, which learns local enhancement styles from image pairs. Our deep learning architecture consists of fully convolutional networks (FCNs) for automatic semantics-aware feature extraction and fully connected neural layers for adjustment prediction. Image stylization can be efficiently accomplished with a single forward pass through our deep network. To extend our deep network from image stylization to video stylization, we exploit temporal superpixels (TSPs) to facilitate the transfer of artistic styles from image exemplars to videos. Experiments on a number of datasets for image stylization, as well as a diverse set of video clips, demonstrate the effectiveness of our deep learning architecture. Edge-preserving image smoothing is a fundamental problem in image processing and low-level computer vision. At present, several issues seriously hinder its further development. First, its performance evaluation remains subjective. Second, there are no widely accepted datasets for evaluation and research. Third, most existing algorithms cannot perform well on a wide range of image contents using a single parameter setting. To remove these hurdles in performance evaluation and further advance the state of the art, we propose a benchmark for edge-preserving image smoothing. This benchmark includes an image dataset with "ground truth" image smoothing results, as well as baseline learning algorithms that produce models capable of generating reasonable edge-preserving smoothing results for a wide range of image contents. Our image dataset contains 500 training and testing images covering a large number of representative visual object categories. The baseline methods in our benchmark are existing representative deep convolutional network architectures, on top of which we design novel loss functions well suited for edge-preserving image smoothing. The trained deep networks run faster than most state-of-the-art smoothing algorithms, while the smoothing performance of our ResNet-based model surpasses such algorithms both qualitatively and quantitatively. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Image processing - Digital techniques | - |
dc.subject.lcsh | Video recordings - Editing - Digital techniques | - |
dc.title | Deep learning based stylization and smoothing for images and videos | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_991044081524003414 | - |
dc.date.hkucongregation | 2019 | - |
dc.identifier.mmsid | 991044081524003414 | - |
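The benchmark portion of the abstract mentions novel loss functions for edge-preserving smoothing built on standard convolutional backbones. One common form such a loss can take (a hedged sketch of the general idea, not the thesis's actual formulation) combines a fidelity term against the ground-truth smoothed image with a gradient penalty that is relaxed near strong edges of the input, so the model flattens textures while keeping salient contours:

```python
import numpy as np

def gradients(img):
    """Forward differences along x and y for an H x W (grayscale) image."""
    gx = np.diff(img, axis=1, append=img[:, -1:])
    gy = np.diff(img, axis=0, append=img[-1:, :])
    return gx, gy

def edge_preserving_loss(pred, target, guide, lam=0.1, sigma=0.1):
    """Fidelity to the ground-truth smoothing result plus a gradient
    penalty that is down-weighted at strong edges of the guide image."""
    fidelity = np.mean((pred - target) ** 2)
    gx, gy = gradients(pred)
    ggx, ggy = gradients(guide)
    wx = np.exp(-(ggx ** 2) / (2 * sigma ** 2))  # ~0 at strong input edges
    wy = np.exp(-(ggy ** 2) / (2 * sigma ** 2))
    smoothness = np.mean(wx * gx ** 2 + wy * gy ** 2)
    return fidelity + lam * smoothness
```

With weights `wx`/`wy` near zero across strong input edges, the penalty suppresses gradients only in low-contrast regions, which is what "preserve salient edges while eliminating insignificant details" asks of the network.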