
Postgraduate thesis: Deep learning based stylization and smoothing for images and videos

Title: Deep learning based stylization and smoothing for images and videos
Advisor(s): Yu, Y
Issue Date: 2018
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Zhu, F. [朱飞达]. (2018). Deep learning based stylization and smoothing for images and videos. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: In recent years, with the proliferation of digital imaging devices and social networks, enormous numbers of photographs have been recorded, and sharing photos through social media has become very popular. Intelligent techniques for image analysis and enhancement have become increasingly important, and the research and industry communities have made many efforts to advance state-of-the-art algorithms for automatic image and video stylization and for edge-preserving image smoothing. Image and video stylization strives to enhance unique themes with artistic color and tone adjustments. Edge-preserving image smoothing aims to preserve major image structures, such as salient edges and contours, while eliminating insignificant details. This thesis presents novel deep-learning-based image and video stylization algorithms and a benchmark for edge-preserving image smoothing.

For image and video stylization, mainstream photo enhancement software, such as Adobe Photoshop and Instagram, provides users with predefined styles. However, such photo adjustment tools lack a semantic understanding of image content, and the resulting global color transforms limit the range of artistic styles they can represent. More advanced stylistic enhancement needs to apply distinct adjustments to different semantic regions. We propose a novel deep learning architecture for exemplar-based image stylization, which learns local enhancement styles from image pairs. Our architecture consists of fully convolutional networks (FCNs) for automatic semantics-aware feature extraction and fully connected layers for adjustment prediction. Image stylization can be accomplished efficiently with a single forward pass through our deep network. To extend the network from image stylization to video stylization, we exploit temporal superpixels (TSPs) to facilitate the transfer of artistic styles from image exemplars to videos. Experiments on a number of image stylization datasets, as well as a diverse set of video clips, demonstrate the effectiveness of our architecture.

Edge-preserving image smoothing is a fundamental problem in image processing and low-level computer vision, but several issues seriously hinder its further development. First, its performance evaluation remains subjective. Second, there are no widely accepted datasets for evaluation and research. Third, most existing algorithms cannot perform well on a wide range of image content with a single parameter setting. To remove these hurdles and further advance the state of the art, we propose a benchmark for edge-preserving image smoothing. The benchmark includes an image dataset with "ground truth" smoothing results, as well as baseline learning algorithms that produce models capable of generating reasonable edge-preserving smoothing results for a wide range of image content. Our dataset contains 500 training and testing images spanning a large number of representative visual object categories. The baseline methods are existing representative deep convolutional network architectures, on top of which we design novel loss functions well suited to edge-preserving image smoothing. The trained networks run faster than most state-of-the-art smoothing algorithms, and our ResNet-based model outperforms such algorithms both qualitatively and quantitatively.
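The stylization approach in the abstract predicts spatially varying color adjustments rather than a single global transform. As a purely illustrative sketch (the exact parameterization used in the thesis is not given in this record), one common way to represent such a local adjustment is a per-pixel 3x10 matrix applied to a quadratic color basis, so that each pixel's output color is a learned quadratic function of its input color:

```python
import numpy as np

def quadratic_color_basis(rgb):
    """Lift an RGB pixel into a 10-D quadratic color basis:
    (r, g, b, r^2, g^2, b^2, rg, rb, gb, 1)."""
    r, g, b = rgb
    return np.array([r, g, b, r * r, g * g, b * b, r * g, r * b, g * b, 1.0])

def apply_local_transforms(image, transforms):
    """Apply per-pixel 3x10 color transforms (as a network might predict)
    to an H x W x 3 image in [0, 1], producing the adjusted output."""
    h, w, _ = image.shape
    out = np.empty_like(image)
    for y in range(h):
        for x in range(w):
            phi = quadratic_color_basis(image[y, x])  # 10-vector
            out[y, x] = transforms[y, x] @ phi        # 3-vector
    return np.clip(out, 0.0, 1.0)

# Identity transforms reproduce the input: each 3x10 matrix selects (r, g, b).
identity = np.zeros((2, 2, 3, 10))
identity[..., 0, 0] = identity[..., 1, 1] = identity[..., 2, 2] = 1.0
img = np.random.rand(2, 2, 3)
assert np.allclose(apply_local_transforms(img, identity), img)
```

Because the transform matrices vary per pixel, distinct semantic regions (e.g. sky versus skin) can receive distinct adjustments, which is what a global color transform cannot express.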
Degree: Doctor of Philosophy
Subject: Image processing - Digital techniques
Video recordings - Editing - Digital techniques
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/267754

 

DC Field: Value
dc.contributor.advisor: Yu, Y
dc.contributor.author: Zhu, Feida
dc.contributor.author: 朱飞达
dc.date.accessioned: 2019-03-01T03:44:44Z
dc.date.available: 2019-03-01T03:44:44Z
dc.date.issued: 2018
dc.identifier.citation: Zhu, F. [朱飞达]. (2018). Deep learning based stylization and smoothing for images and videos. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/267754
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Image processing - Digital techniques
dc.subject.lcsh: Video recordings - Editing - Digital techniques
dc.title: Deep learning based stylization and smoothing for images and videos
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Computer Science
dc.description.nature: published_or_final_version
dc.identifier.doi: 10.5353/th_991044081524003414
dc.date.hkucongregation: 2019
dc.identifier.mmsid: 991044081524003414
