Appears in Collections:
postgraduate thesis: Low-level vision processing : new approaches and sensors
Title | Low-level vision processing : new approaches and sensors |
---|---|
Authors | Wang, Zhouxia (王州霞) |
Issue Date | 2023 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Wang, Z. [王州霞]. (2023). Low-level vision processing : new approaches and sensors. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Low-level vision processing aims to process low-quality vision data, such as images and videos, pixel-wise to recover their high-quality counterparts. It is a complex problem, since it covers a wide variety of low-quality data and many scenarios. In this thesis, we study low-level vision processing in three kinds of scenarios: scenarios with human faces only, natural scenarios, and an extremely challenging scenario. For each scenario, we carefully design a corresponding approach according to the properties of the unprocessed data and the scenario to attain high-quality processing results.
Our studies of scenarios with human faces only mainly focus on blind face restoration. First, we propose RestoreFormer++ for blind face image restoration. It introduces fully-spatial attention mechanisms to model contextual information and its interplay with priors, producing high-quality face images with both realness and fidelity. Its priors are matched from a learned reconstruction-oriented high-quality dictionary that is better suited to the face restoration task, leading to rich details in the restored face images. Moreover, it is more robust and general to real-world degradation, since its carefully designed extended degradation model alleviates the synthetic-to-real-world gap. We then extend our study to face video restoration. We systematically analyze the potential benefits and difficulties that arise when current face image restoration algorithms are extended to real-world face video restoration, and provide a viable solution to mitigate the analyzed difficulties.
Our study of natural scenarios focuses on image deblurring. In this work, we introduce an event-based vision sensor, which can detect per-pixel brightness changes at microsecond resolution. Considering the temporal and spatial complementarity between intensity images captured with a frame-based camera and event data captured with an event camera, we propose to alternately enhance the quality of the intensity images and the event data with a DeblurNet and an EventSRNet.
In addition, we study an extremely challenging scenario with extremely high dynamic range, focusing on exposure bracketing selection. Exposure bracketing selection aims to predict a sequence of exposure times under which images are captured and fused to attain high dynamic range images. Our proposed exposure bracketing selection network (EBSNet) makes decisions according to the illumination distribution and semantic information extracted from a single auto-exposure preview image, freeing it from restrictions such as knowledge of the camera response function and the sensor noise model. EBSNet is trained with reinforcement learning, with its reward provided by a multi-exposure fusion network (MEFNet) that fuses the images captured under the exposure times predicted by EBSNet. Jointly training EBSNet and MEFNet improves both the accuracy of exposure bracketing selection and the quality of multi-exposure fusion.
We have conducted experiments to evaluate the effectiveness of our proposed approaches and, to shed light on the development of low-level vision processing, we provide a real-world low-quality face video benchmark and an exposure bracketing selection benchmark.
|
Degree | Doctor of Philosophy |
Subject | Image processing - Data processing |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/335162 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Wang, Zhouxia | - |
dc.contributor.author | 王州霞 | - |
dc.date.accessioned | 2023-11-13T07:45:05Z | - |
dc.date.available | 2023-11-13T07:45:05Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | Wang, Z. [王州霞]. (2023). Low-level vision processing : new approaches and sensors. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/335162 | - |
dc.description.abstract | Low-level vision processing aims to process low-quality vision data, such as images and videos, pixel-wise to recover their high-quality counterparts. It is a complex problem, since it covers a wide variety of low-quality data and many scenarios. In this thesis, we study low-level vision processing in three kinds of scenarios: scenarios with human faces only, natural scenarios, and an extremely challenging scenario. For each scenario, we carefully design a corresponding approach according to the properties of the unprocessed data and the scenario to attain high-quality processing results. Our studies of scenarios with human faces only mainly focus on blind face restoration. First, we propose RestoreFormer++ for blind face image restoration. It introduces fully-spatial attention mechanisms to model contextual information and its interplay with priors, producing high-quality face images with both realness and fidelity. Its priors are matched from a learned reconstruction-oriented high-quality dictionary that is better suited to the face restoration task, leading to rich details in the restored face images. Moreover, it is more robust and general to real-world degradation, since its carefully designed extended degradation model alleviates the synthetic-to-real-world gap. We then extend our study to face video restoration. We systematically analyze the potential benefits and difficulties that arise when current face image restoration algorithms are extended to real-world face video restoration, and provide a viable solution to mitigate the analyzed difficulties. Our study of natural scenarios focuses on image deblurring. In this work, we introduce an event-based vision sensor, which can detect per-pixel brightness changes at microsecond resolution. Considering the temporal and spatial complementarity between intensity images captured with a frame-based camera and event data captured with an event camera, we propose to alternately enhance the quality of the intensity images and the event data with a DeblurNet and an EventSRNet. In addition, we study an extremely challenging scenario with extremely high dynamic range, focusing on exposure bracketing selection. Exposure bracketing selection aims to predict a sequence of exposure times under which images are captured and fused to attain high dynamic range images. Our proposed exposure bracketing selection network (EBSNet) makes decisions according to the illumination distribution and semantic information extracted from a single auto-exposure preview image, freeing it from restrictions such as knowledge of the camera response function and the sensor noise model. EBSNet is trained with reinforcement learning, with its reward provided by a multi-exposure fusion network (MEFNet) that fuses the images captured under the exposure times predicted by EBSNet. Jointly training EBSNet and MEFNet improves both the accuracy of exposure bracketing selection and the quality of multi-exposure fusion. We have conducted experiments to evaluate the effectiveness of our proposed approaches and, to shed light on the development of low-level vision processing, we provide a real-world low-quality face video benchmark and an exposure bracketing selection benchmark. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Image processing - Data processing | - |
dc.title | Low-level vision processing : new approaches and sensors | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2023 | - |
dc.identifier.mmsid | 991044736606203414 | - |