Appears in Collections:
postgraduate thesis: Low-level vision processing : new approaches and sensors
Title | Low-level vision processing : new approaches and sensors |
---|---|
Authors | Wang, Zhouxia (王州霞) |
Issue Date | 2023 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Wang, Z. [王州霞]. (2023). Low-level vision processing : new approaches and sensors. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Low-level vision processing aims to process low-quality vision data, such as images and videos, pixel-wise to recover their high-quality counterparts. It is a complex problem, since it covers a wide variety of low-quality data and many scenarios. In this thesis, we study low-level vision processing in three kinds of scenarios: scenarios with human faces only, natural scenarios, and an extremely challenging scenario. For each scenario, we carefully design a corresponding approach according to the properties of the unprocessed data and the scenario to attain high-quality processing results.
Our studies of scenarios with human faces only mainly focus on blind face restoration. First, we propose RestoreFormer++ for blind face image restoration. It introduces fully-spatial attention mechanisms to model contextual information and its interplay with priors, producing high-quality face images with both realness and fidelity. Its priors are matched from a learned reconstruction-oriented high-quality dictionary that is better suited to the face restoration task, leading to rich details in the restored face images. Moreover, it is more robust and general to real-world degradation, since its carefully designed extended degradation model alleviates the synthetic-to-real-world gap. We then extend our study to face video restoration. We systematically analyze the potential benefits and difficulties that arise when current face image restoration algorithms are extended to real-world face video restoration, and provide a viable solution to mitigate the analyzed difficulties.
Our study of natural scenarios focuses on image deblurring. In this work, we introduce an event-based vision sensor, which can detect per-pixel brightness changes at microsecond resolution. Considering the temporal and spatial complementarity between intensity images captured with a frame-based camera and event data captured with an event camera, we propose to alternately enhance the quality of the intensity images and the event data with a DeblurNet and an EventSRNet.
In addition, we study an extremely challenging scenario with extremely high dynamic range, focusing on exposure bracketing selection. Exposure bracketing selection aims to predict a sequence of exposure times under which images are captured and fused to attain high dynamic range images. Our proposed exposure bracketing selection network (EBSNet) makes decisions according to the illumination distribution and semantic information extracted from a single auto-exposure preview image, freeing it from restrictions such as knowledge of the camera response function and the sensor noise model. EBSNet is trained with reinforcement learning, with its reward provided by a multi-exposure fusion network (MEFNet) that fuses the images captured under the exposure times predicted by EBSNet. Jointly training EBSNet and MEFNet improves both the accuracy of exposure bracketing selection and the quality of multi-exposure fusion.
We have conducted experiments to evaluate the effectiveness of our proposed approaches and, to shed light on the development of low-level vision processing, we provide a real-world low-quality face video benchmark and an exposure bracketing selection benchmark.
|
Degree | Doctor of Philosophy |
Subject | Image processing - Data processing |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/335162 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Wang, Zhouxia | - |
dc.contributor.author | 王州霞 | - |
dc.date.accessioned | 2023-11-13T07:45:05Z | - |
dc.date.available | 2023-11-13T07:45:05Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | Wang, Z. [王州霞]. (2023). Low-level vision processing : new approaches and sensors. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/335162 | - |
dc.description.abstract | Low-level vision processing aims to process low-quality vision data, such as images and videos, pixel-wise to recover their high-quality counterparts. It is a complex problem, since it covers a wide variety of low-quality data and many scenarios. In this thesis, we study low-level vision processing in three kinds of scenarios: scenarios with human faces only, natural scenarios, and an extremely challenging scenario. For each scenario, we carefully design a corresponding approach according to the properties of the unprocessed data and the scenario to attain high-quality processing results. Our studies of scenarios with human faces only mainly focus on blind face restoration. First, we propose RestoreFormer++ for blind face image restoration. It introduces fully-spatial attention mechanisms to model contextual information and its interplay with priors, producing high-quality face images with both realness and fidelity. Its priors are matched from a learned reconstruction-oriented high-quality dictionary that is better suited to the face restoration task, leading to rich details in the restored face images. Moreover, it is more robust and general to real-world degradation, since its carefully designed extended degradation model alleviates the synthetic-to-real-world gap. We then extend our study to face video restoration. We systematically analyze the potential benefits and difficulties that arise when current face image restoration algorithms are extended to real-world face video restoration, and provide a viable solution to mitigate the analyzed difficulties. Our study of natural scenarios focuses on image deblurring. In this work, we introduce an event-based vision sensor, which can detect per-pixel brightness changes at microsecond resolution. Considering the temporal and spatial complementarity between intensity images captured with a frame-based camera and event data captured with an event camera, we propose to alternately enhance the quality of the intensity images and the event data with a DeblurNet and an EventSRNet. In addition, we study an extremely challenging scenario with extremely high dynamic range, focusing on exposure bracketing selection. Exposure bracketing selection aims to predict a sequence of exposure times under which images are captured and fused to attain high dynamic range images. Our proposed exposure bracketing selection network (EBSNet) makes decisions according to the illumination distribution and semantic information extracted from a single auto-exposure preview image, freeing it from restrictions such as knowledge of the camera response function and the sensor noise model. EBSNet is trained with reinforcement learning, with its reward provided by a multi-exposure fusion network (MEFNet) that fuses the images captured under the exposure times predicted by EBSNet. Jointly training EBSNet and MEFNet improves both the accuracy of exposure bracketing selection and the quality of multi-exposure fusion. We have conducted experiments to evaluate the effectiveness of our proposed approaches and, to shed light on the development of low-level vision processing, we provide a real-world low-quality face video benchmark and an exposure bracketing selection benchmark. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Image processing - Data processing | - |
dc.title | Low-level vision processing : new approaches and sensors | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2023 | - |
dc.identifier.mmsid | 991044736606203414 | - |