
postgraduate thesis: Novel techniques for visual object tracking and depth-aware video processing

Title: Novel techniques for visual object tracking and depth-aware video processing
Authors: Zhang, Shuai (張帥)
Issue Date: 2015
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Zhang, S. [張帥]. (2015). Novel techniques for visual object tracking and depth-aware video processing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5576785
Abstract: Visual object tracking is widely used in applications such as intelligent video surveillance and human body tracking, which makes it a fundamental problem in video processing and computer vision. The general procedure of automatic object tracking consists of object detection, object representation, tracking strategy and model updating. The tracking strategy is a particularly important component because it predicts and infers useful object information, such as location, orientation and size, from one frame to the next. In this dissertation, a new visual object tracking algorithm is proposed that uses a novel Bayesian Kalman filter (BKF) with simplified Gaussian mixture (BKF-SGM) tracking strategy. The BKF-SGM employs a Gaussian mixture (GM) representation of the state and noise densities, together with a novel direct density simplifying algorithm that avoids the exponential complexity growth of conventional Kalman filters using GMs. Because the GM is simplified directly, without resampling using particles, the proposed BKF-SGM considerably reduces the arithmetic complexity and avoids the performance degradation caused by sampling degeneracy and impoverishment in conventional particle filtering (PF). When coupled with an improved mean shift (MS) algorithm, the original MS tracker is extended under the BKF-SGM framework to a bank of parallel MS trackers, which offers more robust tracking performance. The resulting algorithm, called the BKF-SGM with improved MS (BKF-SGM-IMS), is inherently parallel and can therefore be readily accelerated on a graphics processing unit (GPU) to meet the high computational requirements of real-time applications. The proposed BKF-SGM-IMS algorithm can successfully handle complex scenarios with good performance and low arithmetic complexity. Moreover, the performance of both non-training-based and training-based object recognition algorithms can be improved by using the tracking results as input.
Because depth information brings machine vision one step closer to human vision when combined with color information, there has been recent interest in depth-aware video processing and computer vision in both academia and industry. However, acquiring high-quality, high-resolution depth maps of real-world scenes is a challenging problem. Conventional depth acquisition algorithms that rely on stereo/multi-view vision (passive methods) or on depth sensing devices (active methods) alone are limited by complicated scenes or by imperfections of the depth sensing devices. In this dissertation, a new system is proposed for indoor high-resolution, high-quality depth estimation using joint fusion of stereo and depth sensing data. By modeling the observations with a Markov random field (MRF), the fusion problem is formulated as a maximum a posteriori probability (MAP) estimation problem. The reliability and the probability density functions describing the observations from the two devices are also derived. The MAP problem is solved using a multi-scale belief propagation (BP) algorithm. To suppress possible estimation noise, the estimated depth map is further refined by color-image-guided depth matting and 2D local polynomial regression (LPR)-based filtering. Experimental results and numerical comparisons show that the system can provide high-quality, high-resolution depth maps, thanks to the complementary strengths of stereo vision and depth sensing devices.
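Illustrative note on the mixture simplification mentioned in the abstract: when both the state and the noise densities are Gaussian mixtures, the number of posterior components multiplies at every filter update, so some reduction step is needed to keep it bounded. The sketch below is not the thesis's direct density simplifying algorithm, which this record does not describe. It shows a generic, widely used alternative under stated assumptions: greedy pairwise merging of 1-D components with moment matching. The names `Component`, `merge`, `merge_cost`, `simplify`, `mixture_mean` and `mixture_var` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Component:
    """One 1-D Gaussian component of a mixture (hypothetical helper type)."""
    weight: float
    mean: float
    var: float


def merge(a: Component, b: Component) -> Component:
    """Moment-matching merge: the result keeps the pair's total weight,
    mean and variance."""
    w = a.weight + b.weight
    mean = (a.weight * a.mean + b.weight * b.mean) / w
    var = (a.weight * (a.var + (a.mean - mean) ** 2)
           + b.weight * (b.var + (b.mean - mean) ** 2)) / w
    return Component(w, mean, var)


def merge_cost(a: Component, b: Component) -> float:
    """Assumed dissimilarity: weighted squared distance between the means."""
    w = a.weight + b.weight
    return a.weight * b.weight / w * (a.mean - b.mean) ** 2


def simplify(mixture, max_components):
    """Greedily merge the cheapest pair until at most max_components remain."""
    comps = list(mixture)
    while len(comps) > max(1, max_components):
        i, j = min(combinations(range(len(comps)), 2),
                   key=lambda p: merge_cost(comps[p[0]], comps[p[1]]))
        merged = merge(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)]
        comps.append(merged)
    return comps


def mixture_mean(comps):
    """Overall mean of the mixture."""
    return sum(c.weight * c.mean for c in comps)


def mixture_var(comps):
    """Overall variance of the mixture."""
    m = mixture_mean(comps)
    return sum(c.weight * (c.var + (c.mean - m) ** 2) for c in comps)


if __name__ == "__main__":
    mix = [Component(0.4, 0.0, 1.0), Component(0.3, 0.2, 1.5),
           Component(0.2, 5.0, 0.5), Component(0.1, 5.3, 0.8)]
    for c in simplify(mix, 2):
        print(c)
```

Because each merge preserves the first two moments of the merged pair, the overall mean and variance of the mixture are unchanged by the reduction, and no particle resampling is involved. A real tracker would work with multivariate states and would likely use a different merging criterion.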
Degree: Doctor of Philosophy
Subject: Image processing - Digital techniques; Automatic tracking
Dept/Program: Electrical and Electronic Engineering
Persistent Identifier: http://hdl.handle.net/10722/221087

 

DC Field | Value | Language
dc.contributor.author | Zhang, Shuai | -
dc.contributor.author | 張帥 | -
dc.date.accessioned | 2015-10-26T23:11:57Z | -
dc.date.available | 2015-10-26T23:11:57Z | -
dc.date.issued | 2015 | -
dc.identifier.citation | Zhang, S. [張帥]. (2015). Novel techniques for visual object tracking and depth-aware video processing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5576785 | -
dc.identifier.uri | http://hdl.handle.net/10722/221087 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | Creative Commons: Attribution 3.0 Hong Kong License | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.subject.lcsh | Image processing - Digital techniques | -
dc.subject.lcsh | Automatic tracking | -
dc.title | Novel techniques for visual object tracking and depth-aware video processing | -
dc.type | PG_Thesis | -
dc.identifier.hkul | b5576785 | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Electrical and Electronic Engineering | -
dc.description.nature | published_or_final_version | -
