Novel techniques for visual object tracking and depth-aware video processing

Zhang, Shuai; 張帥

DOI: 10.5353/th_b5576785

Appears in Collections:
- Electrical & Electronic Engineering: Theses
- HKU Theses Online

postgraduate thesis: Novel techniques for visual object tracking and depth-aware video processing

Title	Novel techniques for visual object tracking and depth-aware video processing
Authors	Zhang, Shuai 張帥
Issue Date	2015
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Zhang, S. [張帥]. (2015). Novel techniques for visual object tracking and depth-aware video processing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5576785
Abstract	Visual object tracking is frequently employed in applications such as intelligent video surveillance and human body tracking, and is therefore a fundamental problem in video processing and computer vision. The general procedure of automatic object tracking consists of object detection, object representation, tracking strategy and model updating. The tracking strategy, in particular, is an important component because it performs prediction and inference of useful object information, such as object location, orientation and size, from one frame to another. In this dissertation, a new visual object tracking algorithm using a novel Bayesian Kalman filter (BKF) with simplified Gaussian mixture (BKF-SGM) tracking strategy is proposed. The new BKF-SGM employs a Gaussian mixture (GM) representation of the state and noise densities and a novel direct density simplifying algorithm that avoids the exponential complexity growth of conventional Kalman filters using GMs. As the GM is simplified directly without resampling using particles, the proposed BKF-SGM considerably reduces the arithmetic complexity and avoids the performance degradation due to sampling degeneracy and impoverishment in conventional particle filtering (PF). Under the BKF-SGM framework, the original mean shift (MS) tracker is extended, using an improved MS algorithm, to a bank of parallel MS trackers, which offers more robust tracking performance. The resultant algorithm, called the BKF-SGM with improved MS (BKF-SGM-IMS), is inherently parallel and hence can be readily accelerated using a graphics processing unit (GPU) to meet the high computational requirements of real-time applications. The proposed BKF-SGM-IMS algorithm can successfully handle complex scenarios with good performance and low arithmetic complexity. Moreover, the performance of both training-based and non-training-based object recognition algorithms can be improved by using our tracking results as input.
By combining color and depth information, depth-aware processing brings machine vision one step closer to human vision, and there has been recent interest in depth-aware video processing and computer vision in both academia and industry. However, acquiring high-quality, high-resolution depth maps of real-world scenes is a challenging problem. Conventional depth acquisition algorithms that rely on stereo/multi-view vision (passive methods) or depth sensing devices (active methods) alone are limited by complicated scenes or by imperfections of the depth sensing devices. In this dissertation, a new system for indoor high-resolution, high-quality depth estimation using joint fusion of stereo and depth sensing data is proposed. By modeling the observations using a Markov random field (MRF), the fusion problem is formulated as a maximum a posteriori probability (MAP) estimation problem. The reliability and the probability density functions describing the observations from the two devices are also derived. The MAP problem is solved using a multi-scale belief propagation (BP) algorithm. To suppress possible estimation noise, the estimated depth map is further refined by color-image-guided depth matting and 2D local polynomial regression (LPR)-based filtering. Experimental results and numerical comparisons show that our system can provide high-quality, high-resolution depth maps, thanks to the complementary strengths of stereo vision and the depth sensing device.
Degree	Doctor of Philosophy
Subject	Image processing - Digital techniques; Automatic tracking
Dept/Program	Electrical and Electronic Engineering
Persistent Identifier	http://hdl.handle.net/10722/221087
HKU Library Item ID	b5576785
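
Illustrative note on the Gaussian mixture (GM) simplification mentioned in the abstract. When both the state density and the noise density are Gaussian mixtures, each filtering step pairs every state component with every noise component, so the number of components multiplies from frame to frame. The Python sketch below shows that growth for a 1-D random-walk model and one generic way of capping it: greedily merging the two components with the closest means while preserving the mixture's overall weight, mean and variance. It is not the thesis's direct density simplifying algorithm, which the abstract does not specify. The function names (merge, simplify, predict), the random-walk model and the closest-means merging rule are all assumptions made for illustration.

```python
from itertools import combinations

# A mixture is a list of (weight, mean, variance) tuples for 1-D Gaussians.

def merge(a, b):
    """Moment-matched merge of two weighted Gaussians into one."""
    (w1, m1, v1), (w2, m2, v2) = a, b
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    # Variance of the merged component includes the spread between the two means.
    v = (w1 * (v1 + (m1 - m) ** 2) + w2 * (v2 + (m2 - m) ** 2)) / w
    return (w, m, v)

def simplify(mixture, max_components):
    """Illustrative reduction: merge the pair with the closest means until
    at most max_components remain (an assumed rule, not the thesis's)."""
    comps = list(mixture)
    while len(comps) > max_components:
        i, j = min(
            combinations(range(len(comps)), 2),
            key=lambda p: abs(comps[p[0]][1] - comps[p[1]][1]),
        )
        merged = merge(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    return comps

def predict(state, noise):
    """Prediction for the assumed model x_k = x_{k-1} + w_k with GM noise:
    every state component is combined with every noise component."""
    return [
        (ws * wn, ms + mn, vs + vn)
        for ws, ms, vs in state
        for wn, mn, vn in noise
    ]

def mixture_mean(mixture):
    return sum(w * m for w, m, _ in mixture)

def mixture_var(mixture):
    mean = mixture_mean(mixture)
    return sum(w * (v + (m - mean) ** 2) for w, m, v in mixture)

if __name__ == "__main__":
    state = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
    noise = [(0.7, 0.0, 0.5), (0.3, 1.0, 2.0)]
    predicted = predict(state, noise)   # 2 x 2 components
    reduced = simplify(predicted, 2)    # capped back at 2 components
    print(len(predicted), len(reduced))
    print(mixture_mean(predicted), mixture_mean(reduced))
```

Without the cap, repeating predict doubles the component count at every step in this example; with it, the count stays fixed, at the cost of an approximation error that this sketch does not measure.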

DC Field	Value	Language
dc.contributor.author	Zhang, Shuai	-
dc.contributor.author	張帥	-
dc.date.accessioned	2015-10-26T23:11:57Z	-
dc.date.available	2015-10-26T23:11:57Z	-
dc.date.issued	2015	-
dc.identifier.citation	Zhang, S. [張帥]. (2015). Novel techniques for visual object tracking and depth-aware video processing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5576785	-
dc.identifier.uri	http://hdl.handle.net/10722/221087	-
dc.description.abstract	Visual object tracking is frequently employed in applications such as intelligent video surveillance and human body tracking, and is therefore a fundamental problem in video processing and computer vision. The general procedure of automatic object tracking consists of object detection, object representation, tracking strategy and model updating. The tracking strategy, in particular, is an important component because it performs prediction and inference of useful object information, such as object location, orientation and size, from one frame to another. In this dissertation, a new visual object tracking algorithm using a novel Bayesian Kalman filter (BKF) with simplified Gaussian mixture (BKF-SGM) tracking strategy is proposed. The new BKF-SGM employs a Gaussian mixture (GM) representation of the state and noise densities and a novel direct density simplifying algorithm that avoids the exponential complexity growth of conventional Kalman filters using GMs. As the GM is simplified directly without resampling using particles, the proposed BKF-SGM considerably reduces the arithmetic complexity and avoids the performance degradation due to sampling degeneracy and impoverishment in conventional particle filtering (PF). Under the BKF-SGM framework, the original mean shift (MS) tracker is extended, using an improved MS algorithm, to a bank of parallel MS trackers, which offers more robust tracking performance. The resultant algorithm, called the BKF-SGM with improved MS (BKF-SGM-IMS), is inherently parallel and hence can be readily accelerated using a graphics processing unit (GPU) to meet the high computational requirements of real-time applications. The proposed BKF-SGM-IMS algorithm can successfully handle complex scenarios with good performance and low arithmetic complexity. Moreover, the performance of both training-based and non-training-based object recognition algorithms can be improved by using our tracking results as input.
By combining color and depth information, depth-aware processing brings machine vision one step closer to human vision, and there has been recent interest in depth-aware video processing and computer vision in both academia and industry. However, acquiring high-quality, high-resolution depth maps of real-world scenes is a challenging problem. Conventional depth acquisition algorithms that rely on stereo/multi-view vision (passive methods) or depth sensing devices (active methods) alone are limited by complicated scenes or by imperfections of the depth sensing devices. In this dissertation, a new system for indoor high-resolution, high-quality depth estimation using joint fusion of stereo and depth sensing data is proposed. By modeling the observations using a Markov random field (MRF), the fusion problem is formulated as a maximum a posteriori probability (MAP) estimation problem. The reliability and the probability density functions describing the observations from the two devices are also derived. The MAP problem is solved using a multi-scale belief propagation (BP) algorithm. To suppress possible estimation noise, the estimated depth map is further refined by color-image-guided depth matting and 2D local polynomial regression (LPR)-based filtering. Experimental results and numerical comparisons show that our system can provide high-quality, high-resolution depth maps, thanks to the complementary strengths of stereo vision and the depth sensing device.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.subject.lcsh	Image processing - Digital techniques	-
dc.subject.lcsh	Automatic tracking	-
dc.title	Novel techniques for visual object tracking and depth-aware video processing	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b5576785	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Electrical and Electronic Engineering	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b5576785	-
dc.identifier.mmsid	991011257019703414	-
