Learning structural SVMs and its applications in computer vision

Kuang, Zhanghui; 旷章辉

DOI: 10.5353/th_b5223970

Appears in Collections:
- Computer Science: Theses
- HKU Theses Online

postgraduate thesis: Learning structural SVMs and its applications in computer vision

Title	Learning structural SVMs and its applications in computer vision
Authors	Kuang, Zhanghui 旷章辉
Advisors	Wong, KKY
Issue Date	2014
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Kuang, Z. [旷章辉]. (2014). Learning structural SVMs and its applications in computer vision. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5223970
Abstract	Many computer vision problems involve building automatic systems that extract complex high-level information from visual data. Such problems can often be modeled using structural models, which relate raw input variables to structural high-level output variables. The structural support vector machine (SVM) is a discriminative method for learning structural models. It allows flexible feature construction with good robustness against overfitting, and thus provides state-of-the-art prediction accuracy for structural prediction tasks in computer vision. This thesis first studies the application of structural SVMs to interactive image segmentation. A novel interactive image segmentation technique that automatically learns segmentation parameters tailored to each image is proposed. Unlike existing work, the proposed method does not require any offline parameter tuning or training stage, and is capable of determining image-specific parameters from some simple user interactions with the target image. The segmentation problem is modeled as inference in a conditional random field (CRF) over a segmentation mask and the target image. This CRF is parametrized by the weights for different terms (e.g., color, texture and smoothing). These weight parameters are learned via a one-slack structural SVM, which is solved using a constraint approximation scheme and the cutting plane algorithm. Experimental results show that the proposed method, by learning image-specific parameters automatically, outperforms other state-of-the-art interactive image segmentation techniques. This thesis then uses structural SVMs to speed up large-scale relatively-paired space analysis. A new multi-modality analysis technique based on relatively-paired observations from multiple modalities is proposed. Relative-pairing information is encoded using relative proximities of observations in a latent common space. By building a discriminative model and maximizing a distance margin, a projection function that maps observations into the latent common space is learned for each modality. However, training based on large-scale relatively-paired observations could be extremely time-consuming. To this end, the training is reformulated as learning a structural model, which can be optimized by the cutting plane algorithm, where only a few training samples are involved in each iteration. Experimental results validate the effectiveness and efficiency of the proposed technique.
Degree	Doctor of Philosophy
Subject	Support vector machines; Computer vision
Dept/Program	Computer Science
Persistent Identifier	http://hdl.handle.net/10722/206663
HKU Library Item ID	b5223970
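
Note on the one-slack structural SVM named in the abstract: for orientation, the generic textbook form of the one-slack, margin-rescaling structural SVM is sketched below. This is an illustration only and is not taken from the thesis, which the abstract says uses its own constraint approximation scheme. The symbols are assumptions introduced here: training pairs (x_i, y_i) for i = 1..n, a joint feature map \Psi, a task loss \Delta, a trade-off constant C, and a tolerance \varepsilon.

```latex
% Generic one-slack structural SVM (margin rescaling); illustrative, not the thesis's exact model.
\begin{aligned}
\min_{\mathbf{w},\,\xi \ge 0}\quad
  & \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2} + C\,\xi \\
\text{s.t.}\quad
  & \forall (\bar{y}_1,\dots,\bar{y}_n) \in \mathcal{Y}^{n}:\quad
    \frac{1}{n}\,\mathbf{w}^{\top} \sum_{i=1}^{n}
      \bigl[\Psi(x_i, y_i) - \Psi(x_i, \bar{y}_i)\bigr]
    \;\ge\;
    \frac{1}{n} \sum_{i=1}^{n} \Delta(y_i, \bar{y}_i) \;-\; \xi
\end{aligned}

% Cutting-plane step: for the current w, find the most violated labeling per example,
\hat{y}_i \;=\; \arg\max_{\bar{y} \in \mathcal{Y}}
  \bigl[\Delta(y_i, \bar{y}) + \mathbf{w}^{\top} \Psi(x_i, \bar{y})\bigr],
% add the joint constraint for (\hat{y}_1,\dots,\hat{y}_n) to the working set if it is
% violated by more than \varepsilon, then re-solve the problem over the working set.
```

In this generic form a single slack variable \xi is shared by all constraints, so each cutting-plane iteration adds at most one constraint to the working set, which typically stays small.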

DC Field	Value	Language
dc.contributor.advisor	Wong, KKY	-
dc.contributor.author	Kuang, Zhanghui	-
dc.contributor.author	旷章辉	-
dc.date.accessioned	2014-11-25T03:53:14Z	-
dc.date.available	2014-11-25T03:53:14Z	-
dc.date.issued	2014	-
dc.identifier.citation	Kuang, Z. [旷章辉]. (2014). Learning structural SVMs and its applications in computer vision. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5223970	-
dc.identifier.uri	http://hdl.handle.net/10722/206663	-
dc.description.abstract	Many computer vision problems involve building automatic systems that extract complex high-level information from visual data. Such problems can often be modeled using structural models, which relate raw input variables to structural high-level output variables. The structural support vector machine (SVM) is a discriminative method for learning structural models. It allows flexible feature construction with good robustness against overfitting, and thus provides state-of-the-art prediction accuracy for structural prediction tasks in computer vision. This thesis first studies the application of structural SVMs to interactive image segmentation. A novel interactive image segmentation technique that automatically learns segmentation parameters tailored to each image is proposed. Unlike existing work, the proposed method does not require any offline parameter tuning or training stage, and is capable of determining image-specific parameters from some simple user interactions with the target image. The segmentation problem is modeled as inference in a conditional random field (CRF) over a segmentation mask and the target image. This CRF is parametrized by the weights for different terms (e.g., color, texture and smoothing). These weight parameters are learned via a one-slack structural SVM, which is solved using a constraint approximation scheme and the cutting plane algorithm. Experimental results show that the proposed method, by learning image-specific parameters automatically, outperforms other state-of-the-art interactive image segmentation techniques. This thesis then uses structural SVMs to speed up large-scale relatively-paired space analysis. A new multi-modality analysis technique based on relatively-paired observations from multiple modalities is proposed. Relative-pairing information is encoded using relative proximities of observations in a latent common space. By building a discriminative model and maximizing a distance margin, a projection function that maps observations into the latent common space is learned for each modality. However, training based on large-scale relatively-paired observations could be extremely time-consuming. To this end, the training is reformulated as learning a structural model, which can be optimized by the cutting plane algorithm, where only a few training samples are involved in each iteration. Experimental results validate the effectiveness and efficiency of the proposed technique.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.subject.lcsh	Support vector machines	-
dc.subject.lcsh	Computer vision	-
dc.title	Learning structural SVMs and its applications in computer vision	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b5223970	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Computer Science	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b5223970	-
dc.identifier.mmsid	991037034979703414	-
