Using semantic sub-scenes to facilitate scene categorization and understanding

Zhu, Shanshan; 朱珊珊

File Download

FullText.pdf

Links for fulltext

(May Require Subscription)

DOI: 10.5353/th_b5317047

Supplementary

Citations:
Appears in Collections:
- Electrical & Electronic Engineering: Theses
- HKU Theses Online

postgraduate thesis: Using semantic sub-scenes to facilitate scene categorization and understanding

Title	Using semantic sub-scenes to facilitate scene categorization and understanding
Authors	Zhu, Shanshan 朱珊珊
Advisors	Advisor(s):Yung, NHC
Issue Date	2014
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Zhu, S. [朱珊珊]. (2014). Using semantic sub-scenes to facilitate scene categorization and understanding. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5317047
Abstract	This thesis proposes to learn the absent cognitive element in conventional scene categorization methods: sub-scenes, and use them to better categorize and understand scenes. In scene categorization, it has been observed that the problem of ambiguity occurs when treating the scene as a whole. Scene ambiguity arises from when a similar set of sub-scenes are arranged differently to compose different scenes, or when a scene literally contains several categories. However, these ambiguities can be discerned by the knowledge of sub-scenes. Thus, it is worthy to study sub-scenes and use them to better understand a scene. The proposed research firstly considers an unsupervised method to segment sub-scenes. It emphasizes on generating more integral regions instead of over-segmented regions usually produced by conventional segmentation methods. Several properties of sub-scenes are explored such as proximity grouping, area of influence, similarity and harmony based on psychological principles. These properties are formulated into constraints that are used directly in the proposed framework. A self-determined approach is employed to produce a final segmentation result based on the characteristics of each image in an unsupervised manner. The proposed method performs competitively against other state-of-the-art unsupervised segmentation methods with F-measure of 0.55, Covering of 0.51 and VoI of 1.93 in the Berkeley segmentation dataset. In the Stanford background dataset, it achieves the overlapping score of 0.566 which is higher than the score of 0.499 of the comparison method. To segment and label sub-scenes simultaneously, a supervised approach of semantic segmentation is proposed. It is developed based on a Hierarchical Conditional Random Field classification framework. The proposed method integrates contextual information into the model to improve classification performance. Contextual information including global consistency and spatial context are considered in the proposed method. Global consistency is developed based on generalizing the scene by scene types and spatial context takes the spatial relationship into account. The proposed method improves semantic segmentation by boosting more logical class combinations. It achieves the best score in the MSRC-21 dataset with global accuracy at 87% and the average accuracy at 81%, which out-performs all other state-of-the-art methods by 4% individually. In the Stanford background dataset, it achieves global accuracy at 80.5% and average accuracy at 71.8%, also out-performs other methods by 2%. Finally, the proposed research incorporates sub-scenes into the scene categorization framework to improve categorization performance, especially in ambiguity cases. The proposed method encodes the sub-scene in the way that their spatial information is also considered. Sub-scene descriptor compensates the global descriptor of a scene by evaluating local features with specific geometric attributes. The proposed method obtains an average categorization accuracy of 92.26% in the 8 Scene Category dataset, which outperforms all other published methods by over 2% of improvement. It evaluates ambiguity cases more accurately by discerning which part exemplifies a scene category and how those categories are organized.
Degree	Doctor of Philosophy
Subject	Image processing - Digital techniques
Dept/Program	Electrical and Electronic Engineering
Persistent Identifier	http://hdl.handle.net/10722/206459
HKU Library Item ID	b5317047

DC Field	Value	Language
dc.contributor.advisor	Yung, NHC	-
dc.contributor.author	Zhu, Shanshan	-
dc.contributor.author	朱珊珊	-
dc.date.accessioned	2014-10-31T23:15:57Z	-
dc.date.available	2014-10-31T23:15:57Z	-
dc.date.issued	2014	-
dc.identifier.citation	Zhu, S. [朱珊珊]. (2014). Using semantic sub-scenes to facilitate scene categorization and understanding. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5317047	-
dc.identifier.uri	http://hdl.handle.net/10722/206459	-
dc.description.abstract	This thesis proposes to learn the absent cognitive element in conventional scene categorization methods: sub-scenes, and use them to better categorize and understand scenes. In scene categorization, it has been observed that the problem of ambiguity occurs when treating the scene as a whole. Scene ambiguity arises from when a similar set of sub-scenes are arranged differently to compose different scenes, or when a scene literally contains several categories. However, these ambiguities can be discerned by the knowledge of sub-scenes. Thus, it is worthy to study sub-scenes and use them to better understand a scene. The proposed research firstly considers an unsupervised method to segment sub-scenes. It emphasizes on generating more integral regions instead of over-segmented regions usually produced by conventional segmentation methods. Several properties of sub-scenes are explored such as proximity grouping, area of influence, similarity and harmony based on psychological principles. These properties are formulated into constraints that are used directly in the proposed framework. A self-determined approach is employed to produce a final segmentation result based on the characteristics of each image in an unsupervised manner. The proposed method performs competitively against other state-of-the-art unsupervised segmentation methods with F-measure of 0.55, Covering of 0.51 and VoI of 1.93 in the Berkeley segmentation dataset. In the Stanford background dataset, it achieves the overlapping score of 0.566 which is higher than the score of 0.499 of the comparison method. To segment and label sub-scenes simultaneously, a supervised approach of semantic segmentation is proposed. It is developed based on a Hierarchical Conditional Random Field classification framework. The proposed method integrates contextual information into the model to improve classification performance. Contextual information including global consistency and spatial context are considered in the proposed method. Global consistency is developed based on generalizing the scene by scene types and spatial context takes the spatial relationship into account. The proposed method improves semantic segmentation by boosting more logical class combinations. It achieves the best score in the MSRC-21 dataset with global accuracy at 87% and the average accuracy at 81%, which out-performs all other state-of-the-art methods by 4% individually. In the Stanford background dataset, it achieves global accuracy at 80.5% and average accuracy at 71.8%, also out-performs other methods by 2%. Finally, the proposed research incorporates sub-scenes into the scene categorization framework to improve categorization performance, especially in ambiguity cases. The proposed method encodes the sub-scene in the way that their spatial information is also considered. Sub-scene descriptor compensates the global descriptor of a scene by evaluating local features with specific geometric attributes. The proposed method obtains an average categorization accuracy of 92.26% in the 8 Scene Category dataset, which outperforms all other published methods by over 2% of improvement. It evaluates ambiguity cases more accurately by discerning which part exemplifies a scene category and how those categories are organized.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.subject.lcsh	Image processing - Digital techniques	-
dc.title	Using semantic sub-scenes to facilitate scene categorization and understanding	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b5317047	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Electrical and Electronic Engineering	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b5317047	-
dc.identifier.mmsid	991039907609703414	-

File Download

Links for fulltext

(May Require Subscription)

Supplementary

postgraduate thesis: Using semantic sub-scenes to facilitate scene categorization and understanding

Export via OAI-PMH Interface in XML Formats

OR

Export to Other Non-XML Formats