File Download
  Links for fulltext
     (May Require Subscription)
Supplementary

postgraduate thesis: Using semantic sub-scenes to facilitate scene categorization and understanding

TitleUsing semantic sub-scenes to facilitate scene categorization and understanding
Authors
Advisors
Advisor(s):Yung, NHC
Issue Date2014
PublisherThe University of Hong Kong (Pokfulam, Hong Kong)
Citation
Zhu, S. [朱珊珊]. (2014). Using semantic sub-scenes to facilitate scene categorization and understanding. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5317047
AbstractThis thesis proposes to learn the absent cognitive element in conventional scene categorization methods: sub-scenes, and use them to better categorize and understand scenes. In scene categorization, it has been observed that the problem of ambiguity occurs when treating the scene as a whole. Scene ambiguity arises from when a similar set of sub-scenes are arranged differently to compose different scenes, or when a scene literally contains several categories. However, these ambiguities can be discerned by the knowledge of sub-scenes. Thus, it is worthy to study sub-scenes and use them to better understand a scene. The proposed research firstly considers an unsupervised method to segment sub-scenes. It emphasizes on generating more integral regions instead of over-segmented regions usually produced by conventional segmentation methods. Several properties of sub-scenes are explored such as proximity grouping, area of influence, similarity and harmony based on psychological principles. These properties are formulated into constraints that are used directly in the proposed framework. A self-determined approach is employed to produce a final segmentation result based on the characteristics of each image in an unsupervised manner. The proposed method performs competitively against other state-of-the-art unsupervised segmentation methods with F-measure of 0.55, Covering of 0.51 and VoI of 1.93 in the Berkeley segmentation dataset. In the Stanford background dataset, it achieves the overlapping score of 0.566 which is higher than the score of 0.499 of the comparison method. To segment and label sub-scenes simultaneously, a supervised approach of semantic segmentation is proposed. It is developed based on a Hierarchical Conditional Random Field classification framework. The proposed method integrates contextual information into the model to improve classification performance. Contextual information including global consistency and spatial context are considered in the proposed method. Global consistency is developed based on generalizing the scene by scene types and spatial context takes the spatial relationship into account. The proposed method improves semantic segmentation by boosting more logical class combinations. It achieves the best score in the MSRC-21 dataset with global accuracy at 87% and the average accuracy at 81%, which out-performs all other state-of-the-art methods by 4% individually. In the Stanford background dataset, it achieves global accuracy at 80.5% and average accuracy at 71.8%, also out-performs other methods by 2%. Finally, the proposed research incorporates sub-scenes into the scene categorization framework to improve categorization performance, especially in ambiguity cases. The proposed method encodes the sub-scene in the way that their spatial information is also considered. Sub-scene descriptor compensates the global descriptor of a scene by evaluating local features with specific geometric attributes. The proposed method obtains an average categorization accuracy of 92.26% in the 8 Scene Category dataset, which outperforms all other published methods by over 2% of improvement. It evaluates ambiguity cases more accurately by discerning which part exemplifies a scene category and how those categories are organized.
DegreeDoctor of Philosophy
SubjectImage processing - Digital techniques
Dept/ProgramElectrical and Electronic Engineering
Persistent Identifierhttp://hdl.handle.net/10722/206459
HKU Library Item IDb5317047

 

DC FieldValueLanguage
dc.contributor.advisorYung, NHC-
dc.contributor.authorZhu, Shanshan-
dc.contributor.author朱珊珊-
dc.date.accessioned2014-10-31T23:15:57Z-
dc.date.available2014-10-31T23:15:57Z-
dc.date.issued2014-
dc.identifier.citationZhu, S. [朱珊珊]. (2014). Using semantic sub-scenes to facilitate scene categorization and understanding. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5317047-
dc.identifier.urihttp://hdl.handle.net/10722/206459-
dc.description.abstractThis thesis proposes to learn the absent cognitive element in conventional scene categorization methods: sub-scenes, and use them to better categorize and understand scenes. In scene categorization, it has been observed that the problem of ambiguity occurs when treating the scene as a whole. Scene ambiguity arises from when a similar set of sub-scenes are arranged differently to compose different scenes, or when a scene literally contains several categories. However, these ambiguities can be discerned by the knowledge of sub-scenes. Thus, it is worthy to study sub-scenes and use them to better understand a scene. The proposed research firstly considers an unsupervised method to segment sub-scenes. It emphasizes on generating more integral regions instead of over-segmented regions usually produced by conventional segmentation methods. Several properties of sub-scenes are explored such as proximity grouping, area of influence, similarity and harmony based on psychological principles. These properties are formulated into constraints that are used directly in the proposed framework. A self-determined approach is employed to produce a final segmentation result based on the characteristics of each image in an unsupervised manner. The proposed method performs competitively against other state-of-the-art unsupervised segmentation methods with F-measure of 0.55, Covering of 0.51 and VoI of 1.93 in the Berkeley segmentation dataset. In the Stanford background dataset, it achieves the overlapping score of 0.566 which is higher than the score of 0.499 of the comparison method. To segment and label sub-scenes simultaneously, a supervised approach of semantic segmentation is proposed. It is developed based on a Hierarchical Conditional Random Field classification framework. The proposed method integrates contextual information into the model to improve classification performance. Contextual information including global consistency and spatial context are considered in the proposed method. Global consistency is developed based on generalizing the scene by scene types and spatial context takes the spatial relationship into account. The proposed method improves semantic segmentation by boosting more logical class combinations. It achieves the best score in the MSRC-21 dataset with global accuracy at 87% and the average accuracy at 81%, which out-performs all other state-of-the-art methods by 4% individually. In the Stanford background dataset, it achieves global accuracy at 80.5% and average accuracy at 71.8%, also out-performs other methods by 2%. Finally, the proposed research incorporates sub-scenes into the scene categorization framework to improve categorization performance, especially in ambiguity cases. The proposed method encodes the sub-scene in the way that their spatial information is also considered. Sub-scene descriptor compensates the global descriptor of a scene by evaluating local features with specific geometric attributes. The proposed method obtains an average categorization accuracy of 92.26% in the 8 Scene Category dataset, which outperforms all other published methods by over 2% of improvement. It evaluates ambiguity cases more accurately by discerning which part exemplifies a scene category and how those categories are organized.-
dc.languageeng-
dc.publisherThe University of Hong Kong (Pokfulam, Hong Kong)-
dc.relation.ispartofHKU Theses Online (HKUTO)-
dc.rightsThis work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.-
dc.rightsThe author retains all proprietary rights, (such as patent rights) and the right to use in future works.-
dc.subject.lcshImage processing - Digital techniques-
dc.titleUsing semantic sub-scenes to facilitate scene categorization and understanding-
dc.typePG_Thesis-
dc.identifier.hkulb5317047-
dc.description.thesisnameDoctor of Philosophy-
dc.description.thesislevelDoctoral-
dc.description.thesisdisciplineElectrical and Electronic Engineering-
dc.description.naturepublished_or_final_version-
dc.identifier.doi10.5353/th_b5317047-
dc.identifier.mmsid991039907609703414-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats