File Download
Supplementary
-
Citations:
- Appears in Collections:
postgraduate thesis: Using semantic sub-scenes to facilitate scene categorization and understanding
Title | Using semantic sub-scenes to facilitate scene categorization and understanding |
---|---|
Authors | |
Advisors | Advisor(s):Yung, NHC |
Issue Date | 2014 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Zhu, S. [朱珊珊]. (2014). Using semantic sub-scenes to facilitate scene categorization and understanding. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5317047 |
Abstract | This thesis proposes to learn the absent cognitive element in conventional scene categorization methods: sub-scenes, and use them to better categorize and understand scenes. In scene categorization, it has been observed that the problem of ambiguity occurs when treating the scene as a whole. Scene ambiguity arises from when a similar set of sub-scenes are arranged differently to compose different scenes, or when a scene literally contains several categories. However, these ambiguities can be discerned by the knowledge of sub-scenes. Thus, it is worthy to study sub-scenes and use them to better understand a scene.
The proposed research firstly considers an unsupervised method to segment sub-scenes. It emphasizes on generating more integral regions instead of over-segmented regions usually produced by conventional segmentation methods. Several properties of sub-scenes are explored such as proximity grouping, area of influence, similarity and harmony based on psychological principles. These properties are formulated into constraints that are used directly in the proposed framework. A self-determined approach is employed to produce a final segmentation result based on the characteristics of each image in an unsupervised manner. The proposed method performs competitively against other state-of-the-art unsupervised segmentation methods with F-measure of 0.55, Covering of 0.51 and VoI of 1.93 in the Berkeley segmentation dataset. In the Stanford background dataset, it achieves the overlapping score of 0.566 which is higher than the score of 0.499 of the comparison method.
To segment and label sub-scenes simultaneously, a supervised approach of semantic segmentation is proposed. It is developed based on a Hierarchical Conditional Random Field classification framework. The proposed method integrates contextual information into the model to improve classification performance. Contextual information including global consistency and spatial context are considered in the proposed method. Global consistency is developed based on generalizing the scene by scene types and spatial context takes the spatial relationship into account. The proposed method improves semantic segmentation by boosting more logical class combinations. It achieves the best score in the MSRC-21 dataset with global accuracy at 87% and the average accuracy at 81%, which out-performs all other state-of-the-art methods by 4% individually. In the Stanford background dataset, it achieves global accuracy at 80.5% and average accuracy at 71.8%, also out-performs other methods by 2%.
Finally, the proposed research incorporates sub-scenes into the scene categorization framework to improve categorization performance, especially in ambiguity cases. The proposed method encodes the sub-scene in the way that their spatial information is also considered. Sub-scene descriptor compensates the global descriptor of a scene by evaluating local features with specific geometric attributes. The proposed method obtains an average categorization accuracy of 92.26% in the 8 Scene Category dataset, which outperforms all other published methods by over 2% of improvement. It evaluates ambiguity cases more accurately by discerning which part exemplifies a scene category and how those categories are organized. |
Degree | Doctor of Philosophy |
Subject | Image processing - Digital techniques |
Dept/Program | Electrical and Electronic Engineering |
Persistent Identifier | http://hdl.handle.net/10722/206459 |
HKU Library Item ID | b5317047 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yung, NHC | - |
dc.contributor.author | Zhu, Shanshan | - |
dc.contributor.author | 朱珊珊 | - |
dc.date.accessioned | 2014-10-31T23:15:57Z | - |
dc.date.available | 2014-10-31T23:15:57Z | - |
dc.date.issued | 2014 | - |
dc.identifier.citation | Zhu, S. [朱珊珊]. (2014). Using semantic sub-scenes to facilitate scene categorization and understanding. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5317047 | - |
dc.identifier.uri | http://hdl.handle.net/10722/206459 | - |
dc.description.abstract | This thesis proposes to learn the absent cognitive element in conventional scene categorization methods: sub-scenes, and use them to better categorize and understand scenes. In scene categorization, it has been observed that the problem of ambiguity occurs when treating the scene as a whole. Scene ambiguity arises from when a similar set of sub-scenes are arranged differently to compose different scenes, or when a scene literally contains several categories. However, these ambiguities can be discerned by the knowledge of sub-scenes. Thus, it is worthy to study sub-scenes and use them to better understand a scene. The proposed research firstly considers an unsupervised method to segment sub-scenes. It emphasizes on generating more integral regions instead of over-segmented regions usually produced by conventional segmentation methods. Several properties of sub-scenes are explored such as proximity grouping, area of influence, similarity and harmony based on psychological principles. These properties are formulated into constraints that are used directly in the proposed framework. A self-determined approach is employed to produce a final segmentation result based on the characteristics of each image in an unsupervised manner. The proposed method performs competitively against other state-of-the-art unsupervised segmentation methods with F-measure of 0.55, Covering of 0.51 and VoI of 1.93 in the Berkeley segmentation dataset. In the Stanford background dataset, it achieves the overlapping score of 0.566 which is higher than the score of 0.499 of the comparison method. To segment and label sub-scenes simultaneously, a supervised approach of semantic segmentation is proposed. It is developed based on a Hierarchical Conditional Random Field classification framework. The proposed method integrates contextual information into the model to improve classification performance. Contextual information including global consistency and spatial context are considered in the proposed method. Global consistency is developed based on generalizing the scene by scene types and spatial context takes the spatial relationship into account. The proposed method improves semantic segmentation by boosting more logical class combinations. It achieves the best score in the MSRC-21 dataset with global accuracy at 87% and the average accuracy at 81%, which out-performs all other state-of-the-art methods by 4% individually. In the Stanford background dataset, it achieves global accuracy at 80.5% and average accuracy at 71.8%, also out-performs other methods by 2%. Finally, the proposed research incorporates sub-scenes into the scene categorization framework to improve categorization performance, especially in ambiguity cases. The proposed method encodes the sub-scene in the way that their spatial information is also considered. Sub-scene descriptor compensates the global descriptor of a scene by evaluating local features with specific geometric attributes. The proposed method obtains an average categorization accuracy of 92.26% in the 8 Scene Category dataset, which outperforms all other published methods by over 2% of improvement. It evaluates ambiguity cases more accurately by discerning which part exemplifies a scene category and how those categories are organized. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.subject.lcsh | Image processing - Digital techniques | - |
dc.title | Using semantic sub-scenes to facilitate scene categorization and understanding | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5317047 | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5317047 | - |
dc.identifier.mmsid | 991039907609703414 | - |