postgraduate thesis: Model adaptation and generalization in the open world
Title | Model adaptation and generalization in the open world |
---|---|
Authors | Chen, Chaoqi (陳超奇) |
Advisors | Yu, Y |
Issue Date | 2024 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Chen, C. [陳超奇]. (2024). Model adaptation and generalization in the open world. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Despite the unprecedented evolution of Artificial Intelligence (AI) over the past decade, a well-trained AI model directly deployed in the open world often encounters out-of-distribution (OOD) data whose contexts and labels may differ significantly from those in the training set, resulting in performance degradation and raising concerns about model reliability. Unlike machines, humans have a strong capability to learn new concepts from just a few examples and rapidly adapt to unseen environments, which is regarded as a key signature of human intelligence from infancy. Inspired by this, we ask an open-ended question: how can we build AI systems with minimal human supervision while achieving a remarkable OOD generalization ability in the open world? In this thesis, we present research on robust machine learning algorithms capable of correcting potential distribution shifts in the open world, aiming to find effective ways to enable model adaptation and generalization by learning from limited and imperfect training data. In particular, we emphasize two essential ingredients to enable model adaptation and generalization: 1) how to represent the structure of data, and 2) how to model the relationships between different entities. To achieve this goal, this thesis presents three efforts to endow machine learning models with the ability to reason about object relations, unearth the semantic topology, and identify unknown instances. The first part of this thesis proposes a graph-based relational reasoning framework for domain adaptive object detection. We make the first attempt to reason about foreground object relationships and interactions via graphical structures, in contrast to the mainstream approach of pairwise feature alignment. The intra- and inter-domain foreground object relations are modeled in both pixel and semantic spaces, endowing the detection model with the capability of relational reasoning. Through message passing and feature aggregation, each node aggregates semantic and contextual information from both the same and the opposite domain to significantly improve its expressive power. The second part proposes a joint latent domain discovery and relation modeling framework for compound domain generalization and a semantic topology-based reasoning approach for domain generalization. We explore two mutually beneficial relational modeling solutions, contrastive-based and attention-based, and show that the dependencies between different classes are robust to unseen domains. Moreover, we propose to reason over the semantic topology of source domains by maintaining and updating local-global prototypical relations. This process is further augmented through semantic-guided data mixing in the input space. The third part devises a unified unknown-aware training and test-time adaptation framework for jointly tackling the challenges of domain shift and open class. Regarding unknown-aware training, our key idea is to make the classifier expandable for the upcoming unknown class using only known-class and virtual unknown-class samples. Regarding test-time adaptation, we propose performing prediction calibration in an online manner using unlabeled test data. We develop both training-free and training-based methods, which, under the guidance of source knowledge and through the integration of target domain knowledge, gradually adapt the source-trained model to the test environments. |
Degree | Doctor of Philosophy |
Subject | Artificial intelligence |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/343760 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yu, Y | - |
dc.contributor.author | Chen, Chaoqi | - |
dc.contributor.author | 陳超奇 | - |
dc.date.accessioned | 2024-06-06T01:04:46Z | - |
dc.date.available | 2024-06-06T01:04:46Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Chen, C. [陳超奇]. (2024). Model adaptation and generalization in the open world. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/343760 | - |
dc.description.abstract | Despite the unprecedented evolution of Artificial Intelligence (AI) over the past decade, a well-trained AI model directly deployed in the open world often encounters out-of-distribution (OOD) data whose contexts and labels may differ significantly from those in the training set, resulting in performance degradation and raising concerns about model reliability. Unlike machines, humans have a strong capability to learn new concepts from just a few examples and rapidly adapt to unseen environments, which is regarded as a key signature of human intelligence from infancy. Inspired by this, we ask an open-ended question: how can we build AI systems with minimal human supervision while achieving a remarkable OOD generalization ability in the open world? In this thesis, we present research on robust machine learning algorithms capable of correcting potential distribution shifts in the open world, aiming to find effective ways to enable model adaptation and generalization by learning from limited and imperfect training data. In particular, we emphasize two essential ingredients to enable model adaptation and generalization: 1) how to represent the structure of data, and 2) how to model the relationships between different entities. To achieve this goal, this thesis presents three efforts to endow machine learning models with the ability to reason about object relations, unearth the semantic topology, and identify unknown instances. The first part of this thesis proposes a graph-based relational reasoning framework for domain adaptive object detection. We make the first attempt to reason about foreground object relationships and interactions via graphical structures, in contrast to the mainstream approach of pairwise feature alignment. The intra- and inter-domain foreground object relations are modeled in both pixel and semantic spaces, endowing the detection model with the capability of relational reasoning. Through message passing and feature aggregation, each node aggregates semantic and contextual information from both the same and the opposite domain to significantly improve its expressive power. The second part proposes a joint latent domain discovery and relation modeling framework for compound domain generalization and a semantic topology-based reasoning approach for domain generalization. We explore two mutually beneficial relational modeling solutions, contrastive-based and attention-based, and show that the dependencies between different classes are robust to unseen domains. Moreover, we propose to reason over the semantic topology of source domains by maintaining and updating local-global prototypical relations. This process is further augmented through semantic-guided data mixing in the input space. The third part devises a unified unknown-aware training and test-time adaptation framework for jointly tackling the challenges of domain shift and open class. Regarding unknown-aware training, our key idea is to make the classifier expandable for the upcoming unknown class using only known-class and virtual unknown-class samples. Regarding test-time adaptation, we propose performing prediction calibration in an online manner using unlabeled test data. We develop both training-free and training-based methods, which, under the guidance of source knowledge and through the integration of target domain knowledge, gradually adapt the source-trained model to the test environments. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Artificial intelligence | - |
dc.title | Model adaptation and generalization in the open world | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2024 | - |
dc.identifier.mmsid | 991044809209403414 | - |