Model adaptation and generalization in the open world

Chen, Chaoqi; 陳超奇

Appears in Collections:
- HKU Theses Online
- Computer Science: Theses

postgraduate thesis: Model adaptation and generalization in the open world

Title	Model adaptation and generalization in the open world
Authors	Chen, Chaoqi 陳超奇
Advisors	Yu, Y
Issue Date	2024
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Chen, C. [陳超奇]. (2024). Model adaptation and generalization in the open world. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	Despite the unprecedented evolution of Artificial Intelligence (AI) over the past decade, a well-trained AI model directly deployed in the open world often encounters out-of-distribution (OOD) data whose contexts and labels may differ significantly from those in the training set, resulting in performance degradation and raising concerns about model reliability. Unlike machines, humans have a strong capability to learn new concepts from just a few examples and rapidly adapt to unseen environments, which is regarded as a key signature of human intelligence from infancy. Inspired by this, we ask an open-ended question: how can we build AI systems with minimal human supervision while achieving a remarkable OOD generalization ability in the open world? In this thesis, we present research on robust machine learning algorithms capable of correcting potential distribution shifts in the open world, aiming to find effective ways to enable model adaptation and generalization by learning from limited and imperfect training data. In particular, we emphasize two essential ingredients for model adaptation and generalization: 1) how to represent the structure of data, and 2) how to model the relationships between different entities. To this end, this thesis presents three efforts to endow machine learning models with the ability to reason about object relations, unearth the semantic topology, and identify unknown instances. The first part of this thesis proposes a graph-based relational reasoning framework for domain adaptive object detection. We make the first attempt to reason about foreground object relationships and interactions via graphical structures, in contrast to the mainstream approach of pairwise feature alignment. The intra- and inter-domain foreground object relations are modeled in both pixel and semantic spaces, endowing the detection model with the capability of relational reasoning. Through message-passing and feature aggregation, each node aggregates semantic and contextual information from both the same and the opposite domain to significantly improve its expressive power. The second part proposes a joint latent domain discovery and relation modeling framework for compound domain generalization and a semantic topology-based reasoning approach for domain generalization. We explore two mutually beneficial relational modeling solutions, contrastive-based and attention-based, and show that the dependencies of different classes are robust to unseen domains. Moreover, we propose to reason over the semantic topology of source domains by maintaining and updating local-global prototypical relations. This process is further augmented through semantic-guided data mixing in the input space. The third part devises a unified unknown-aware training and test-time adaptation framework for jointly tackling the challenges of domain shift and open class. Regarding unknown-aware training, our key idea is to make the classifier expandable for the upcoming unknown class using only known-class and virtual unknown-class samples. Regarding test-time adaptation, we propose performing prediction calibration in an online manner using unlabeled test data. We develop both training-free and training-based methods, which, under the guidance of source knowledge and through the integration of target domain knowledge, gradually adapt the source-trained model to the test environments.
Degree	Doctor of Philosophy
Subject	Artificial intelligence
Dept/Program	Computer Science
Persistent Identifier	http://hdl.handle.net/10722/343760

DC Field	Value	Language
dc.contributor.advisor	Yu, Y	-
dc.contributor.author	Chen, Chaoqi	-
dc.contributor.author	陳超奇	-
dc.date.accessioned	2024-06-06T01:04:46Z	-
dc.date.available	2024-06-06T01:04:46Z	-
dc.date.issued	2024	-
dc.identifier.citation	Chen, C. [陳超奇]. (2024). Model adaptation and generalization in the open world. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/343760	-
dc.description.abstract	Despite the unprecedented evolution of Artificial Intelligence (AI) over the past decade, a well-trained AI model directly deployed in the open world often encounters out-of-distribution (OOD) data whose contexts and labels may differ significantly from those in the training set, resulting in performance degradation and raising concerns about model reliability. Unlike machines, humans have a strong capability to learn new concepts from just a few examples and rapidly adapt to unseen environments, which is regarded as a key signature of human intelligence from infancy. Inspired by this, we ask an open-ended question: how can we build AI systems with minimal human supervision while achieving a remarkable OOD generalization ability in the open world? In this thesis, we present research on robust machine learning algorithms capable of correcting potential distribution shifts in the open world, aiming to find effective ways to enable model adaptation and generalization by learning from limited and imperfect training data. In particular, we emphasize two essential ingredients for model adaptation and generalization: 1) how to represent the structure of data, and 2) how to model the relationships between different entities. To this end, this thesis presents three efforts to endow machine learning models with the ability to reason about object relations, unearth the semantic topology, and identify unknown instances. The first part of this thesis proposes a graph-based relational reasoning framework for domain adaptive object detection. We make the first attempt to reason about foreground object relationships and interactions via graphical structures, in contrast to the mainstream approach of pairwise feature alignment. The intra- and inter-domain foreground object relations are modeled in both pixel and semantic spaces, endowing the detection model with the capability of relational reasoning. Through message-passing and feature aggregation, each node aggregates semantic and contextual information from both the same and the opposite domain to significantly improve its expressive power. The second part proposes a joint latent domain discovery and relation modeling framework for compound domain generalization and a semantic topology-based reasoning approach for domain generalization. We explore two mutually beneficial relational modeling solutions, contrastive-based and attention-based, and show that the dependencies of different classes are robust to unseen domains. Moreover, we propose to reason over the semantic topology of source domains by maintaining and updating local-global prototypical relations. This process is further augmented through semantic-guided data mixing in the input space. The third part devises a unified unknown-aware training and test-time adaptation framework for jointly tackling the challenges of domain shift and open class. Regarding unknown-aware training, our key idea is to make the classifier expandable for the upcoming unknown class using only known-class and virtual unknown-class samples. Regarding test-time adaptation, we propose performing prediction calibration in an online manner using unlabeled test data. We develop both training-free and training-based methods, which, under the guidance of source knowledge and through the integration of target domain knowledge, gradually adapt the source-trained model to the test environments.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Artificial intelligence	-
dc.title	Model adaptation and generalization in the open world	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Computer Science	-
dc.description.nature	published_or_final_version	-
dc.date.hkucongregation	2024	-
dc.identifier.mmsid	991044809209403414	-
