Appears in Collections: postgraduate thesis: Exploring heterogeneous models and prompt learning in federated learning
| Field | Value |
|---|---|
| Title | Exploring heterogeneous models and prompt learning in federated learning |
| Authors | Chan, Yun Hin (陳潤軒) |
| Advisors | Ngai, CHE; Huang, K |
| Issue Date | 2025 |
| Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
| Citation | Chan, Y. H. [陳潤軒]. (2025). Exploring heterogeneous models and prompt learning in federated learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
| Abstract | In the era of rapid advancements in the Internet of Things (IoT) and artificial intelligence, deploying neural networks on IoT devices is becoming increasingly crucial for edge intelligence. Federated learning (FL) enables edge devices to collaboratively train a shared model while keeping training data local and private. However, FL generally assumes that all edge devices train neural networks of identical architecture, which may be impractical given the heterogeneous capabilities of different devices. Furthermore, the proliferation of large pre-trained models is sparking growing research interest in deploying and fine-tuning these models within FL. A fundamental conflict arises, however, between pre-trained models and FL: training and deploying such models demands extensive computational resources that are beyond the reach of FL participants. To address these challenges, this thesis proposes several FL methods that support heterogeneous client models and resolve the resource conflict of deploying pre-trained models in FL. Firstly, we introduce three data-free FL methods, FedHe, Felo, and Velo, which manage features and logits to facilitate knowledge transfer among clients. FedHe leverages average logits, while Felo aggregates mid-level features and logits by class label. Velo extends Felo by incorporating a conditional variational auto-encoder (VAE) on the server to generate synthetic features for client optimization. Extensive experiments demonstrate the superior performance of these methods compared with state-of-the-art techniques. Secondly, we propose Federated Intermediate Layers Learning (FedIN), a framework that supports heterogeneous models without relying on public datasets. FedIN transfers knowledge among client models by training Intermediate Layers (IN) on client features: each client aligns its intermediate layers with features derived from other clients, ensuring efficient knowledge exchange. The approach requires minimal memory and communication overhead and addresses gradient divergence through convex optimization. Experimental results highlight the superior performance of FedIN and its privacy protection capabilities. Thirdly, we begin with a detailed exploration of homogeneous and heterogeneous FL settings and derive key observations. Building upon these observations, we present InCo Aggregation, a training scheme that enables model-homogeneous FL methods to handle system heterogeneity. By leveraging internal cross-layer gradients, InCo Aggregation increases deep-layer similarity across clients without additional client communication. The method can be combined with various FL algorithms, such as FedAvg and FedProx, to improve their performance in heterogeneous settings. Experimental results validate the effectiveness of InCo Aggregation. Lastly, we address the conflict between the resource demands of pre-trained models and the limited resources of FL participants through Split Federated Learning (SFL). We propose PromptSFL, which adapts Visual Prompt Tuning (VPT) to SFL by aligning the feature spaces of prompts between clients and the server. PromptSFL transmits skip prompts from clients to the server and employs a linear layer to map client prompts into the server's feature space, preventing client prompts from overfitting to local datasets. Additionally, an adaptive learning rate speeds up PromptSFL's convergence. Extensive experiments demonstrate the effectiveness and efficiency of PromptSFL. |
| Degree | Doctor of Philosophy |
| Subject | Federated learning (Machine learning) |
| Dept/Program | Electrical and Electronic Engineering |
| Persistent Identifier | http://hdl.handle.net/10722/356599 |
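
The abstract above names four mechanisms; the snippets below are minimal, hedged illustrations of each, not the thesis's actual code. First, Felo-style class-wise aggregation: the server pools client-uploaded mid-level features and logits by class label and returns per-class means as knowledge-transfer targets (FedHe would correspond to the logits-only case). The function name, tensor shapes, and use of PyTorch are all assumptions for illustration.

```python
# Sketch of class-wise server aggregation in the spirit of Felo (assumed API).
from collections import defaultdict
import torch

def aggregate_by_class(client_uploads):
    """client_uploads: list of (features, logits, labels) triples, one per client.

    Returns per-class mean features and logits, which a server could
    broadcast back as targets for heterogeneous client models.
    """
    feat_sums, logit_sums, counts = {}, {}, defaultdict(int)
    for features, logits, labels in client_uploads:
        for c in labels.unique().tolist():
            mask = labels == c
            feat_sums[c] = feat_sums.get(c, 0) + features[mask].sum(dim=0)
            logit_sums[c] = logit_sums.get(c, 0) + logits[mask].sum(dim=0)
            counts[c] += int(mask.sum())
    return ({c: feat_sums[c] / counts[c] for c in counts},
            {c: logit_sums[c] / counts[c] for c in counts})
```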
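Second, the FedIN idea of training only the Intermediate Layers (IN) against features derived from other clients can be pictured as a feature-alignment step. The module layout, the MSE loss, and the frozen extractor/head are illustrative assumptions, not the thesis's design.

```python
# Sketch of FedIN-style intermediate-layer alignment (assumed architecture).
import torch
import torch.nn as nn

class ClientModel(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.extractor = nn.Linear(32, dim)        # shallow layers (untouched here)
        self.intermediate = nn.Sequential(          # the IN block being aligned
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.head = nn.Linear(dim, 10)              # task head (untouched here)

def in_training_step(model, in_feat, out_feat, lr=1e-3):
    """One alignment step: push intermediate(in_feat) toward out_feat,
    where (in_feat, out_feat) is a feature pair received from another client."""
    opt = torch.optim.SGD(model.intermediate.parameters(), lr=lr)
    loss = nn.functional.mse_loss(model.intermediate(in_feat), out_feat)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```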
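Third, InCo Aggregation's internal cross-layer gradients: the abstract says shallow-layer information is leveraged to raise deep-layer similarity with no extra client communication. One crude reading is a convex blend of a shallow layer's gradient into a deeper layer's gradient before the optimizer step; the mixing rule and same-shape assumption below are purely illustrative, not the thesis's exact method.

```python
# Sketch of a cross-layer gradient blend loosely inspired by InCo Aggregation.
import torch

def mix_cross_layer_grads(shallow_param, deep_param, alpha=0.5):
    """Blend: g_deep <- (1 - alpha) * g_deep + alpha * g_shallow.

    Assumes both parameters share a shape (e.g. repeated blocks of the same
    width); a real system would need a projection otherwise.
    """
    if shallow_param.grad is None or deep_param.grad is None:
        return
    deep_param.grad = (1 - alpha) * deep_param.grad + alpha * shallow_param.grad
```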
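Finally, PromptSFL's linear bridge: skip prompts uploaded by a client are projected into the server model's feature space before joining the server-side tokens at the split point. The dimensions, the concatenation strategy, and the class name `PromptBridge` are assumptions; the adaptive learning rate is not sketched because the abstract gives no formula for it.

```python
# Sketch of a client-to-server prompt projection in the spirit of PromptSFL.
import torch
import torch.nn as nn

class PromptBridge(nn.Module):
    def __init__(self, client_dim=192, server_dim=768):
        super().__init__()
        # Linear map from client prompt space to server feature space; the
        # abstract credits this mapping with preventing client prompts from
        # overfitting to local datasets.
        self.proj = nn.Linear(client_dim, server_dim)

    def forward(self, skip_prompts, server_tokens):
        # skip_prompts:  (batch, n_prompts, client_dim), uploaded by the client
        # server_tokens: (batch, n_tokens, server_dim), at the split point
        mapped = self.proj(skip_prompts)
        return torch.cat([mapped, server_tokens], dim=1)

# Usage with illustrative shapes:
bridge = PromptBridge()
out = bridge(torch.randn(2, 5, 192), torch.randn(2, 197, 768))  # (2, 202, 768)
```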
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Ngai, CHE | - |
| dc.contributor.advisor | Huang, K | - |
| dc.contributor.author | Chan, Yun Hin | - |
| dc.contributor.author | 陳潤軒 | - |
| dc.date.accessioned | 2025-06-05T09:31:22Z | - |
| dc.date.available | 2025-06-05T09:31:22Z | - |
| dc.date.issued | 2025 | - |
| dc.identifier.citation | Chan, Y. H. [陳潤軒]. (2025). Exploring heterogeneous models and prompt learning in federated learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
| dc.identifier.uri | http://hdl.handle.net/10722/356599 | - |
| dc.description.abstract | In the era of rapid advancements in the Internet of Things (IoT) and artificial intelligence, deploying neural networks on IoT devices is becoming increasingly crucial for edge intelligence. Federated learning (FL) enables edge devices to collaboratively train a shared model while keeping training data local and private. However, FL generally assumes that all edge devices train neural networks of identical architecture, which may be impractical given the heterogeneous capabilities of different devices. Furthermore, the proliferation of large pre-trained models is sparking growing research interest in deploying and fine-tuning these models within FL. A fundamental conflict arises, however, between pre-trained models and FL: training and deploying such models demands extensive computational resources that are beyond the reach of FL participants. To address these challenges, this thesis proposes several FL methods that support heterogeneous client models and resolve the resource conflict of deploying pre-trained models in FL. Firstly, we introduce three data-free FL methods, FedHe, Felo, and Velo, which manage features and logits to facilitate knowledge transfer among clients. FedHe leverages average logits, while Felo aggregates mid-level features and logits by class label. Velo extends Felo by incorporating a conditional variational auto-encoder (VAE) on the server to generate synthetic features for client optimization. Extensive experiments demonstrate the superior performance of these methods compared with state-of-the-art techniques. Secondly, we propose Federated Intermediate Layers Learning (FedIN), a framework that supports heterogeneous models without relying on public datasets. FedIN transfers knowledge among client models by training Intermediate Layers (IN) on client features: each client aligns its intermediate layers with features derived from other clients, ensuring efficient knowledge exchange. The approach requires minimal memory and communication overhead and addresses gradient divergence through convex optimization. Experimental results highlight the superior performance of FedIN and its privacy protection capabilities. Thirdly, we begin with a detailed exploration of homogeneous and heterogeneous FL settings and derive key observations. Building upon these observations, we present InCo Aggregation, a training scheme that enables model-homogeneous FL methods to handle system heterogeneity. By leveraging internal cross-layer gradients, InCo Aggregation increases deep-layer similarity across clients without additional client communication. The method can be combined with various FL algorithms, such as FedAvg and FedProx, to improve their performance in heterogeneous settings. Experimental results validate the effectiveness of InCo Aggregation. Lastly, we address the conflict between the resource demands of pre-trained models and the limited resources of FL participants through Split Federated Learning (SFL). We propose PromptSFL, which adapts Visual Prompt Tuning (VPT) to SFL by aligning the feature spaces of prompts between clients and the server. PromptSFL transmits skip prompts from clients to the server and employs a linear layer to map client prompts into the server's feature space, preventing client prompts from overfitting to local datasets. Additionally, an adaptive learning rate speeds up PromptSFL's convergence. Extensive experiments demonstrate the effectiveness and efficiency of PromptSFL. | - |
| dc.language | eng | - |
| dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
| dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
| dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject.lcsh | Federated learning (Machine learning) | - |
| dc.title | Exploring heterogeneous models and prompt learning in federated learning | - |
| dc.type | PG_Thesis | - |
| dc.description.thesisname | Doctor of Philosophy | - |
| dc.description.thesislevel | Doctoral | - |
| dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
| dc.description.nature | published_or_final_version | - |
| dc.date.hkucongregation | 2025 | - |
| dc.identifier.mmsid | 991044970873503414 | - |
