postgraduate thesis: Optimizing federated learning with communication reduction and synchronization control
Title: Optimizing federated learning with communication reduction and synchronization control
Authors: Wu, Xueyu (吴雪玉)
Advisors: Wang, CL
Issue Date: 2023
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Wu, X. [吴雪玉]. (2023). Optimizing federated learning with communication reduction and synchronization control. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Machine Learning (ML) models have driven breakthroughs in natural language processing, speech recognition, computer vision, and many other fields. These models have grown deeper in recent years, requiring large-scale datasets to produce more accurate models. However, the traditional datacenter approach typically requires uploading the input data to a centralized server for model training. With ever-increasing concerns about data privacy, this approach faces serious challenges when privacy is paramount.
Federated Learning (FL) has emerged as a promising alternative ML paradigm that leverages user data for model training in a privacy-preserving way. In the FL setting, edge devices collaboratively train a shared model under the orchestration of a central server while keeping their data local. Yet training performance is often poor for several reasons, including heavy communication overhead, system heterogeneity, and data heterogeneity. Cross-device FL training usually involves a massive number of resource-constrained edge devices connected over intermittent networks, a vastly heterogeneous environment. The unbalanced, non-independent and identically distributed (Non-IID) nature of the data also degrades accuracy. This thesis highlights the primary challenges in FL systems and proposes three efficient optimization strategies specialized for the cross-device federated learning setting.
We first propose a structure-based communication reduction algorithm, FedSCR, that reduces the number of parameters transferred over the network without compromising accuracy. We study the parameter update pattern under the Non-IID setting and observe a clear update tendency over channels and filters. FedSCR therefore quantifies the magnitude of model updates by aggregating them over channels and filters, then removes insignificant updates whose aggregated values fall below a threshold. We further introduce an adaptive FedSCR that dynamically adjusts the threshold for each client based on its weight divergence, the magnitude of its local updates, and its data distribution.
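The structure-based sparsification described above can be sketched as follows. This is an illustrative approximation, not the thesis's actual implementation: the function name, tensor layout, and single shared threshold are assumptions, and the residual accumulation FedSCR-style schemes typically keep is only noted in the docstring.

```python
import numpy as np

def fedscr_sparsify(update, threshold):
    """Structure-based sparsification in the spirit of FedSCR (sketch).

    `update` is a conv-layer weight delta of shape
    (out_channels, in_channels, k, k). Filters and channels whose
    aggregated update magnitude falls below `threshold` are dropped
    (zeroed) before upload; in a full implementation the dropped mass
    would be accumulated locally as a residual for later rounds.
    """
    # Aggregate |update| over each output filter and each input channel.
    filter_mag = np.abs(update).sum(axis=(1, 2, 3))   # one value per filter
    channel_mag = np.abs(update).sum(axis=(0, 2, 3))  # one value per channel
    sparse = update.copy()
    sparse[filter_mag < threshold, :, :, :] = 0.0     # drop weak filters
    sparse[:, channel_mag < threshold, :, :] = 0.0    # drop weak channels
    return sparse
```

Only the structurally significant parts of the update are transmitted, which is why the scheme can cut communication without the accuracy loss of unstructured element-wise pruning.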
We then address the straggler problem with KAFL, a fast K-asynchronous FL framework that controls model synchronization under data and resource heterogeneity. KAFL includes two synchronization algorithms, K-FedAsync and M-step-FedAsync, which trade off system efficiency against model accuracy. Compared with fully asynchronous training, they help the server reduce training bias and move in a better direction toward the global optimum, since the server aggregates information from at least K clients or M distinct model updates. We also introduce a weighted aggregation method that determines each client's aggregation weight according to its weight deviation and contribution frequency.
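A minimal sketch of the K-FedAsync idea, under stated assumptions: the function name is hypothetical, the model is a plain float to keep the example small, and the weighting by staleness and contribution count is a simplified stand-in for the thesis's weighting by weight deviation and contribution frequency.

```python
def k_fedasync_aggregate(global_model, updates, k, alpha=0.5):
    """K-FedAsync-style aggregation (illustrative sketch).

    `updates` is a list of (delta, staleness, contrib_count) tuples
    buffered since the last aggregation. The server aggregates only
    once at least `k` clients have reported; each delta is
    down-weighted by its staleness and by how often that client has
    already contributed.
    """
    if len(updates) < k:
        return global_model  # not enough clients yet: keep waiting
    raw = [1.0 / ((1 + staleness) * (1 + contribs))
           for _delta, staleness, contribs in updates]
    total = sum(raw)
    # Convex combination of the buffered deltas, then a damped step.
    avg_delta = sum(w / total * delta
                    for (delta, _s, _c), w in zip(updates, raw))
    return global_model + alpha * avg_delta
```

Waiting for K clients rather than acting on every single arrival is what bounds the bias of a fully asynchronous server while still avoiding a full synchronous barrier.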
Finally, we extend FL training from deep neural networks (DNNs) to graph neural networks (GNNs) and propose the EmbC-FGNN framework. EmbC-FGNN enables node-embedding communication among edge clients with privacy guarantees. It contains an embedding server (ES) that maintains the shared embedding vectors, allowing edge clients to expand their local subgraphs without revealing local node features or graph topology. To minimize the communication cost of the ES, EmbC-FGNN adopts a periodic embedding synchronization strategy that reduces communication frequency, and it further applies the K-FedAsync and M-step-FedAsync algorithms to accelerate convergence.
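The periodic embedding synchronization can be sketched as below. All names (`EmbeddingServer`, `push`, `pull`, `sync_period`) are illustrative assumptions, not the thesis's actual API, and the local GNN update is elided.

```python
class EmbeddingServer:
    """Toy embedding server (ES) sketch for EmbC-FGNN-style training.

    Clients push updated embeddings for their boundary nodes and pull
    neighbours' embeddings, but only every `sync_period` local steps,
    trading embedding freshness for lower communication frequency.
    """
    def __init__(self):
        self.table = {}  # node_id -> embedding vector

    def push(self, embeddings):
        self.table.update(embeddings)

    def pull(self, node_ids):
        return {n: self.table[n] for n in node_ids if n in self.table}

def local_training_loop(server, steps, sync_period, my_embeddings, neighbour_ids):
    """Run `steps` local steps, synchronizing with the ES periodically."""
    syncs = 0
    for step in range(1, steps + 1):
        # ... local GNN update of my_embeddings would happen here ...
        if step % sync_period == 0:  # periodic synchronization point
            server.push(my_embeddings)
            _ = server.pull(neighbour_ids)
            syncs += 1
    return syncs
```

Because clients exchange only embedding vectors with the ES, raw node features and the local graph topology never leave the device, which is the privacy property the framework relies on.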
Degree: Doctor of Philosophy
Subject: Machine learning
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/328561
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Wang, CL | - |
dc.contributor.author | Wu, Xueyu | - |
dc.contributor.author | 吴雪玉 | - |
dc.date.accessioned | 2023-06-29T05:44:14Z | - |
dc.date.available | 2023-06-29T05:44:14Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | Wu, X. [吴雪玉]. (2023). Optimizing federated learning with communication reduction and synchronization control. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/328561 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Machine learning | - |
dc.title | Optimizing federated learning with communication reduction and synchronization control | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2023 | - |
dc.identifier.mmsid | 991044695781903414 | - |