postgraduate thesis: Optimizing federated learning with communication reduction and synchronization control
Title: Optimizing federated learning with communication reduction and synchronization control
Authors: Wu, Xueyu (吴雪玉)
Advisors: Wang, CL
Issue Date: 2023
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Wu, X. [吴雪玉]. (2023). Optimizing federated learning with communication reduction and synchronization control. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Machine Learning (ML) models have driven breakthroughs in natural language processing, speech recognition, computer vision, and many other fields. These models have grown deeper in recent years, requiring large-scale datasets to produce more accurate models. However, the traditional datacenter approach typically requires uploading the input data to a centralized server for model training. With ever-increasing concerns about data privacy, this approach faces serious challenges when privacy is paramount.
Federated Learning (FL) has emerged as a promising alternative ML paradigm that leverages user data for model training in a privacy-preserving way. In the FL setting, edge devices collaboratively train a shared model under the orchestration of a central server while keeping their data local. Yet training performance is often poor for several reasons, including heavy communication overhead, system heterogeneity, and data heterogeneity. Cross-device FL training usually involves a massive number of resource-constrained edge devices connected over intermittent networks, a vastly heterogeneous environment. The unbalanced, non-independent and identically distributed (Non-IID) nature of the data also degrades accuracy. This thesis highlights the primary challenges in FL systems and proposes three efficient optimization strategies specialized for the cross-device federated learning setting.
We first propose a structure-based communication reduction algorithm, FedSCR, that reduces the number of parameters transferred over the network without compromising accuracy. We study the parameter update pattern under the Non-IID setting and observe a clear update tendency over channels and filters. FedSCR therefore quantifies the magnitude of model updates by aggregating them over channels and filters, then removes insignificant updates whose aggregated values fall below a threshold. We further introduce an adaptive FedSCR that dynamically adjusts the threshold for each client based on its weight divergence, the magnitude of its local updates, and its data distribution.
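The structure-based sparsification described above can be sketched as follows. This is an illustrative approximation, not the thesis's actual implementation: the function name, tensor layout, and single shared threshold are assumptions, and the residual accumulation FedSCR-style schemes typically keep is only noted in the docstring.

```python
import numpy as np

def fedscr_sparsify(update, threshold):
    """Structure-based sparsification in the spirit of FedSCR (sketch).

    `update` is a conv-layer weight delta of shape
    (out_channels, in_channels, k, k). Filters and channels whose
    aggregated update magnitude falls below `threshold` are dropped
    (zeroed) before upload; in a full implementation the dropped mass
    would be accumulated locally as a residual for later rounds.
    """
    # Aggregate |update| over each output filter and each input channel.
    filter_mag = np.abs(update).sum(axis=(1, 2, 3))   # one value per filter
    channel_mag = np.abs(update).sum(axis=(0, 2, 3))  # one value per channel
    sparse = update.copy()
    sparse[filter_mag < threshold, :, :, :] = 0.0     # drop weak filters
    sparse[:, channel_mag < threshold, :, :] = 0.0    # drop weak channels
    return sparse
```

Only the structurally significant parts of the update are transmitted, which is why the scheme can cut communication without the accuracy loss of unstructured element-wise pruning.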
We then address the straggler problem with KAFL, a fast K-asynchronous FL framework that controls model synchronization under data and resource heterogeneity. KAFL includes two synchronization algorithms, K-FedAsync and M-step-FedAsync, which trade off system efficiency against model accuracy. Compared with fully asynchronous training, they help the server reduce training bias and move in a better direction toward the global optimum, since the server aggregates information from at least K clients or M distinct model updates. We also introduce a weighted aggregation method that determines each client's aggregation weight according to its weight deviation and contribution frequency.
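A minimal sketch of the K-FedAsync idea, under stated assumptions: the function name is hypothetical, the model is a plain float to keep the example small, and the weighting by staleness and contribution count is a simplified stand-in for the thesis's weighting by weight deviation and contribution frequency.

```python
def k_fedasync_aggregate(global_model, updates, k, alpha=0.5):
    """K-FedAsync-style aggregation (illustrative sketch).

    `updates` is a list of (delta, staleness, contrib_count) tuples
    buffered since the last aggregation. The server aggregates only
    once at least `k` clients have reported; each delta is
    down-weighted by its staleness and by how often that client has
    already contributed.
    """
    if len(updates) < k:
        return global_model  # not enough clients yet: keep waiting
    raw = [1.0 / ((1 + staleness) * (1 + contribs))
           for _delta, staleness, contribs in updates]
    total = sum(raw)
    # Convex combination of the buffered deltas, then a damped step.
    avg_delta = sum(w / total * delta
                    for (delta, _s, _c), w in zip(updates, raw))
    return global_model + alpha * avg_delta
```

Waiting for K clients rather than acting on every single arrival is what bounds the bias of a fully asynchronous server while still avoiding a full synchronous barrier.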
Finally, we extend FL training from deep neural networks (DNNs) to graph neural networks (GNNs) and propose the EmbC-FGNN framework. EmbC-FGNN enables node-embedding communication among edge clients with privacy guarantees. It contains an embedding server (ES) that maintains the shared embedding vectors, allowing edge clients to expand their local subgraphs without revealing local node features or graph topology. To minimize the communication cost of the ES, EmbC-FGNN adopts a periodic embedding synchronization strategy that reduces communication frequency, and it further applies the K-FedAsync and M-step-FedAsync algorithms to accelerate convergence.
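The periodic embedding synchronization can be sketched as below. All names (`EmbeddingServer`, `push`, `pull`, `sync_period`) are illustrative assumptions, not the thesis's actual API, and the local GNN update is elided.

```python
class EmbeddingServer:
    """Toy embedding server (ES) sketch for EmbC-FGNN-style training.

    Clients push updated embeddings for their boundary nodes and pull
    neighbours' embeddings, but only every `sync_period` local steps,
    trading embedding freshness for lower communication frequency.
    """
    def __init__(self):
        self.table = {}  # node_id -> embedding vector

    def push(self, embeddings):
        self.table.update(embeddings)

    def pull(self, node_ids):
        return {n: self.table[n] for n in node_ids if n in self.table}

def local_training_loop(server, steps, sync_period, my_embeddings, neighbour_ids):
    """Run `steps` local steps, synchronizing with the ES periodically."""
    syncs = 0
    for step in range(1, steps + 1):
        # ... local GNN update of my_embeddings would happen here ...
        if step % sync_period == 0:  # periodic synchronization point
            server.push(my_embeddings)
            _ = server.pull(neighbour_ids)
            syncs += 1
    return syncs
```

Because clients exchange only embedding vectors with the ES, raw node features and the local graph topology never leave the device, which is the privacy property the framework relies on.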
Degree: Doctor of Philosophy
Subject: Machine learning
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/328561
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Wang, CL | - |
dc.contributor.author | Wu, Xueyu | - |
dc.contributor.author | 吴雪玉 | - |
dc.date.accessioned | 2023-06-29T05:44:14Z | - |
dc.date.available | 2023-06-29T05:44:14Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | Wu, X. [吴雪玉]. (2023). Optimizing federated learning with communication reduction and synchronization control. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/328561 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Machine learning | - |
dc.title | Optimizing federated learning with communication reduction and synchronization control | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2023 | - |
dc.identifier.mmsid | 991044695781903414 | - |