
Postgraduate thesis: Memory and communication efficient protocols for machine learning applications

Title: Memory and communication efficient protocols for machine learning applications
Authors: Wen, Ting [温婷]
Advisor(s): Chan, HTH
Issue Date: 2023
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Wen, T. [温婷]. (2023). Memory and communication efficient protocols for machine learning applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Machine learning has been widely used in real-life applications. In this thesis, we study memory and communication efficient protocols for a challenging machine learning task, stock price prediction. First, we propose the Adaptive Aggregated-Relational Transformer (AART) model, which achieves the following: (1) capturing correlations between the prices of different stocks, (2) discovering long-term trends in the extensive history of stocks, and (3) recognizing short-term patterns on a finer time scale. The inter-stock effect is extracted by aggregating neighbors' information in the Transformer encoder. Furthermore, a data compression method is introduced to extend the historical window while preserving computational efficiency, and one-dimensional convolution is used to capture short-term patterns in the historical data. To accelerate training, a centralized distributed system is applied that employs the Adam optimizer with top-k sparsification, adaptive learning rates, and error feedback mechanisms. The combination of sparsification and error feedback can significantly reduce communication cost without compromising the accuracy of the resulting model, making it an effective method for training complex neural networks. Furthermore, we consider electing a leader among the worker nodes to act as the central server in the distributed system. In a game-theoretically fair leader election protocol, roughly speaking, we require that even a majority coalition cannot hurt the winning chance of any honest individual (maximin fairness). The folklore tournament protocol, which runs in $\Theta(\log(n))$ rounds for $n$ nodes, is used to implement maximin-fair leader election among the worker nodes in the network. Our research shows that $\Theta(\log(n))$ is a lower bound on the round complexity under the commit-reveal framework against computationally bounded malicious adversaries.
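The top-k sparsification with error feedback mentioned in the abstract can be sketched as follows. This is a generic NumPy illustration of the technique, not the thesis implementation; the names `topk_sparsify` and `ErrorFeedbackWorker` are illustrative. Each worker transmits only the k largest-magnitude gradient entries per step and folds the dropped mass back into the next step's gradient, so no gradient information is permanently lost.

```python
import numpy as np

def topk_sparsify(grad, k):
    """Keep the k largest-magnitude entries of grad; zero out the rest."""
    sparse = np.zeros_like(grad)
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    sparse[idx] = grad[idx]
    return sparse

class ErrorFeedbackWorker:
    """One worker in the distributed system: accumulates the sparsification
    error locally and adds it back to the next gradient before compressing
    (the error feedback mechanism)."""
    def __init__(self, dim, k):
        self.residual = np.zeros(dim)  # error memory carried across steps
        self.k = k

    def compress(self, grad):
        corrected = grad + self.residual           # add back past error
        sparse = topk_sparsify(corrected, self.k)  # only k entries are sent
        self.residual = corrected - sparse         # remember what was dropped
        return sparse
```

In a full pipeline the sparse vectors would be sent to the central server and combined with an adaptive-learning-rate optimizer such as Adam; the sketch isolates only the compression step.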
Extensive empirical studies on real-world data demonstrate that the AART model, combined with distributed training, achieves better accuracy and computational efficiency than state-of-the-art competitors. This study contributes to the research on fair and communication-efficient distributed systems for machine learning, with promising practical applications in various domains.
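As a rough illustration of the folklore tournament protocol referenced in the abstract, the sketch below simulates its pairing structure: survivors are paired in each round and each pair's winner is decided by a two-party coin flip, so a leader emerges after $\lceil \log_2 n \rceil$ rounds. The commit-reveal step that makes each flip unbiasable against a malicious opponent is abstracted away here as an honest random bit; `duel` and `tournament_leader` are illustrative names.

```python
import secrets

def duel(a, b):
    """Two-party coin flip (simulated): each party contributes a random
    bit and the XOR decides the winner. In the real commit-reveal
    protocol both bits are committed before being revealed, so neither
    party can bias the outcome after seeing the other's bit."""
    bit = secrets.randbits(1) ^ secrets.randbits(1)
    return a if bit == 0 else b

def tournament_leader(players):
    """Folklore tournament: pair up survivors each round until one
    remains. Returns the leader and the number of rounds used."""
    survivors = list(players)
    rounds = 0
    while len(survivors) > 1:
        nxt = []
        for i in range(0, len(survivors) - 1, 2):
            nxt.append(duel(survivors[i], survivors[i + 1]))
        if len(survivors) % 2 == 1:   # odd player out gets a bye
            nxt.append(survivors[-1])
        survivors = nxt
        rounds += 1
    return survivors[0], rounds
```

With honest random bits every player wins each duel with probability 1/2, so each of the $n$ players is elected with probability $1/n$; the thesis's maximin-fairness guarantee concerns the harder setting where a majority coalition plays adversarially within the commit-reveal framework.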
Degree: Doctor of Philosophy
Subject: Machine learning
Stock price forecasting - Computer simulation
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/328918

 

DC Field: Value
dc.contributor.advisor: Chan, HTH
dc.contributor.author: Wen, Ting
dc.contributor.author: 温婷
dc.date.accessioned: 2023-08-01T06:48:14Z
dc.date.available: 2023-08-01T06:48:14Z
dc.date.issued: 2023
dc.identifier.citation: Wen, T. [温婷]. (2023). Memory and communication efficient protocols for machine learning applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/328918
dc.description.abstract: Machine learning has been widely used in real-life applications. In this thesis, we study memory and communication efficient protocols for a challenging machine learning task, stock price prediction. First, we propose the Adaptive Aggregated-Relational Transformer (AART) model, which achieves the following: (1) capturing correlations between the prices of different stocks, (2) discovering long-term trends in the extensive history of stocks, and (3) recognizing short-term patterns on a finer time scale. The inter-stock effect is extracted by aggregating neighbors' information in the Transformer encoder. Furthermore, a data compression method is introduced to extend the historical window while preserving computational efficiency, and one-dimensional convolution is used to capture short-term patterns in the historical data. To accelerate training, a centralized distributed system is applied that employs the Adam optimizer with top-k sparsification, adaptive learning rates, and error feedback mechanisms. The combination of sparsification and error feedback can significantly reduce communication cost without compromising the accuracy of the resulting model, making it an effective method for training complex neural networks. Furthermore, we consider electing a leader among the worker nodes to act as the central server in the distributed system. In a game-theoretically fair leader election protocol, roughly speaking, we require that even a majority coalition cannot hurt the winning chance of any honest individual (maximin fairness). The folklore tournament protocol, which runs in $\Theta(\log(n))$ rounds for $n$ nodes, is used to implement maximin-fair leader election among the worker nodes in the network. Our research shows that $\Theta(\log(n))$ is a lower bound on the round complexity under the commit-reveal framework against computationally bounded malicious adversaries. Extensive empirical studies on real-world data demonstrate that the AART model, combined with distributed training, achieves better accuracy and computational efficiency than state-of-the-art competitors. This study contributes to the research on fair and communication-efficient distributed systems for machine learning, with promising practical applications in various domains.
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Stock price forecasting - Computer simulation
dc.title: Memory and communication efficient protocols for machine learning applications
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Computer Science
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2023
dc.identifier.mmsid: 991044705905503414
