Appears in Collections: postgraduate thesis: Memory and communication efficient protocols for machine learning applications
Title | Memory and communication efficient protocols for machine learning applications |
---|---|
Authors | Wen, Ting (温婷) |
Advisors | Chan, HTH |
Issue Date | 2023 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Wen, T. [温婷]. (2023). Memory and communication efficient protocols for machine learning applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Machine learning has been widely used in real-life applications. In this thesis, we study memory and communication efficient protocols for a challenging machine learning task, stock price prediction.
First, we propose the Adaptive Aggregated-Relational Transformer (AART) model, which
can capture the following: (1) correlations between the prices of different stocks, (2) long-term trends in the extensive history of stocks, and (3) short-term patterns at a finer time scale. The inter-stock effect is extracted by aggregating neighbors' information in the Transformer encoder. Furthermore, a data compression method is introduced to cover a longer history while ensuring computational efficiency, and a one-dimensional convolution is used to capture short-term patterns in the historical data.
To accelerate training, a centralized distributed training system is employed that combines the Adam optimizer with top-k sparsification, adaptive learning rates, and error-feedback mechanisms. The combination of sparsification and error feedback significantly reduces computational cost without compromising the accuracy of the trained model, making it an effective method for training complex neural networks.
Furthermore, we consider electing a leader among the worker nodes to take on the additional duties of the central server in a distributed system. In a game-theoretically fair leader election protocol, roughly speaking, we require that even a majority coalition cannot reduce any honest individual's chance of winning (maximin fairness). The folklore tournament protocol, which runs in $\Theta(\log n)$ rounds for $n$ nodes, is used to implement maximin-fair leader election among the worker nodes in the network. Our research shows that $\Omega(\log n)$ is a lower bound on the round complexity under the commit-reveal framework against computationally bounded malicious adversaries.
Extensive empirical studies on real-world data demonstrate that the AART model, combined with distributed training, achieves better accuracy and computational efficiency than state-of-the-art competitors. This study contributes to research on fair and communication-efficient distributed systems for machine learning, with promising practical applications in various domains. |
Degree | Doctor of Philosophy |
Subject | Machine learning; Stock price forecasting -- Computer simulation |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/328918 |
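The abstract's combination of top-k sparsification with error feedback can be sketched as follows. This is an illustrative example of the general technique, not code from the thesis; all names (`topk_sparsify`, `ErrorFeedbackWorker`) are hypothetical.

```python
# Illustrative sketch of top-k gradient sparsification with error
# feedback, the compression technique named in the abstract.
# Pure Python for clarity; all names are hypothetical.

def topk_sparsify(grad, k):
    """Keep only the k largest-magnitude entries of grad; zero the rest."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]

class ErrorFeedbackWorker:
    """A worker that accumulates the compression error and adds it back
    to the next gradient, so dropped coordinates are not lost forever."""

    def __init__(self, dim, k):
        self.residual = [0.0] * dim
        self.k = k

    def compress(self, grad):
        # Error feedback: re-inject previously dropped mass.
        corrected = [g + r for g, r in zip(grad, self.residual)]
        # Top-k step: only k coordinates survive.
        sparse = topk_sparsify(corrected, self.k)
        # Remember what was dropped for the next round.
        self.residual = [c - s for c, s in zip(corrected, sparse)]
        return sparse
```

Only the k nonzero entries of `sparse` need to be communicated to the central server each step; the residual guarantees that every coordinate's contribution is eventually transmitted.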
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Chan, HTH | - |
dc.contributor.author | Wen, Ting | - |
dc.contributor.author | 温婷 | - |
dc.date.accessioned | 2023-08-01T06:48:14Z | - |
dc.date.available | 2023-08-01T06:48:14Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | Wen, T. [温婷]. (2023). Memory and communication efficient protocols for machine learning applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/328918 | - |
dc.description.abstract | Machine learning has been widely used in real-life applications. In this thesis, we study memory and communication efficient protocols for a challenging machine learning task, stock price prediction. First, we propose the Adaptive Aggregated-Relational Transformer (AART) model, which can capture the following: (1) correlations between the prices of different stocks, (2) long-term trends in the extensive history of stocks, and (3) short-term patterns at a finer time scale. The inter-stock effect is extracted by aggregating neighbors' information in the Transformer encoder. Furthermore, a data compression method is introduced to cover a longer history while ensuring computational efficiency, and a one-dimensional convolution is used to capture short-term patterns in the historical data. To accelerate training, a centralized distributed training system is employed that combines the Adam optimizer with top-k sparsification, adaptive learning rates, and error-feedback mechanisms. The combination of sparsification and error feedback significantly reduces computational cost without compromising the accuracy of the trained model, making it an effective method for training complex neural networks. Furthermore, we consider electing a leader among the worker nodes to take on the additional duties of the central server in a distributed system. In a game-theoretically fair leader election protocol, roughly speaking, we require that even a majority coalition cannot reduce any honest individual's chance of winning (maximin fairness). The folklore tournament protocol, which runs in $\Theta(\log n)$ rounds for $n$ nodes, is used to implement maximin-fair leader election among the worker nodes in the network. Our research shows that $\Omega(\log n)$ is a lower bound on the round complexity under the commit-reveal framework against computationally bounded malicious adversaries. Extensive empirical studies on real-world data demonstrate that the AART model, combined with distributed training, achieves better accuracy and computational efficiency than state-of-the-art competitors. This study contributes to research on fair and communication-efficient distributed systems for machine learning, with promising practical applications in various domains. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Machine learning | - |
dc.subject.lcsh | Stock price forecasting -- Computer simulation | - |
dc.title | Memory and communication efficient protocols for machine learning applications | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2023 | - |
dc.identifier.mmsid | 991044705905503414 | - |
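The folklore tournament protocol mentioned in the abstract pairs nodes up each round and eliminates one node per pair, so a single leader remains after $\lceil \log_2 n \rceil$ rounds. A minimal simulation of that round structure follows; the commit-reveal coin flip between paired nodes is replaced here by a local random bit, and all names are hypothetical.

```python
# Hypothetical simulation of the round structure of the folklore
# tournament leader-election protocol: nodes are paired each round,
# each pair eliminates one member (in the real protocol via a
# commit-reveal coin flip, simulated here by a local random bit),
# so one leader remains after ceil(log2 n) rounds.
import random

def tournament_leader(players, rng=None):
    rng = rng or random.Random(0)
    rounds = 0
    while len(players) > 1:
        winners = []
        for i in range(0, len(players) - 1, 2):
            # Coin flip decides which member of the pair advances.
            winners.append(players[i] if rng.random() < 0.5 else players[i + 1])
        if len(players) % 2 == 1:  # an unpaired node advances for free
            winners.append(players[-1])
        players = winners
        rounds += 1
    return players[0], rounds
```

For example, `tournament_leader(list(range(8)))` always finishes in 3 rounds, and 5 nodes need $\lceil \log_2 5 \rceil = 3$ rounds, matching the $\Theta(\log n)$ round complexity discussed in the abstract.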