Postgraduate thesis: Exploiting tensor networks for efficient machine learning
Field | Value |
---|---|
Title | Exploiting tensor networks for efficient machine learning |
Authors | Chen, Cong (陳琮) |
Advisors | Wong, N |
Issue Date | 2021 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Chen, C. [陳琮]. (2021). Exploiting tensor networks for efficient machine learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Many kinds of real-world data appear naturally in matrix or tensor format. For example, a grayscale picture is a 2-way tensor (i.e., a matrix), a color image or a grayscale video is a 3-way tensor, and a color video can be regarded as a 4-way tensor. However, most conventional machine learning algorithms are vector-based and cannot handle tensorial data directly. The most common workaround is to vectorize the tensorial data, which produces high-dimensional vectors and discards structural information that has proved useful in many machine learning tasks. Extending vector-based machine learning methods to tensorial formats is therefore highly desirable. On the other hand, the need for on-device machine learning arises when decisions based on data processing must be made immediately; the computation must then be carried out with limited resources such as computing time, storage space, and battery power, so it is preferable to compress a machine learning model before deploying it on edge devices. To achieve efficient machine learning, this thesis therefore explores the tensorization and compression of machine learning models. First, it considers two typical vector-based models, linear support vector machines (SVMs) and restricted Boltzmann machines (RBMs), whose main parameters are stored in a vector and a matrix, respectively. To tensorize these models, the weight vector and matrix are represented as low-rank tensor networks, which dramatically reduces the number of parameters and thereby alleviates overfitting when the training sample size is small. Second, since most real-life data are not linearly separable, the thesis investigates kernel tricks for the tensorial extension of linear SVMs. The introduced kernel trick is designed for data in tensor train format (a kind of tensor network) and allows a different kernel function to be applied to each data mode. Third, the compression of sum-product networks (SPNs) is investigated. SPNs constitute an emerging class of neural networks with clear probabilistic semantics and superior inference speed over other graphical models. The thesis reveals an important connection between SPNs and tensor trains: transforming an SPN into a tensor train allows the sharing of weights that were originally distributed across the SPN tree, often yielding a dramatic reduction in the number of network parameters with little or negligible loss of modeling accuracy. Fourth, the thesis proposes LiteGT, which reduces the computation and storage complexity of the vanilla graph transformer. Experiments demonstrate that LiteGT reduces computation by more than 100× and halves the model size without performance degradation. |
Degree | Doctor of Philosophy |
Subject | Tensor products; Machine learning |
Dept/Program | Electrical and Electronic Engineering |
Persistent Identifier | http://hdl.handle.net/10722/308618 |
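
The parameter saving that the abstract attributes to tensor-train (TT) representations can be made concrete with a small sketch. The NumPy code below is a minimal illustration, not the thesis implementation: the mode sizes, the TT-rank of 3, and the helper `tt_to_full` are all assumptions chosen for readability. It stores a length-4096 weight vector as four small TT-cores and compares parameter counts.

```python
# Minimal sketch (not the thesis code) of tensorizing a linear model's weights:
# a length-4096 weight vector is viewed as an 8x8x8x8 tensor and stored as a
# tensor train (TT), so far fewer parameters are kept than in the full vector.
import numpy as np

dims = [8, 8, 8, 8]            # 8*8*8*8 = 4096 entries in the full weight vector
rank = 3                       # hypothetical TT-rank; real ranks are tuned per task
ranks = [1, rank, rank, rank, 1]

rng = np.random.default_rng(0)

# One TT-core per mode: core k has shape (r_{k-1}, n_k, r_k).
cores = [rng.standard_normal((ranks[k], dims[k], ranks[k + 1]))
         for k in range(len(dims))]

def tt_to_full(cores):
    """Contract the TT-cores back into the full tensor, then flatten."""
    full = cores[0]                          # shape (1, n_1, r_1)
    for core in cores[1:]:
        # Merge the trailing rank index of `full` with the leading one of `core`.
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape(-1)                  # length prod(dims) = 4096

w = tt_to_full(cores)

tt_params = sum(c.size for c in cores)
print(f"full vector parameters: {w.size}")      # 4096
print(f"TT parameters:          {tt_params}")   # 24 + 72 + 72 + 24 = 192
```

The reconstructed `w` can stand in wherever the dense weight vector is used, e.g. in an SVM decision function w·x + b. More generally, a TT with d modes of size n and rank r stores O(d n r²) numbers instead of n^d, which is why low ranks can sharply cut the parameter count and, as the abstract notes, help against overfitting when training samples are few.
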
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Wong, N | - |
dc.contributor.author | Chen, Cong | - |
dc.contributor.author | 陳琮 | - |
dc.date.accessioned | 2021-12-06T01:04:00Z | - |
dc.date.available | 2021-12-06T01:04:00Z | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | Chen, C. [陳琮]. (2021). Exploiting tensor networks for efficient machine learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/308618 | - |
dc.description.abstract | (abstract as given above) | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Tensor products | - |
dc.subject.lcsh | Machine learning | - |
dc.title | Exploiting tensor networks for efficient machine learning | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2021 | - |
dc.identifier.mmsid | 991044448910503414 | - |
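
As a companion to the sketch above, the following example illustrates the abstract's point that a kernel for tensor-format data can apply a different kernel function to each data mode. The product-of-mode-kernels construction, the RBF and polynomial choices, and the row-wise comparison of mode unfoldings are illustrative assumptions, not the thesis's exact formulation.

```python
# Hedged sketch of a mode-wise kernel for tensorial data: a product of
# per-mode kernels lets a different kernel act on each mode of the samples.
import numpy as np

def rbf(a, b, gamma=0.1):
    """Gaussian RBF kernel between two vectors."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def poly(a, b, degree=2, c=1.0):
    """Polynomial kernel between two vectors."""
    return (a @ b + c) ** degree

def modewise_kernel(X, Y, mode_kernels):
    """Product of per-mode kernels between two tensor-valued samples X, Y.

    Mode k is compared via mode_kernels[k], applied row-by-row to the
    mode-k unfoldings of X and Y."""
    k = 1.0
    for mode, kern in enumerate(mode_kernels):
        # Unfold along `mode` and compare the resulting matrices row-by-row.
        Xm = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
        Ym = np.moveaxis(Y, mode, 0).reshape(Y.shape[mode], -1)
        k *= np.mean([kern(x, y) for x, y in zip(Xm, Ym)])
    return k

rng = np.random.default_rng(1)
X, Y = rng.standard_normal((2, 4, 6))        # two 4x6 (2-way) samples
print(modewise_kernel(X, Y, [rbf, poly]))    # RBF on mode 0, polynomial on mode 1
```

Since a product of positive semidefinite kernels is itself positive semidefinite, such a mode-wise construction remains a valid kernel while letting each data mode be matched with a kernel suited to it.
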