
Postgraduate thesis: Exploiting tensor networks for efficient machine learning

Title: Exploiting tensor networks for efficient machine learning
Authors: Chen, Cong (陳琮)
Advisors: Wong, N
Issue Date: 2021
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Chen, C. [陳琮]. (2021). Exploiting tensor networks for efficient machine learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Many real-world data appear naturally in matrix or tensor format. For example, a grayscale picture is a 2-way tensor (i.e., a matrix), a color image or a grayscale video is naturally a 3-way tensor, and a color video can be regarded as a 4-way tensor, and so on. However, most conventional machine learning algorithms are vector-based and cannot handle tensorial data directly. The most common workaround is to vectorize the tensorial data, which leads to high-dimensional vectors. Such an operation discards structural information that has been shown to be useful in many machine learning tasks. Therefore, extending vector-based machine learning methods to tensorial formats is highly desirable. On the other hand, the need for on-device machine learning arises when decisions based on data processing must be made immediately. In this case, the computing task must be accomplished with limited resources, such as computing time, storage space, and battery power. It is therefore preferable to compress a machine learning model before deploying it on edge devices. In summary, to achieve efficient machine learning, this thesis explores the tensorization and compression of machine learning models. Specifically, the thesis first considers two typical vector-based machine learning models, namely linear support vector machines (SVMs) and restricted Boltzmann machines (RBMs), whose main parameters are stored in a vector and a matrix, respectively. To tensorize these two models, the weight vector and matrix are represented as low-rank tensor networks, which dramatically reduces the number of parameters and thus alleviates overfitting when the training sample size is small. Second, since most real-life data are not linearly separable, the thesis further investigates kernel tricks for the tensorial extension of linear SVMs. The introduced kernel trick is designed for data in tensor train format (a kind of tensor network) and makes it possible to apply different kernel functions to different data modes. Third, the compression of sum-product networks (SPNs) is investigated. SPNs constitute an emerging class of neural networks with clear probabilistic semantics and superior inference speed compared with other graphical models. The thesis reveals an important connection between SPNs and tensor trains: transforming an SPN into a tensor train allows the sharing of weights that are originally distributed across the SPN tree, often leading to a dramatic reduction in the number of network parameters with little or negligible loss of modeling accuracy. Fourth, the thesis proposes LiteGT, which aims to reduce the computation and storage complexity of the vanilla graph transformer model. Experiments demonstrate that LiteGT reduces computation by more than 100× and halves the model size without performance degradation.
Degree: Doctor of Philosophy
Subjects: Tensor products; Machine learning
Dept/Program: Electrical and Electronic Engineering
Persistent Identifier: http://hdl.handle.net/10722/308618
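The abstract's central claim, that a weight vector represented as a low-rank tensor train needs far fewer parameters than its dense counterpart, can be illustrated with a minimal NumPy sketch. This is not code from the thesis: the helper names (tt_svd, tt_contract), the mode sizes, and the rank are illustrative assumptions, and the tensorized SVM/RBM models described above would typically learn the tensor-train cores directly rather than decompose a given dense vector.

```python
import numpy as np

def tt_svd(vector, dims, max_rank):
    # Illustrative TT-SVD: view a length-prod(dims) vector as a d-way tensor and
    # split it into tensor-train cores by successive truncated SVDs.
    cores, rank = [], 1
    mat = np.asarray(vector).reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(rank, dims[k], r))
        mat = (s[:r, None] * vt[:r]).reshape(r * dims[k + 1], -1)
        rank = r
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

def tt_contract(cores):
    # Contract the cores back into the full (dense) vector.
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape(-1)

# Hypothetical weight vector of length 4**5 = 1024, built to have TT-rank 3.
rng = np.random.default_rng(0)
dims, r = [4] * 5, 3
true_cores = [rng.standard_normal((1 if i == 0 else r, 4, r if i < 4 else 1))
              for i in range(5)]
w = tt_contract(true_cores)

cores = tt_svd(w, dims, max_rank=r)
print(sum(c.size for c in cores), "TT parameters vs", w.size, "dense parameters")  # 132 vs 1024
print(np.allclose(tt_contract(cores), w))  # True: exact for this low-rank vector
```

For this toy vector the five cores hold 132 numbers instead of 1024, and the reconstruction is exact because the vector was constructed to have low TT-rank; for general data the truncation rank trades accuracy for compression.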

 

DC Field | Value | Language
dc.contributor.advisor | Wong, N | -
dc.contributor.author | Chen, Cong | -
dc.contributor.author | 陳琮 | -
dc.date.accessioned | 2021-12-06T01:04:00Z | -
dc.date.available | 2021-12-06T01:04:00Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Chen, C. [陳琮]. (2021). Exploiting tensor networks for efficient machine learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/308618 | -
dc.description.abstract | Many real-world data appear naturally in matrix or tensor format. For example, a grayscale picture is a 2-way tensor (i.e., a matrix), a color image or a grayscale video is naturally a 3-way tensor, and a color video can be regarded as a 4-way tensor, and so on. However, most conventional machine learning algorithms are vector-based and cannot handle tensorial data directly. The most common workaround is to vectorize the tensorial data, which leads to high-dimensional vectors. Such an operation discards structural information that has been shown to be useful in many machine learning tasks. Therefore, extending vector-based machine learning methods to tensorial formats is highly desirable. On the other hand, the need for on-device machine learning arises when decisions based on data processing must be made immediately. In this case, the computing task must be accomplished with limited resources, such as computing time, storage space, and battery power. It is therefore preferable to compress a machine learning model before deploying it on edge devices. In summary, to achieve efficient machine learning, this thesis explores the tensorization and compression of machine learning models. Specifically, the thesis first considers two typical vector-based machine learning models, namely linear support vector machines (SVMs) and restricted Boltzmann machines (RBMs), whose main parameters are stored in a vector and a matrix, respectively. To tensorize these two models, the weight vector and matrix are represented as low-rank tensor networks, which dramatically reduces the number of parameters and thus alleviates overfitting when the training sample size is small. Second, since most real-life data are not linearly separable, the thesis further investigates kernel tricks for the tensorial extension of linear SVMs. The introduced kernel trick is designed for data in tensor train format (a kind of tensor network) and makes it possible to apply different kernel functions to different data modes. Third, the compression of sum-product networks (SPNs) is investigated. SPNs constitute an emerging class of neural networks with clear probabilistic semantics and superior inference speed compared with other graphical models. The thesis reveals an important connection between SPNs and tensor trains: transforming an SPN into a tensor train allows the sharing of weights that are originally distributed across the SPN tree, often leading to a dramatic reduction in the number of network parameters with little or negligible loss of modeling accuracy. Fourth, the thesis proposes LiteGT, which aims to reduce the computation and storage complexity of the vanilla graph transformer model. Experiments demonstrate that LiteGT reduces computation by more than 100× and halves the model size without performance degradation. | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Tensor products | -
dc.subject.lcsh | Machine learning | -
dc.title | Exploiting tensor networks for efficient machine learning | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Electrical and Electronic Engineering | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2021 | -
dc.identifier.mmsid | 991044448910503414 | -
