
Postgraduate thesis: Exploiting tensor networks for efficient machine learning

Title: Exploiting tensor networks for efficient machine learning
Authors: Chen, Cong (陳琮)
Advisors: Wong, N
Issue Date: 2021
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Chen, C. [陳琮]. (2021). Exploiting tensor networks for efficient machine learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Many real-world data appear naturally in matrix or tensor format. For example, a grayscale picture is a 2-way tensor (i.e., a matrix), a color image or a grayscale video is naturally a 3-way tensor, and a color video can be regarded as a 4-way tensor, and so on. However, most conventional machine learning algorithms are vector-based and cannot handle tensorial data directly. The most common workaround is to vectorize the tensorial data, which leads to high-dimensional vectors. Such an operation discards structural information that has been shown to be useful in many machine learning tasks. Therefore, extending vector-based machine learning methods to tensorial formats is highly desirable. On the other hand, the need for on-device machine learning arises when decisions based on data processing must be made immediately. In this case, the computing task must be accomplished with limited resources, such as computing time, storage space, and battery power. It is therefore preferable to compress a machine learning model before deploying it on edge devices. In summary, to achieve efficient machine learning, this thesis explores the tensorization and compression of machine learning models. Specifically, the thesis first considers two typical vector-based machine learning models, namely linear support vector machines (SVMs) and restricted Boltzmann machines (RBMs), whose main parameters are stored in a vector and a matrix, respectively. To tensorize these two models, the weight vector and matrix are represented as low-rank tensor networks, which dramatically reduces the number of parameters and thus alleviates overfitting when the training sample size is small. Second, since most real-life data are not linearly separable, the thesis further investigates kernel tricks for the tensorial extension of linear SVMs. The introduced kernel trick is designed for data in tensor train format (a kind of tensor network) and makes it possible to apply different kernel functions to different data modes. Third, the compression of sum-product networks (SPNs) is investigated. SPNs constitute an emerging class of neural networks with clear probabilistic semantics and superior inference speed compared with other graphical models. The thesis reveals an important connection between SPNs and tensor trains: transforming an SPN into a tensor train allows the sharing of weights that are originally distributed across the SPN tree, often leading to a dramatic reduction in the number of network parameters with little or negligible loss of modeling accuracy. Fourth, the thesis proposes LiteGT, which aims to reduce the computation and storage complexity of the vanilla graph transformer model. Experiments demonstrate that LiteGT reduces computation by more than 100× and halves the model size without performance degradation.
Degree: Doctor of Philosophy
Subjects: Tensor products; Machine learning
Dept/Program: Electrical and Electronic Engineering
Persistent Identifier: http://hdl.handle.net/10722/308618
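The abstract's central claim, that a weight vector represented as a low-rank tensor train needs far fewer parameters than its dense counterpart, can be illustrated with a minimal NumPy sketch. This is not code from the thesis: the helper names (tt_svd, tt_contract), the mode sizes, and the rank are illustrative assumptions, and the tensorized SVM/RBM models described above would typically learn the tensor-train cores directly rather than decompose a given dense vector.

```python
import numpy as np

def tt_svd(vector, dims, max_rank):
    # Illustrative TT-SVD: view a length-prod(dims) vector as a d-way tensor and
    # split it into tensor-train cores by successive truncated SVDs.
    cores, rank = [], 1
    mat = np.asarray(vector).reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(rank, dims[k], r))
        mat = (s[:r, None] * vt[:r]).reshape(r * dims[k + 1], -1)
        rank = r
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

def tt_contract(cores):
    # Contract the cores back into the full (dense) vector.
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape(-1)

# Hypothetical weight vector of length 4**5 = 1024, built to have TT-rank 3.
rng = np.random.default_rng(0)
dims, r = [4] * 5, 3
true_cores = [rng.standard_normal((1 if i == 0 else r, 4, r if i < 4 else 1))
              for i in range(5)]
w = tt_contract(true_cores)

cores = tt_svd(w, dims, max_rank=r)
print(sum(c.size for c in cores), "TT parameters vs", w.size, "dense parameters")  # 132 vs 1024
print(np.allclose(tt_contract(cores), w))  # True: exact for this low-rank vector
```

For this toy vector the five cores hold 132 numbers instead of 1024, and the reconstruction is exact because the vector was constructed to have low TT-rank; for general data the truncation rank trades accuracy for compression.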

 

DC Field | Value | Language
dc.contributor.advisor | Wong, N | -
dc.contributor.author | Chen, Cong | -
dc.contributor.author | 陳琮 | -
dc.date.accessioned | 2021-12-06T01:04:00Z | -
dc.date.available | 2021-12-06T01:04:00Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Chen, C. [陳琮]. (2021). Exploiting tensor networks for efficient machine learning. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/308618 | -
dc.description.abstract | Many real-world data appear naturally in matrix or tensor format. For example, a grayscale picture is a 2-way tensor (i.e., a matrix), a color image or a grayscale video is naturally a 3-way tensor, and a color video can be regarded as a 4-way tensor, and so on. However, most conventional machine learning algorithms are vector-based and cannot handle tensorial data directly. The most common workaround is to vectorize the tensorial data, which leads to high-dimensional vectors. Such an operation discards structural information that has been shown to be useful in many machine learning tasks. Therefore, extending vector-based machine learning methods to tensorial formats is highly desirable. On the other hand, the need for on-device machine learning arises when decisions based on data processing must be made immediately. In this case, the computing task must be accomplished with limited resources, such as computing time, storage space, and battery power. It is therefore preferable to compress a machine learning model before deploying it on edge devices. In summary, to achieve efficient machine learning, this thesis explores the tensorization and compression of machine learning models. Specifically, the thesis first considers two typical vector-based machine learning models, namely linear support vector machines (SVMs) and restricted Boltzmann machines (RBMs), whose main parameters are stored in a vector and a matrix, respectively. To tensorize these two models, the weight vector and matrix are represented as low-rank tensor networks, which dramatically reduces the number of parameters and thus alleviates overfitting when the training sample size is small. Second, since most real-life data are not linearly separable, the thesis further investigates kernel tricks for the tensorial extension of linear SVMs. The introduced kernel trick is designed for data in tensor train format (a kind of tensor network) and makes it possible to apply different kernel functions to different data modes. Third, the compression of sum-product networks (SPNs) is investigated. SPNs constitute an emerging class of neural networks with clear probabilistic semantics and superior inference speed compared with other graphical models. The thesis reveals an important connection between SPNs and tensor trains: transforming an SPN into a tensor train allows the sharing of weights that are originally distributed across the SPN tree, often leading to a dramatic reduction in the number of network parameters with little or negligible loss of modeling accuracy. Fourth, the thesis proposes LiteGT, which aims to reduce the computation and storage complexity of the vanilla graph transformer model. Experiments demonstrate that LiteGT reduces computation by more than 100× and halves the model size without performance degradation. | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Tensor products | -
dc.subject.lcsh | Machine learning | -
dc.title | Exploiting tensor networks for efficient machine learning | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Electrical and Electronic Engineering | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2021 | -
dc.identifier.mmsid | 991044448910503414 | -
