
Postgraduate thesis: Probabilistic modeling for multi-dimensional data with automatic rank estimation

Title: Probabilistic modeling for multi-dimensional data with automatic rank estimation
Authors: Xu, Le (徐樂)
Advisors: Wu, YC; Wong, N
Issue Date: 2023
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Xu, L. [徐樂]. (2023). Probabilistic modeling for multi-dimensional data with automatic rank estimation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: In the era of big data, most data come in multi-dimensional formats, commonly known as tensors. To make full use of these data, their underlying structures must be carefully explored and modeled. Among the various tensor formats, the piece-wise matrix product and the tensor train (TT) offer particular benefits for analyzing multi-dimensional data: the former corresponds exactly to the virtual dual-wideband channel model, while the latter has achieved tremendous success in machine learning tasks, including (but not limited to) visual data processing. In both formats, sparsity (or low-rankness), meaning that the number of essential components is small, plays an important role. With a well-defined sparsity level, methods based on these tensor formats can avoid overfitting the noise. However, the sparsity level, which in turn depends on the noise power, is generally unknown. To tackle this problem, both tensor formats are treated from a probabilistic perspective that includes automatic rank determination and noise power estimation. In particular, this thesis presents three research works that leverage the power of probabilistic modeling in processing multi-dimensional data. The first work focuses on dual-wideband channel estimation, in which the channel in each sub-band can be formulated as a matrix product. By exploiting the common sparsity in the virtual channel model, a probabilistic model is built for the virtual augmented path losses, so that the components corresponding to the real channel information can be selected. The second work solves the TT completion problem from a Bayesian perspective by assigning a Gaussian-product-Gamma prior to each TT core element. Theoretical justification is provided for adopting such a prior to induce sparsity on the slices of the TT cores, thus allowing the model complexity to be determined automatically. Furthermore, an effective learning algorithm for the probabilistic model parameters is derived within the variational inference framework. The third work emphasizes visual data completion with TT. To preserve the local information of the original visual data, a graph-based regularization is introduced into the TT completion problem. Besides an optimization-based method, and to avoid heavy parameter tuning, a sparsity-promoting probabilistic model is built on the generalized inverse Gaussian (GIG) prior, and an inference algorithm is derived under the mean-field approximation. The superior performance of all the proposed methods is demonstrated by extensive experiments.
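As context for the abstract above, the following is a minimal sketch of the tensor-train representation and of a Gaussian-product-Gamma-style sparsity prior on the TT cores. The notation (cores G^(d), TT ranks R_d, precision variables λ) and the hyperparameters α, β are illustrative assumptions and may differ from the exact formulation used in the thesis.

```latex
% Tensor-train (TT) format for an order-D tensor
% X in R^{I_1 x ... x I_D}: every entry is a product of matrix slices
% taken from the cores G^{(d)}, with boundary TT ranks R_0 = R_D = 1.
\[
  \mathcal{X}(i_1, i_2, \ldots, i_D)
    = \mathbf{G}^{(1)}(i_1)\, \mathbf{G}^{(2)}(i_2) \cdots \mathbf{G}^{(D)}(i_D),
  \qquad
  \mathbf{G}^{(d)}(i_d) \in \mathbb{R}^{R_{d-1} \times R_d}.
\]
% A Gaussian-product-Gamma-style prior (sketch, not the thesis's exact model):
% each core element is zero-mean Gaussian whose precision is the product of
% two Gamma-distributed variables attached to its two rank indices; when a
% \lambda grows large, the corresponding slices in adjacent cores are pruned,
% which lowers the effective TT rank automatically.
\[
  g^{(d)}_{r_{d-1},\, i_d,\, r_d}
    \sim \mathcal{N}\!\Bigl(0,\ \bigl(\lambda^{(d-1)}_{r_{d-1}}\, \lambda^{(d)}_{r_d}\bigr)^{-1}\Bigr),
  \qquad
  \lambda^{(d)}_{r_d} \sim \operatorname{Gamma}(\alpha, \beta).
\]
```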
Degree: Doctor of Philosophy
Subject: Tensor products
Dept/Program: Electrical and Electronic Engineering
Persistent Identifier: http://hdl.handle.net/10722/328933

 

DC Field | Value | Language
dc.contributor.advisor | Wu, YC | -
dc.contributor.advisor | Wong, N | -
dc.contributor.author | Xu, Le | -
dc.contributor.author | 徐樂 | -
dc.date.accessioned | 2023-08-01T06:48:24Z | -
dc.date.available | 2023-08-01T06:48:24Z | -
dc.date.issued | 2023 | -
dc.identifier.citation | Xu, L. [徐樂]. (2023). Probabilistic modeling for multi-dimensional data with automatic rank estimation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/328933 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Tensor products | -
dc.title | Probabilistic modeling for multi-dimensional data with automatic rank estimation | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Electrical and Electronic Engineering | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2023 | -
dc.identifier.mmsid | 991044705905603414 | -
