File Download
Supplementary
-
Citations:
- Appears in Collections:
postgraduate thesis: Effective and efficient discovery of top-k meta path in heterogeneous information networks
Title | Effective and efficient discovery of top-k meta path in heterogeneous information networks |
---|---|
Authors | |
Advisors | Advisor(s):Cheng, CK |
Issue Date | 2019 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Zhu, Z. [朱子晨]. (2019). Effective and efficient discovery of top-k meta path in heterogeneous information networks. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | \textit{Heterogeneous information networks (HINs)}, which are typed graphs with labeled nodes and edges, have attracted tremendous interest from academia and industry. Given two HIN nodes $s$ and $t$, and a natural number $k$, we study the discovery of the $k$ most important paths in real time. The returned paths can be used to support friend search, product recommendation, anomaly detection, and graph clustering. Although related algorithms have been proposed before, they were primarily designed to return the $k$ shortest paths from unlabeled graphs. This leads to two problems: (1) there are often many shortest paths between $s$ and $t$, and so it is not easy to choose the $k$ best ones; and (2) it is arguable whether a shorter path implies a more crucial one. To address these issues, we study the {\it top-$k$ meta path query} for an HIN. A meta path is essentially a sequence of node and edge types between $s$ and $t$ (e.g., {\it Author}$\xrightarrow{write}${\it Paper}$\xrightarrow{written-by}${\it Author} denotes the coauthor relationship of $s$ and $t$). A meta path abstracts multiple path instances into a high-level path pattern, thereby giving more insight between two nodes. We further study several ranking functions that evaluate the \textit{importance} of meta paths based on \textit{frequency} and \textit{rarity}, rather than on path length. Also, we propose a solution that seamlessly integrates these functions into a multi-step searching framework. In addition, we combine bidirectional searching algorithm with this framework to further boost up the efficiency performance. The experiment on different datasets shows that the proposed method outperforms state-of-the-art algorithms in terms of effectiveness with reasonable response time. |
Degree | Master of Philosophy |
Subject | Information networks |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/278449 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Cheng, CK | - |
dc.contributor.author | Zhu, Zichen | - |
dc.contributor.author | 朱子晨 | - |
dc.date.accessioned | 2019-10-09T01:17:46Z | - |
dc.date.available | 2019-10-09T01:17:46Z | - |
dc.date.issued | 2019 | - |
dc.identifier.citation | Zhu, Z. [朱子晨]. (2019). Effective and efficient discovery of top-k meta path in heterogeneous information networks. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/278449 | - |
dc.description.abstract | \textit{Heterogeneous information networks (HINs)}, which are typed graphs with labeled nodes and edges, have attracted tremendous interest from academia and industry. Given two HIN nodes $s$ and $t$, and a natural number $k$, we study the discovery of the $k$ most important paths in real time. The returned paths can be used to support friend search, product recommendation, anomaly detection, and graph clustering. Although related algorithms have been proposed before, they were primarily designed to return the $k$ shortest paths from unlabeled graphs. This leads to two problems: (1) there are often many shortest paths between $s$ and $t$, and so it is not easy to choose the $k$ best ones; and (2) it is arguable whether a shorter path implies a more crucial one. To address these issues, we study the {\it top-$k$ meta path query} for an HIN. A meta path is essentially a sequence of node and edge types between $s$ and $t$ (e.g., {\it Author}$\xrightarrow{write}${\it Paper}$\xrightarrow{written-by}${\it Author} denotes the coauthor relationship of $s$ and $t$). A meta path abstracts multiple path instances into a high-level path pattern, thereby giving more insight between two nodes. We further study several ranking functions that evaluate the \textit{importance} of meta paths based on \textit{frequency} and \textit{rarity}, rather than on path length. Also, we propose a solution that seamlessly integrates these functions into a multi-step searching framework. In addition, we combine bidirectional searching algorithm with this framework to further boost up the efficiency performance. The experiment on different datasets shows that the proposed method outperforms state-of-the-art algorithms in terms of effectiveness with reasonable response time. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Information networks | - |
dc.title | Effective and efficient discovery of top-k meta path in heterogeneous information networks | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_991044146578203414 | - |
dc.date.hkucongregation | 2019 | - |
dc.identifier.mmsid | 991044146578203414 | - |