File Download
Supplementary
-
Citations:
- Appears in Collections:
postgraduate thesis: Effective and efficient community search over large attributed graphs
Title | Effective and efficient community search over large attributed graphs |
---|---|
Authors | |
Advisors | Advisor(s):Cheng, CK |
Issue Date | 2017 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Fang, Y. [方一向]. (2017). Effective and efficient community search over large attributed graphs. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Communities, which are prevalent in attributed graphs such as social networks and knowledge bases, can be used in emerging applications such as product advertisement and setting up of social events. Given a graph G and a vertex q ∈ G, the community search query returns a subgraph of G that contains vertices related to q. In this thesis, we study community search over three common attributed graphs, where (1) vertices are associated with keywords; (2) vertices are augmented with location information; and (3) edges are directed. Our goal is to study effective and efficient algorithms for finding communities over these graphs.
For keyword-based attributed graphs, we investigate the keyword-based attributed community query (or KAC query), which returns a KAC for a query vertex. A KAC satisfies both structure cohesiveness (i.e., its vertices are tightly connected) and keyword cohesiveness (i.e., its vertices share common keywords). The KAC query enables a better understanding of how and why a community is formed (e.g., members of a KAC have a common interest in music, because they all have the same keyword ``music''). To enable efficient KAC search, we develop the CL-tree index and three fast algorithms. We evaluate our solutions on six large graphs. Our results show that KAC search is more effective and efficient than existing community retrieval approaches. Moreover, a KAC contains more precise and personalized information than those of existing methods.
For spatial-based attributed graphs, we aim to find the spatial-aware community (or SAC), whose vertices are close structurally and spatially, for a query vertex in an online manner. To enable efficient SAC search, we develop two exact and three approximation algorithms. We evaluate SAC search on both large real and synthetic datasets. The experimental results show that SACs achieve higher effectiveness than communities returned by existing community retrieval solutions. Moreover, our solutions are faster than the baseline approaches.
For directed graphs, we study the problem of finding a community, in which each vertex has some in-neighbors and out-neighbors. We call this problem community search on directed graph (or CSD problem). Given a directed graph, our goal is to find a dense subgraph of a query vertex, in which each vertex has both high in-degree and out-degree. To solve the CSD problem, we propose the concept of D-core number and develop index-based algorithms. The experimental results on real large graphs show that our solutions are able to find better communities than existing solutions. Moreover, our algorithms are efficient and scalable.
We further develop the C-Explorer system to assist users in extracting, visualizing, and analyzing KACs. C-Explorer provides online and interactive community retrieval facilities, allowing a user to view her interesting communities for a query vertex. Moreover, C-Explorer implements several state-of-the-art community retrieval algorithms, as well as functions for analyzing their effectiveness. |
Degree | Doctor of Philosophy |
Subject | Graph algorithms Data mining |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/250721 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Cheng, CK | - |
dc.contributor.author | Fang, Yixiang | - |
dc.contributor.author | 方一向 | - |
dc.date.accessioned | 2018-01-26T01:59:22Z | - |
dc.date.available | 2018-01-26T01:59:22Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | Fang, Y. [方一向]. (2017). Effective and efficient community search over large attributed graphs. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/250721 | - |
dc.description.abstract | Communities, which are prevalent in attributed graphs such as social networks and knowledge bases, can be used in emerging applications such as product advertisement and setting up of social events. Given a graph G and a vertex q ∈ G, the community search query returns a subgraph of G that contains vertices related to q. In this thesis, we study community search over three common attributed graphs, where (1) vertices are associated with keywords; (2) vertices are augmented with location information; and (3) edges are directed. Our goal is to study effective and efficient algorithms for finding communities over these graphs. For keyword-based attributed graphs, we investigate the keyword-based attributed community query (or KAC query), which returns a KAC for a query vertex. A KAC satisfies both structure cohesiveness (i.e., its vertices are tightly connected) and keyword cohesiveness (i.e., its vertices share common keywords). The KAC query enables a better understanding of how and why a community is formed (e.g., members of a KAC have a common interest in music, because they all have the same keyword ``music''). To enable efficient KAC search, we develop the CL-tree index and three fast algorithms. We evaluate our solutions on six large graphs. Our results show that KAC search is more effective and efficient than existing community retrieval approaches. Moreover, a KAC contains more precise and personalized information than those of existing methods. For spatial-based attributed graphs, we aim to find the spatial-aware community (or SAC), whose vertices are close structurally and spatially, for a query vertex in an online manner. To enable efficient SAC search, we develop two exact and three approximation algorithms. We evaluate SAC search on both large real and synthetic datasets. The experimental results show that SACs achieve higher effectiveness than communities returned by existing community retrieval solutions. Moreover, our solutions are faster than the baseline approaches. For directed graphs, we study the problem of finding a community, in which each vertex has some in-neighbors and out-neighbors. We call this problem community search on directed graph (or CSD problem). Given a directed graph, our goal is to find a dense subgraph of a query vertex, in which each vertex has both high in-degree and out-degree. To solve the CSD problem, we propose the concept of D-core number and develop index-based algorithms. The experimental results on real large graphs show that our solutions are able to find better communities than existing solutions. Moreover, our algorithms are efficient and scalable. We further develop the C-Explorer system to assist users in extracting, visualizing, and analyzing KACs. C-Explorer provides online and interactive community retrieval facilities, allowing a user to view her interesting communities for a query vertex. Moreover, C-Explorer implements several state-of-the-art community retrieval algorithms, as well as functions for analyzing their effectiveness. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Graph algorithms | - |
dc.subject.lcsh | Data mining | - |
dc.title | Effective and efficient community search over large attributed graphs | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_991043979532303414 | - |
dc.date.hkucongregation | 2017 | - |
dc.identifier.mmsid | 991043979532303414 | - |