File Download
Supplementary
-
Citations:
- Appears in Collections:
postgraduate thesis: Graph partitioning problems through spectral analysis and heuristics
Title | Graph partitioning problems through spectral analysis and heuristics |
---|---|
Authors | |
Advisors | Advisor(s):Chan, HTH |
Issue Date | 2017 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Zhang, C. [张晨孜]. (2017). Graph partitioning problems through spectral analysis and heuristics. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | In this work, we study two graph partitioning problems that aim at dividing a given graph into balanced components and minimizing vertex or edge cut. To solve the edge expansion problem, we explore spectral analysis for directed graphs and hypergraphs. For edge partitioning problem, we derive a heuristic that outperforms existing methods on real-world graphs.
We consider Cheeger Inequalities for general edge-weighted directed graphs. Previously the directed case was considered by Chung for a probability transition matrix corresponding to a strongly connected graph with weights induced by a stationary distribution. We extend Chung's approach to general directed graphs for the scenarios that the underlying graph is not strongly connected or the weights deviate (slightly) from a stationary distribution.
We consider the existence and uniqueness of diffusion process for hypergraphs. There has been recent work [Louis STOC 2015] to analyze the spectral properties of hypergraphs using a diffusion process such that within each edge, measure flows from vertices having maximum weighted measure to those having minimum. We show that the measure flow between the above two sets of vertices must be coordinated carefully over different hyperedges in order for the diffusion process to be well-defined, from which a Laplacian can be uniquely determined. Since the Laplacian is nonlinear, we also exploit other properties of the diffusion process to recover a spectral property concerning the ``second eigenvalue'' of the resulting Laplacian.
We consider the edge partitioning problem that partitions the edges of an input graph into multiple balanced components, while minimizing the total number of vertices replicated (a vertex might appear in more than one partition). This problem is crucial in minimizing communication costs and running time for several large-scale distributed graph computation platforms (e.g., PowerGraph, Spark GraphX). We present a new partitioning heuristic and prove a worst-case upper bound of replication factor on general graphs. Extensive experiments demonstrated that our partitioning algorithm consistently produces much smaller replication factors on various benchmark data sets than the state-of-the-art. |
Degree | Doctor of Philosophy |
Subject | Graph theory - Data processing Heuristic algorithms |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/261443 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Chan, HTH | - |
dc.contributor.author | Zhang, Chenzi | - |
dc.contributor.author | 张晨孜 | - |
dc.date.accessioned | 2018-09-20T06:43:41Z | - |
dc.date.available | 2018-09-20T06:43:41Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | Zhang, C. [张晨孜]. (2017). Graph partitioning problems through spectral analysis and heuristics. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/261443 | - |
dc.description.abstract | In this work, we study two graph partitioning problems that aim at dividing a given graph into balanced components and minimizing vertex or edge cut. To solve the edge expansion problem, we explore spectral analysis for directed graphs and hypergraphs. For edge partitioning problem, we derive a heuristic that outperforms existing methods on real-world graphs. We consider Cheeger Inequalities for general edge-weighted directed graphs. Previously the directed case was considered by Chung for a probability transition matrix corresponding to a strongly connected graph with weights induced by a stationary distribution. We extend Chung's approach to general directed graphs for the scenarios that the underlying graph is not strongly connected or the weights deviate (slightly) from a stationary distribution. We consider the existence and uniqueness of diffusion process for hypergraphs. There has been recent work [Louis STOC 2015] to analyze the spectral properties of hypergraphs using a diffusion process such that within each edge, measure flows from vertices having maximum weighted measure to those having minimum. We show that the measure flow between the above two sets of vertices must be coordinated carefully over different hyperedges in order for the diffusion process to be well-defined, from which a Laplacian can be uniquely determined. Since the Laplacian is nonlinear, we also exploit other properties of the diffusion process to recover a spectral property concerning the ``second eigenvalue'' of the resulting Laplacian. We consider the edge partitioning problem that partitions the edges of an input graph into multiple balanced components, while minimizing the total number of vertices replicated (a vertex might appear in more than one partition). This problem is crucial in minimizing communication costs and running time for several large-scale distributed graph computation platforms (e.g., PowerGraph, Spark GraphX). We present a new partitioning heuristic and prove a worst-case upper bound of replication factor on general graphs. Extensive experiments demonstrated that our partitioning algorithm consistently produces much smaller replication factors on various benchmark data sets than the state-of-the-art. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Graph theory - Data processing | - |
dc.subject.lcsh | Heuristic algorithms | - |
dc.title | Graph partitioning problems through spectral analysis and heuristics | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_991044040580003414 | - |
dc.date.hkucongregation | 2018 | - |
dc.identifier.mmsid | 991044040580003414 | - |