File Download
  Links for fulltext
     (May Require Subscription)
Supplementary

postgraduate thesis: Graph partitioning problems through spectral analysis and heuristics

TitleGraph partitioning problems through spectral analysis and heuristics
Authors
Advisors
Advisor(s):Chan, HTH
Issue Date2017
PublisherThe University of Hong Kong (Pokfulam, Hong Kong)
Citation
Zhang, C. [张晨孜]. (2017). Graph partitioning problems through spectral analysis and heuristics. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
AbstractIn this work, we study two graph partitioning problems that aim at dividing a given graph into balanced components and minimizing vertex or edge cut. To solve the edge expansion problem, we explore spectral analysis for directed graphs and hypergraphs. For edge partitioning problem, we derive a heuristic that outperforms existing methods on real-world graphs. We consider Cheeger Inequalities for general edge-weighted directed graphs. Previously the directed case was considered by Chung for a probability transition matrix corresponding to a strongly connected graph with weights induced by a stationary distribution. We extend Chung's approach to general directed graphs for the scenarios that the underlying graph is not strongly connected or the weights deviate (slightly) from a stationary distribution. We consider the existence and uniqueness of diffusion process for hypergraphs. There has been recent work [Louis STOC 2015] to analyze the spectral properties of hypergraphs using a diffusion process such that within each edge, measure flows from vertices having maximum weighted measure to those having minimum. We show that the measure flow between the above two sets of vertices must be coordinated carefully over different hyperedges in order for the diffusion process to be well-defined, from which a Laplacian can be uniquely determined. Since the Laplacian is nonlinear, we also exploit other properties of the diffusion process to recover a spectral property concerning the ``second eigenvalue'' of the resulting Laplacian. We consider the edge partitioning problem that partitions the edges of an input graph into multiple balanced components, while minimizing the total number of vertices replicated (a vertex might appear in more than one partition). This problem is crucial in minimizing communication costs and running time for several large-scale distributed graph computation platforms (e.g., PowerGraph, Spark GraphX). We present a new partitioning heuristic and prove a worst-case upper bound of replication factor on general graphs. Extensive experiments demonstrated that our partitioning algorithm consistently produces much smaller replication factors on various benchmark data sets than the state-of-the-art.
DegreeDoctor of Philosophy
SubjectGraph theory - Data processing
Heuristic algorithms
Dept/ProgramComputer Science
Persistent Identifierhttp://hdl.handle.net/10722/261443

 

DC FieldValueLanguage
dc.contributor.advisorChan, HTH-
dc.contributor.authorZhang, Chenzi-
dc.contributor.author张晨孜-
dc.date.accessioned2018-09-20T06:43:41Z-
dc.date.available2018-09-20T06:43:41Z-
dc.date.issued2017-
dc.identifier.citationZhang, C. [张晨孜]. (2017). Graph partitioning problems through spectral analysis and heuristics. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.-
dc.identifier.urihttp://hdl.handle.net/10722/261443-
dc.description.abstractIn this work, we study two graph partitioning problems that aim at dividing a given graph into balanced components and minimizing vertex or edge cut. To solve the edge expansion problem, we explore spectral analysis for directed graphs and hypergraphs. For edge partitioning problem, we derive a heuristic that outperforms existing methods on real-world graphs. We consider Cheeger Inequalities for general edge-weighted directed graphs. Previously the directed case was considered by Chung for a probability transition matrix corresponding to a strongly connected graph with weights induced by a stationary distribution. We extend Chung's approach to general directed graphs for the scenarios that the underlying graph is not strongly connected or the weights deviate (slightly) from a stationary distribution. We consider the existence and uniqueness of diffusion process for hypergraphs. There has been recent work [Louis STOC 2015] to analyze the spectral properties of hypergraphs using a diffusion process such that within each edge, measure flows from vertices having maximum weighted measure to those having minimum. We show that the measure flow between the above two sets of vertices must be coordinated carefully over different hyperedges in order for the diffusion process to be well-defined, from which a Laplacian can be uniquely determined. Since the Laplacian is nonlinear, we also exploit other properties of the diffusion process to recover a spectral property concerning the ``second eigenvalue'' of the resulting Laplacian. We consider the edge partitioning problem that partitions the edges of an input graph into multiple balanced components, while minimizing the total number of vertices replicated (a vertex might appear in more than one partition). This problem is crucial in minimizing communication costs and running time for several large-scale distributed graph computation platforms (e.g., PowerGraph, Spark GraphX). We present a new partitioning heuristic and prove a worst-case upper bound of replication factor on general graphs. Extensive experiments demonstrated that our partitioning algorithm consistently produces much smaller replication factors on various benchmark data sets than the state-of-the-art.-
dc.languageeng-
dc.publisherThe University of Hong Kong (Pokfulam, Hong Kong)-
dc.relation.ispartofHKU Theses Online (HKUTO)-
dc.rightsThe author retains all proprietary rights, (such as patent rights) and the right to use in future works.-
dc.rightsThis work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.-
dc.subject.lcshGraph theory - Data processing-
dc.subject.lcshHeuristic algorithms-
dc.titleGraph partitioning problems through spectral analysis and heuristics-
dc.typePG_Thesis-
dc.description.thesisnameDoctor of Philosophy-
dc.description.thesislevelDoctoral-
dc.description.thesisdisciplineComputer Science-
dc.description.naturepublished_or_final_version-
dc.identifier.doi10.5353/th_991044040580003414-
dc.date.hkucongregation2018-
dc.identifier.mmsid991044040580003414-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats