Algorithms for evolving graph analysis

Ren, Chenghui; 任成會

File Download

FullText.pdf

Links for fulltext

(May Require Subscription)

DOI: 10.5353/th_b5185948

Supplementary

Citations:
Appears in Collections:
- Computer Science: Theses
- HKU Theses Online

postgraduate thesis: Algorithms for evolving graph analysis

Title	Algorithms for evolving graph analysis
Authors	Ren, Chenghui 任成會
Advisors	Advisor(s):Kao, CM Cheung, DWL
Issue Date	2014
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Ren, C. [任成會]. (2014). Algorithms for evolving graph analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5185948
Abstract	In many applications, entities and their relationships are represented by graphs. Examples include social networks (users and friendship), the WWW (web pages and hyperlinks) and bibliographic networks (authors and co-authorship). In a dynamic world, information changes and so the graphs representing the information evolve with time. For example, a Facebook link between two friends is established, or a hyperlink is added to a web page. We propose that historical graph-structured data be archived for analytical processing. We call a historical evolving graph sequence an EGS. We study the problem of efficient query processing on an EGS, which finds many applications that lead to interesting evolving graph analysis. To solve the problem, we propose a solution framework called FVF and a cluster-based LU decomposition algorithm called CLUDE, which can evaluate queries efficiently to support EGS analysis. The Find-Verify-and-Fix (FVF) framework applies to a wide range of queries. We demonstrate how some important graph measures, including shortest-path distance, closeness centrality and graph centrality, can be efficiently computed from EGSs using FVF. Since an EGS generally contains numerous large graphs, we also discuss several compact storage models that support our FVF framework. Through extensive experiments on both real and synthetic datasets, we show that our FVF framework is highly efficient in EGS query processing. A graph can be conveniently modeled by a matrix from which various quantitative measures are derived like PageRank and SALSA and Personalized PageRank and Random Walk with Restart. To compute these measures, linear systems of the form Ax = b, where A is a matrix that captures a graph's structure, need to be solved. To facilitate solving the linear system, the matrix A is often decomposed into two triangular matrices (L and U). In a dynamic world, the graph that models it changes with time and thus is the matrix A that represents the graph. We consider a sequence of evolving graphs and its associated sequence of evolving matrices. We study how LU-decomposition should be done over the sequence so that (1) the decomposition is efficient and (2) the resulting LU matrices best preserve the sparsity of the matrices A's (i.e., the number of extra non-zero entries introduced in L and U are minimized). We propose a cluster-based algorithm CLUDE for solving the problem. Through an experimental study, we show that CLUDE is about an order of magnitude faster than the traditional incremental update algorithm. The number of extra non-zero entries introduced by CLUDE is also about an order of magnitude fewer than that of the traditional algorithm. CLUDE is thus an efficient algorithm for LU decomposition that produces high-quality LU matrices over an evolving matrix sequence.
Degree	Doctor of Philosophy
Subject	Graph theory - Data processing
Dept/Program	Computer Science
Persistent Identifier	http://hdl.handle.net/10722/197105
HKU Library Item ID	b5185948

DC Field	Value	Language
dc.contributor.advisor	Kao, CM	-
dc.contributor.advisor	Cheung, DWL	-
dc.contributor.author	Ren, Chenghui	-
dc.contributor.author	任成會	-
dc.date.accessioned	2014-05-07T23:15:27Z	-
dc.date.available	2014-05-07T23:15:27Z	-
dc.date.issued	2014	-
dc.identifier.citation	Ren, C. [任成會]. (2014). Algorithms for evolving graph analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5185948	-
dc.identifier.uri	http://hdl.handle.net/10722/197105	-
dc.description.abstract	In many applications, entities and their relationships are represented by graphs. Examples include social networks (users and friendship), the WWW (web pages and hyperlinks) and bibliographic networks (authors and co-authorship). In a dynamic world, information changes and so the graphs representing the information evolve with time. For example, a Facebook link between two friends is established, or a hyperlink is added to a web page. We propose that historical graph-structured data be archived for analytical processing. We call a historical evolving graph sequence an EGS. We study the problem of efficient query processing on an EGS, which finds many applications that lead to interesting evolving graph analysis. To solve the problem, we propose a solution framework called FVF and a cluster-based LU decomposition algorithm called CLUDE, which can evaluate queries efficiently to support EGS analysis. The Find-Verify-and-Fix (FVF) framework applies to a wide range of queries. We demonstrate how some important graph measures, including shortest-path distance, closeness centrality and graph centrality, can be efficiently computed from EGSs using FVF. Since an EGS generally contains numerous large graphs, we also discuss several compact storage models that support our FVF framework. Through extensive experiments on both real and synthetic datasets, we show that our FVF framework is highly efficient in EGS query processing. A graph can be conveniently modeled by a matrix from which various quantitative measures are derived like PageRank and SALSA and Personalized PageRank and Random Walk with Restart. To compute these measures, linear systems of the form Ax = b, where A is a matrix that captures a graph's structure, need to be solved. To facilitate solving the linear system, the matrix A is often decomposed into two triangular matrices (L and U). In a dynamic world, the graph that models it changes with time and thus is the matrix A that represents the graph. We consider a sequence of evolving graphs and its associated sequence of evolving matrices. We study how LU-decomposition should be done over the sequence so that (1) the decomposition is efficient and (2) the resulting LU matrices best preserve the sparsity of the matrices A's (i.e., the number of extra non-zero entries introduced in L and U are minimized). We propose a cluster-based algorithm CLUDE for solving the problem. Through an experimental study, we show that CLUDE is about an order of magnitude faster than the traditional incremental update algorithm. The number of extra non-zero entries introduced by CLUDE is also about an order of magnitude fewer than that of the traditional algorithm. CLUDE is thus an efficient algorithm for LU decomposition that produces high-quality LU matrices over an evolving matrix sequence.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.subject.lcsh	Graph theory - Data processing	-
dc.title	Algorithms for evolving graph analysis	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b5185948	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Computer Science	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b5185948	-
dc.identifier.mmsid	991036819499703414	-

File Download

Links for fulltext

(May Require Subscription)

Supplementary

postgraduate thesis: Algorithms for evolving graph analysis

Export via OAI-PMH Interface in XML Formats

OR

Export to Other Non-XML Formats