File Download
  Links for fulltext
     (May Require Subscription)
Supplementary

postgraduate thesis: Algorithms for evolving graph analysis

TitleAlgorithms for evolving graph analysis
Authors
Advisors
Issue Date2014
PublisherThe University of Hong Kong (Pokfulam, Hong Kong)
Citation
Ren, C. [任成會]. (2014). Algorithms for evolving graph analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5185948
AbstractIn many applications, entities and their relationships are represented by graphs. Examples include social networks (users and friendship), the WWW (web pages and hyperlinks) and bibliographic networks (authors and co-authorship). In a dynamic world, information changes and so the graphs representing the information evolve with time. For example, a Facebook link between two friends is established, or a hyperlink is added to a web page. We propose that historical graph-structured data be archived for analytical processing. We call a historical evolving graph sequence an EGS. We study the problem of efficient query processing on an EGS, which finds many applications that lead to interesting evolving graph analysis. To solve the problem, we propose a solution framework called FVF and a cluster-based LU decomposition algorithm called CLUDE, which can evaluate queries efficiently to support EGS analysis. The Find-Verify-and-Fix (FVF) framework applies to a wide range of queries. We demonstrate how some important graph measures, including shortest-path distance, closeness centrality and graph centrality, can be efficiently computed from EGSs using FVF. Since an EGS generally contains numerous large graphs, we also discuss several compact storage models that support our FVF framework. Through extensive experiments on both real and synthetic datasets, we show that our FVF framework is highly efficient in EGS query processing. A graph can be conveniently modeled by a matrix from which various quantitative measures are derived like PageRank and SALSA and Personalized PageRank and Random Walk with Restart. To compute these measures, linear systems of the form Ax = b, where A is a matrix that captures a graph's structure, need to be solved. To facilitate solving the linear system, the matrix A is often decomposed into two triangular matrices (L and U). In a dynamic world, the graph that models it changes with time and thus is the matrix A that represents the graph. We consider a sequence of evolving graphs and its associated sequence of evolving matrices. We study how LU-decomposition should be done over the sequence so that (1) the decomposition is efficient and (2) the resulting LU matrices best preserve the sparsity of the matrices A's (i.e., the number of extra non-zero entries introduced in L and U are minimized). We propose a cluster-based algorithm CLUDE for solving the problem. Through an experimental study, we show that CLUDE is about an order of magnitude faster than the traditional incremental update algorithm. The number of extra non-zero entries introduced by CLUDE is also about an order of magnitude fewer than that of the traditional algorithm. CLUDE is thus an efficient algorithm for LU decomposition that produces high-quality LU matrices over an evolving matrix sequence.
DegreeDoctor of Philosophy
SubjectGraph theory - Data processing
Dept/ProgramComputer Science
Persistent Identifierhttp://hdl.handle.net/10722/197105

 

DC FieldValueLanguage
dc.contributor.advisorKao, CM-
dc.contributor.advisorCheung, DWL-
dc.contributor.authorRen, Chenghui-
dc.contributor.author任成會-
dc.date.accessioned2014-05-07T23:15:27Z-
dc.date.available2014-05-07T23:15:27Z-
dc.date.issued2014-
dc.identifier.citationRen, C. [任成會]. (2014). Algorithms for evolving graph analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5185948-
dc.identifier.urihttp://hdl.handle.net/10722/197105-
dc.description.abstractIn many applications, entities and their relationships are represented by graphs. Examples include social networks (users and friendship), the WWW (web pages and hyperlinks) and bibliographic networks (authors and co-authorship). In a dynamic world, information changes and so the graphs representing the information evolve with time. For example, a Facebook link between two friends is established, or a hyperlink is added to a web page. We propose that historical graph-structured data be archived for analytical processing. We call a historical evolving graph sequence an EGS. We study the problem of efficient query processing on an EGS, which finds many applications that lead to interesting evolving graph analysis. To solve the problem, we propose a solution framework called FVF and a cluster-based LU decomposition algorithm called CLUDE, which can evaluate queries efficiently to support EGS analysis. The Find-Verify-and-Fix (FVF) framework applies to a wide range of queries. We demonstrate how some important graph measures, including shortest-path distance, closeness centrality and graph centrality, can be efficiently computed from EGSs using FVF. Since an EGS generally contains numerous large graphs, we also discuss several compact storage models that support our FVF framework. Through extensive experiments on both real and synthetic datasets, we show that our FVF framework is highly efficient in EGS query processing. A graph can be conveniently modeled by a matrix from which various quantitative measures are derived like PageRank and SALSA and Personalized PageRank and Random Walk with Restart. To compute these measures, linear systems of the form Ax = b, where A is a matrix that captures a graph's structure, need to be solved. To facilitate solving the linear system, the matrix A is often decomposed into two triangular matrices (L and U). In a dynamic world, the graph that models it changes with time and thus is the matrix A that represents the graph. We consider a sequence of evolving graphs and its associated sequence of evolving matrices. We study how LU-decomposition should be done over the sequence so that (1) the decomposition is efficient and (2) the resulting LU matrices best preserve the sparsity of the matrices A's (i.e., the number of extra non-zero entries introduced in L and U are minimized). We propose a cluster-based algorithm CLUDE for solving the problem. Through an experimental study, we show that CLUDE is about an order of magnitude faster than the traditional incremental update algorithm. The number of extra non-zero entries introduced by CLUDE is also about an order of magnitude fewer than that of the traditional algorithm. CLUDE is thus an efficient algorithm for LU decomposition that produces high-quality LU matrices over an evolving matrix sequence.-
dc.languageeng-
dc.publisherThe University of Hong Kong (Pokfulam, Hong Kong)-
dc.relation.ispartofHKU Theses Online (HKUTO)-
dc.rightsCreative Commons: Attribution 3.0 Hong Kong License-
dc.rightsThe author retains all proprietary rights, (such as patent rights) and the right to use in future works.-
dc.subject.lcshGraph theory - Data processing-
dc.titleAlgorithms for evolving graph analysis-
dc.typePG_Thesis-
dc.identifier.hkulb5185948-
dc.description.thesisnameDoctor of Philosophy-
dc.description.thesislevelDoctoral-
dc.description.thesisdisciplineComputer Science-
dc.description.naturepublished_or_final_version-
dc.identifier.doi10.5353/th_b5185948-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats