File Download
  Links for fulltext
     (May Require Subscription)
Supplementary

postgraduate thesis: Evaluating multi-way joins over discounted hitting time

TitleEvaluating multi-way joins over discounted hitting time
Authors
Advisors
Advisor(s):Kao, CMCheng, CK
Issue Date2013
PublisherThe University of Hong Kong (Pokfulam, Hong Kong)
Citation
Zhang, W. [张望达]. (2013). Evaluating multi-way joins over discounted hitting time. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5177353
AbstractThe prevalence of graphs in emerging applications has recently raised a lot of research interests. To acquire interesting information hidden in large graphs, tasks including link prediction, collaborative recommendation, and reputation ranking, all make use of proximities between graph nodes. The discounted hitting time (DHT), which is a random-walk similarity measure for graph node pairs, has shown to be useful in various applications. In this thesis, we examine a novel query, called the multi-way join (or n-way join), over DHT scores. Given a graph and n sets of nodes, the n-way join retrieves a ranked list of n-tuples with the k highest scores, according to some aggregation function of DHT values. By extracting such top-k results, this query enables the analysis and prediction of various complex relationships among n sets of nodes on a large graph. Since an n-way join is expensive to evaluate, we develop the Partial Join algorithm (or PJ). This solution decomposes an n-way join into a number of top-m 2-way joins, and combines their results to construct the answer of the n-way join. Since the process of PJ may necessitate the computation of top-(m + 1) 2-way joins, we study an incremental solution, which saves the trouble of recomputation and allows the results of top-(m+1) 2-way join to be derived quickly from the top-m 2-way join results earlier computed. For better performance, we further examine efficient processing algorithms and pruning techniques for 2-way joins. Through extensive experiments on three real graph datasets, we show that the proposed PJ algorithm accurately evaluates n-way joins, and is four orders of magnitude faster than basic solutions.
DegreeMaster of Philosophy
SubjectQuery languages (Computer science)
Graph algorithms
Dept/ProgramComputer Science
Persistent Identifierhttp://hdl.handle.net/10722/196484
HKU Library Item IDb5177353

 

DC FieldValueLanguage
dc.contributor.advisorKao, CM-
dc.contributor.advisorCheng, CK-
dc.contributor.authorZhang, Wangda-
dc.contributor.author张望达-
dc.date.accessioned2014-04-11T23:14:30Z-
dc.date.available2014-04-11T23:14:30Z-
dc.date.issued2013-
dc.identifier.citationZhang, W. [张望达]. (2013). Evaluating multi-way joins over discounted hitting time. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5177353-
dc.identifier.urihttp://hdl.handle.net/10722/196484-
dc.description.abstractThe prevalence of graphs in emerging applications has recently raised a lot of research interests. To acquire interesting information hidden in large graphs, tasks including link prediction, collaborative recommendation, and reputation ranking, all make use of proximities between graph nodes. The discounted hitting time (DHT), which is a random-walk similarity measure for graph node pairs, has shown to be useful in various applications. In this thesis, we examine a novel query, called the multi-way join (or n-way join), over DHT scores. Given a graph and n sets of nodes, the n-way join retrieves a ranked list of n-tuples with the k highest scores, according to some aggregation function of DHT values. By extracting such top-k results, this query enables the analysis and prediction of various complex relationships among n sets of nodes on a large graph. Since an n-way join is expensive to evaluate, we develop the Partial Join algorithm (or PJ). This solution decomposes an n-way join into a number of top-m 2-way joins, and combines their results to construct the answer of the n-way join. Since the process of PJ may necessitate the computation of top-(m + 1) 2-way joins, we study an incremental solution, which saves the trouble of recomputation and allows the results of top-(m+1) 2-way join to be derived quickly from the top-m 2-way join results earlier computed. For better performance, we further examine efficient processing algorithms and pruning techniques for 2-way joins. Through extensive experiments on three real graph datasets, we show that the proposed PJ algorithm accurately evaluates n-way joins, and is four orders of magnitude faster than basic solutions.-
dc.languageeng-
dc.publisherThe University of Hong Kong (Pokfulam, Hong Kong)-
dc.relation.ispartofHKU Theses Online (HKUTO)-
dc.rightsThe author retains all proprietary rights, (such as patent rights) and the right to use in future works.-
dc.rightsThis work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.-
dc.subject.lcshQuery languages (Computer science)-
dc.subject.lcshGraph algorithms-
dc.titleEvaluating multi-way joins over discounted hitting time-
dc.typePG_Thesis-
dc.identifier.hkulb5177353-
dc.description.thesisnameMaster of Philosophy-
dc.description.thesislevelMaster-
dc.description.thesisdisciplineComputer Science-
dc.description.naturepublished_or_final_version-
dc.identifier.doi10.5353/th_b5177353-
dc.identifier.mmsid991036763379703414-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats