postgraduate thesis: Advanced ranking queries on composite data
Title | Advanced ranking queries on composite data |
---|---|
Authors | Qi, Shuyao (齊書堯) |
Issue Date | 2016 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Qi, S. [齊書堯]. (2016). Advanced ranking queries on composite data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Ranking and retrieving the best objects from a database based on a set of criteria is a fundamental problem that has received extensive research attention. With the rapid development of data science and engineering, modern data have become increasingly complex and composite, i.e., objects are routinely assigned multiple types of information. This thesis studies several advanced ranking queries over composite data. In particular, three novel ranking queries are investigated in detail.
First, we introduce and study the problem of top-k joins over complex data types. Top-k joins have been extensively studied in relational databases for the case where the join predicate is equality, and the proposed algorithms aim at minimizing the number of accesses from the inputs. However, when collections of complex data types (e.g., spatial or string datasets) are top-k joined, computational cost can easily become the bottleneck. In view of this, we propose a novel evaluation paradigm that minimizes the computational cost without compromising the access cost. The proposed paradigm is applied to top-k joins on spatial and string attributes, and an analysis is conducted on how to optimize the paradigm for each case. Finally, the proposal is evaluated by extensive experimentation on both real and synthetic data.
Next, the problem of point-based trajectory search is investigated. Trajectory data capture the traveling history of moving objects. With the vastly increased volume of trajectory collections, applications such as route recommendation and traveling behavior mining call for efficient trajectory retrieval. This thesis first studies distance-to-points trajectory search (DTS), which retrieves the top-k trajectories that pass as close as possible to a given set of query points. For this, the state of the art is advanced by a hybrid method combining existing approaches and an alternative yet more efficient spatial range-based approach. Second, the continuous counterpart of DTS is investigated, where the query is long-standing and the results need to be maintained whenever updates occur to the query and/or the data. Third, two practical variants of DTS, which take into account the temporal characteristics of the searched trajectories, are proposed and studied. Extensive experiments are conducted to evaluate the proposed algorithms.
Finally, the problem of location-aware keyword query suggestion (LKS) is proposed and studied. Keyword suggestion helps users to access relevant information without having to know how to precisely express their queries. Existing techniques consider solely the keyword proximity and neglect the spatial distance of a user to the retrieved results. However, the relevance of search results in many applications (e.g., location-based services) is known to be correlated with their spatial proximity to the query issuer. This thesis presents an LKS framework, where a weighted keyword-document graph is designed to capture both the semantic relevance between keyword queries and the spatial distance between the resulting documents and the user. The graph is browsed in a random-walk-with-restart fashion, and to make it scalable, we propose a partition-based approach which vastly outperforms the baseline. The appropriateness of the LKS framework and the performance of the algorithms are evaluated extensively using real data. |
Degree | Doctor of Philosophy |
Subject | Database management |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/235931 |
HKU Library Item ID | b5801656 |
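
As a rough illustration of the DTS query described in the abstract, the following minimal Python sketch ranks trajectories by how closely they pass a set of query points. It is a brute-force baseline, not the hybrid or range-based method of the thesis. It assumes (this is not stated in the record) that the distance of a trajectory to the query is the sum, over all query points, of the Euclidean distance to the nearest trajectory point; the thesis may define the aggregate differently. The function names and the dictionary layout are hypothetical.

```python
import heapq
import math


def point_to_trajectory(query_point, trajectory):
    """Distance from one query point to its nearest sample on the trajectory."""
    return min(math.dist(query_point, p) for p in trajectory)


def trajectory_distance(query_points, trajectory):
    """Assumed aggregate: sum of nearest-sample distances over all query points."""
    return sum(point_to_trajectory(q, trajectory) for q in query_points)


def dts_brute_force(trajectories, query_points, k):
    """Return the k (distance, trajectory_id) pairs with the smallest distance.

    trajectories: dict mapping a trajectory id to a list of (x, y) samples.
    Every trajectory is scanned in full, so this is only a reference baseline.
    """
    scored = (
        (trajectory_distance(query_points, points), tid)
        for tid, points in trajectories.items()
    )
    return heapq.nsmallest(k, scored)
```

Calling `dts_brute_force(trajectories, query_points, k)` scans every sample of every trajectory, which is the cost that index-based methods such as those studied in the thesis aim to avoid.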
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Qi, Shuyao | - |
dc.contributor.author | 齊書堯 | - |
dc.date.accessioned | 2016-11-09T23:27:05Z | - |
dc.date.available | 2016-11-09T23:27:05Z | - |
dc.date.issued | 2016 | - |
dc.identifier.citation | Qi, S. [齊書堯]. (2016). Advanced ranking queries on composite data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/235931 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Database management | - |
dc.title | Advanced ranking queries on composite data | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5801656 | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5801656 | - |
dc.identifier.mmsid | 991020813859703414 | - |
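
The abstract mentions that the weighted keyword-document graph is browsed in a random-walk-with-restart fashion. The sketch below shows a generic random walk with restart computed by power iteration on a small weighted graph. It is only an illustration of the general technique: it does not reproduce the thesis's graph construction, its spatial edge weighting, or its partition-based algorithm. The graph representation, the restart probability of 0.15, and all names are assumptions made for this example.

```python
def random_walk_with_restart(graph, start, restart=0.15, max_iter=1000, tol=1e-12):
    """Power-iteration RWR scores for every node, restarting at `start`.

    graph: dict mapping node -> {neighbor: non-negative edge weight}.
    Every neighbor must also appear as a key of `graph` (assumed layout).
    """
    scores = {node: 0.0 for node in graph}
    scores[start] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in graph}
        # With probability `restart` the walker jumps back to the start node.
        nxt[start] = restart
        for u, edges in graph.items():
            mass = (1.0 - restart) * scores[u]
            total = sum(edges.values())
            if total == 0:
                # Node without usable out-edges: send its mass back to the start.
                nxt[start] += mass
                continue
            for v, w in edges.items():
                # Move along each edge in proportion to its weight.
                nxt[v] += mass * w / total
        change = sum(abs(nxt[n] - scores[n]) for n in graph)
        scores = nxt
        if change < tol:
            break
    return scores
```

In a suggestion setting, `start` would be the node of the user's initial keyword query, and the highest-scoring keyword-query nodes other than `start` would be candidate suggestions. How edge weights encode semantic relevance and spatial distance is specific to the thesis and is not modeled here.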