
postgraduate thesis: Queries and analysis tasks on semantically rich spatial data

Title: Queries and analysis tasks on semantically rich spatial data
Authors: Shi, Jieming [石杰明]
Issue Date: 2015
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Shi, J. [石杰明]. (2015). Queries and analysis tasks on semantically rich spatial data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5699934
Abstract: Semantically rich spatial data are big and ubiquitous, raising challenges with respect to their effective and efficient querying and analysis. In particular, traditional spatial analysis and querying methods are not readily applicable due to the increased data complexity. Toward addressing these challenges and supporting real-life applications that manage such data, this thesis proposes and studies three problems on the querying and analysis of (i) geo-social network data, (ii) spatio-textual data, and (iii) spatial RDF data. First, we study the problem of Density-based Clustering of Places in Geo-Social networks (DCPGS). Current spatial clustering models disregard information about the people who are related to the clustered places. We extend the density-based clustering paradigm to apply to places in geo-social networks, considering both the spatial information between places and the social relationships between users who visit the places. After formally defining our model and the distance measure it relies on, we present efficient index-based algorithms for its implementation. We evaluate the effectiveness of our model via a case study and two quantitative measures, called social entropy and community score, which indicate that geo-social clusters have special properties and cannot be found by applying simple spatial clustering approaches. The efficiency of our algorithms is also evaluated experimentally. Next, we study the modeling and evaluation of a Spatio-Textual Skyline (STS) query, in which the skyline points are selected based not only on their distances to a set of query locations, but also on their relevance to a set of query keywords. STS is especially relevant to modern applications, where points of interest are typically augmented with textual descriptions. We investigate three models for integrating textual relevance into the spatial skyline. Among them, model STD, which combines spatial distance with textual relevance in a derived dimensional space, is the most effective one. STD computes a skyline that satisfies the intent of STS and is small and easy to interpret. We propose an IR-tree-based algorithm for computing STD-based skylines. The effectiveness of our STD model and the efficiency of the algorithm are evaluated experimentally. Finally, we propose the problem of top-k relevant Semantic Place retrieval (kSP) on spatial RDF data, which finds applications in domains such as journalism, health, business, and tourism. Traditionally, RDF data is accessed by structured query languages, e.g., SPARQL. This requires users to understand both the language and the RDF schema. Recent research on keyword search over RDF data aims at reducing such requirements, but still ignores the spatial dimension of RDF data. A kSP query seeks RDF subgraphs that are rooted at spatial entities close to the query location and contain a set of query keywords. Compared to existing work, kSP queries are independent of structured query languages and are location-aware. We devise a basic method for processing kSP queries. Two pruning approaches and a preprocessing technique are proposed to further improve efficiency. Experiments on real datasets demonstrate the superior and robust performance of our proposals compared to the basic method.
Degree: Doctor of Philosophy
Subject: Spatial analysis (Statistics)
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/223015

 

DC Field | Value | Language
dc.contributor.author | Shi, Jieming | -
dc.contributor.author | 石杰明 | -
dc.date.accessioned | 2016-02-17T23:14:32Z | -
dc.date.available | 2016-02-17T23:14:32Z | -
dc.date.issued | 2015 | -
dc.identifier.citation | Shi, J. [石杰明]. (2015). Queries and analysis tasks on semantically rich spatial data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5699934 | -
dc.identifier.uri | http://hdl.handle.net/10722/223015 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | Creative Commons: Attribution 3.0 Hong Kong License | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.subject.lcsh | Spatial analysis (Statistics) | -
dc.title | Queries and analysis tasks on semantically rich spatial data | -
dc.type | PG_Thesis | -
dc.identifier.hkul | b5699934 | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Computer Science | -
dc.description.nature | published_or_final_version | -
