postgraduate thesis: Advanced rank-aware queries and recommendation with novel types of data

Title: Advanced rank-aware queries and recommendation with novel types of data
Authors: Wang, Hao (王皓)
Advisors: Mamoulis, N; Cheung, DWL
Issue Date: 2014
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation
Wang, H. [王皓]. (2014). Advanced rank-aware queries and recommendation with novel types of data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5270554
Abstract
We live in an era of rich data, rich not only in volume but also in the variety of its sources and content. Efficient search, management, and exploitation of data have been a major direction of database research for decades. This thesis proposes and studies three challenging problems, targeting (i) time series data, (ii) user preference data, and (iii) location-based social network data, respectively, and provides efficient solutions for the corresponding real-life applications.

First, durability queries are studied in historical time series databases; these queries identify objects that maintain a durable quality over time. For example, a sociologist may be interested in the top 10 web search terms during the period of some historical event; the police may look for vehicles that moved close to a suspect 70% of the time during a certain period. Such durable top-k (DTop-k) and durable k-nearest neighbor (DkNN) queries can be viewed as natural extensions of the standard snapshot top-k and NN queries to timestamped sequences of values or locations. Although their snapshot counterparts have been studied extensively, little prior work addresses this new class of durability queries. Efficient and scalable algorithms are proposed based on novel indexing techniques.

Next, an efficient solution to k-nearest neighbor search over top-m lists is investigated. A top-m list is a ranking of m items, typically representing some user’s preference over these items. For example, a user may have a list of her 10 favourite books; the result from a search engine is typically a list of webpages ranked according to their relevance to some keywords. The search problem aims at extracting the k top-m lists from the database that are the “closest” to some query list, where closeness is evaluated using commonly used measures such as Fagin’s intersection metric, Spearman’s footrule, and Kendall’s tau. Despite the importance of such queries, little prior work suggests any efficient solution. In this thesis, a unified framework is proposed to answer such queries efficiently.

Finally, the problem of top-N venue recommendation in location-based social networks (LBSNs) is studied, in which new venues are recommended to users. As an increasing number of users take part in LBSNs, the recommendation problem in this setting has attracted significant attention in research and in practical applications. The detailed information about past user behavior that is traced by the LBSN differentiates the problem significantly from its traditional settings. The spatial nature of past user behavior, together with the information about each user’s social interaction with other users, provides a richer background on which to build a more accurate and expressive recommendation model. Although there have been extensive studies on recommender systems working with user-item ratings, GPS trajectories, and other types of data, very few approaches exploit the unique properties of LBSN user check-in data. In this thesis, effective and efficient algorithms that create recommendations are proposed based on such properties.
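
As an illustration only, the two query types described in the abstract can be written down as brute-force definitions. The sketch below is not the thesis's method (the thesis proposes index-based algorithms and a unified framework that are not reproduced here). All function and variable names are hypothetical, the durability threshold is assumed to mean "in the snapshot top-k at a given fraction of the timestamps", and the footrule variant assumes one common convention in which an item missing from a list is placed at position m + 1.

```python
from typing import Dict, List, Sequence


def durable_top_k(series: Dict[str, Sequence[float]], start: int, end: int,
                  k: int, ratio: float) -> List[str]:
    """Brute force: objects that are in the snapshot top-k at a fraction
    of at least `ratio` of the timestamps in [start, end] (inclusive)."""
    counts = {name: 0 for name in series}
    for t in range(start, end + 1):
        # Snapshot top-k at time t: rank objects by value, highest first.
        ranked = sorted(series, key=lambda name: series[name][t], reverse=True)
        for name in ranked[:k]:
            counts[name] += 1
    span = end - start + 1
    return sorted(name for name, c in counts.items() if c >= ratio * span)


def footrule_top_m(a: Sequence[str], b: Sequence[str]) -> int:
    """Spearman's footrule between two top-m lists of equal length.
    Assumption: an item absent from a list is treated as ranked m + 1."""
    m = len(a)
    pos_a = {item: i + 1 for i, item in enumerate(a)}
    pos_b = {item: i + 1 for i, item in enumerate(b)}
    return sum(abs(pos_a.get(x, m + 1) - pos_b.get(x, m + 1))
               for x in set(a) | set(b))


def knn_top_m_lists(database: Dict[str, Sequence[str]], query: Sequence[str],
                    k: int) -> List[str]:
    """Brute force: keys of the k lists closest to `query` under the footrule."""
    return sorted(database,
                  key=lambda name: footrule_top_m(database[name], query))[:k]


# Hypothetical toy data: three objects observed at four timestamps.
series = {"a": [5, 5, 1, 5], "b": [4, 1, 5, 1], "c": [1, 4, 4, 4]}
durable = durable_top_k(series, start=0, end=3, k=2, ratio=0.75)

# Hypothetical toy data: three users' top-3 lists and a query list.
lists = {"u1": ["x", "y", "z"], "u2": ["y", "x", "w"], "u3": ["p", "q", "r"]}
nearest = knn_top_m_lists(lists, ["x", "y", "z"], k=2)
```

Both functions scan every timestamp or every list, so their cost grows linearly with the data; avoiding that full scan is the efficiency problem the abstract says the thesis addresses.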
Degree: Doctor of Philosophy
Subject: Data mining; Time-series analysis - Computer programs; Social networks - Data processing
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/206672

 

DC Field | Value | Language
dc.contributor.advisor | Mamoulis, N | -
dc.contributor.advisor | Cheung, DWL | -
dc.contributor.author | Wang, Hao | -
dc.contributor.author | 王皓 | -
dc.date.accessioned | 2014-11-25T03:53:15Z | -
dc.date.available | 2014-11-25T03:53:15Z | -
dc.date.issued | 2014 | -
dc.identifier.citation | Wang, H. [王皓]. (2014). Advanced rank-aware queries and recommendation with novel types of data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5270554 | -
dc.identifier.uri | http://hdl.handle.net/10722/206672 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | Creative Commons: Attribution 3.0 Hong Kong License | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.subject.lcsh | Data mining | -
dc.subject.lcsh | Time-series analysis - Computer programs | -
dc.subject.lcsh | Social networks - Data processing | -
dc.title | Advanced rank-aware queries and recommendation with novel types of data | -
dc.type | PG_Thesis | -
dc.identifier.hkul | b5270554 | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Computer Science | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.5353/th_b5270554 | -
