
Conference Paper: Cleaning uncertain data for top-k queries

Title: Cleaning uncertain data for top-k queries
Authors: Mo, L; Cheng, R; Li, X; Cheung, DWL; Yang, XS
Keywords: Cleaning operations; Data uncertainty; Emerging applications; Greedy algorithms; Optimal solutions; Possible world semantics; Probabilistic database; Temperature values
Issue Date: 2013
Publisher: IEEE, Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000178
Citation: The 29th International Conference on Data Engineering (ICDE 2013), Brisbane, Australia, 8-11 April 2013. In International Conference on Data Engineering Proceedings, 2013, p. 134-145
Abstract: The information managed in emerging applications, such as sensor networks, location-based services, and data integration, is inherently imprecise. To handle data uncertainty, probabilistic databases have been recently developed. In this paper, we study how to quantify the ambiguity of answers returned by a probabilistic top-k query. We develop efficient algorithms to compute the quality of this query under the possible world semantics. We further address the cleaning of a probabilistic database, in order to improve top-k query quality. Cleaning involves the reduction of ambiguity associated with the database entities. For example, the uncertainty of a temperature value acquired from a sensor can be reduced, or cleaned, by requesting its newest value from the sensor. While this 'cleaning operation' may produce a better query result, it may involve a cost and fail. We investigate the problem of selecting entities to be cleaned under a limited budget. Particularly, we propose an optimal solution and several heuristics. Experiments show that the greedy algorithm is efficient and close to optimal. © 2013 IEEE.
Persistent Identifier: http://hdl.handle.net/10722/189637
ISBN: 978-1-4673-4910-9
ISSN: 1084-4627

 

DC Field | Value | Language
dc.contributor.author | Mo, L | en_US
dc.contributor.author | Cheng, R | en_US
dc.contributor.author | Li, X | en_US
dc.contributor.author | Cheung, DWL | en_US
dc.contributor.author | Yang, XS | en_US
dc.date.accessioned | 2013-09-17T14:50:33Z | -
dc.date.available | 2013-09-17T14:50:33Z | -
dc.date.issued | 2013 | en_US
dc.identifier.citation | The 29th International Conference on Data Engineering (ICDE 2013), Brisbane, Australia, 8-11 April 2013. In International Conference on Data Engineering Proceedings, 2013, p. 134-145 | en_US
dc.identifier.isbn | 978-1-4673-4910-9 | -
dc.identifier.issn | 1084-4627 | -
dc.identifier.uri | http://hdl.handle.net/10722/189637 | -
dc.description.abstract | The information managed in emerging applications, such as sensor networks, location-based services, and data integration, is inherently imprecise. To handle data uncertainty, probabilistic databases have been recently developed. In this paper, we study how to quantify the ambiguity of answers returned by a probabilistic top-k query. We develop efficient algorithms to compute the quality of this query under the possible world semantics. We further address the cleaning of a probabilistic database, in order to improve top-k query quality. Cleaning involves the reduction of ambiguity associated with the database entities. For example, the uncertainty of a temperature value acquired from a sensor can be reduced, or cleaned, by requesting its newest value from the sensor. While this 'cleaning operation' may produce a better query result, it may involve a cost and fail. We investigate the problem of selecting entities to be cleaned under a limited budget. Particularly, we propose an optimal solution and several heuristics. Experiments show that the greedy algorithm is efficient and close to optimal. © 2013 IEEE. | -
dc.language | eng | en_US
dc.publisher | IEEE, Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000178 | -
dc.relation.ispartof | International Conference on Data Engineering Proceedings | en_US
dc.rights | International Conference on Data Engineering. Proceedings. Copyright © IEEE, Computer Society. | -
dc.rights | ©2013 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | -
dc.rights | Creative Commons: Attribution 3.0 Hong Kong License | -
dc.subject | Cleaning operations | -
dc.subject | Data uncertainty | -
dc.subject | Emerging applications | -
dc.subject | Greedy algorithms | -
dc.subject | Optimal solutions | -
dc.subject | Possible world semantics | -
dc.subject | Probabilistic database | -
dc.subject | Temperature values | -
dc.title | Cleaning uncertain data for top-k queries | en_US
dc.type | Conference_Paper | en_US
dc.identifier.email | Mo, L: lymo@cs.hku.hk | en_US
dc.identifier.email | Cheng, R: ckcheng@cs.hku.hk | en_US
dc.identifier.email | Li, X: xli@cs.hku.hk | -
dc.identifier.email | Cheung, DWL: dcheung@cs.hku.hk | -
dc.identifier.email | Yang, XS: xyang2@cs.hku.hk | -
dc.identifier.authority | Cheng, R=rp00074 | en_US
dc.identifier.authority | Cheung, DWL=rp00101 | en_US
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.1109/ICDE.2013.6544820 | -
dc.identifier.scopus | eid_2-s2.0-84881328468 | -
dc.identifier.hkuros | 222869 | en_US
dc.identifier.spage | 134 | -
dc.identifier.epage | 145 | -
dc.publisher.place | United States | -
dc.customcontrol.immutable | sml 131023 | -
