Book Chapter: Managing Quality of Probabilistic Databases

Title: Managing Quality of Probabilistic Databases
Authors: Cheng, RCK
Issue Date: 2013
Publisher: Springer-Verlag
Citation: Managing Quality of Probabilistic Databases. In Sadiq, S (Ed.), Handbook of Data Quality: Research and Practice, p. 271-291. Berlin; New York: Springer-Verlag, 2013
Abstract: Uncertain or imprecise data are pervasive in applications like location-based services, sensor monitoring, and data collection and integration. For these applications, probabilistic databases can be used to store uncertain data, and querying facilities are provided to yield answers with statistical confidence. Given that a limited amount of resources is available to “clean” the database (e.g., by probing some sensor data values to get their latest values), we address the problem of choosing the set of uncertain objects to be cleaned, in order to achieve the best improvement in the quality of query answers. For this purpose, we present the PWS-quality metric, which is a universal measure that quantifies the ambiguity of query answers under the possible world semantics. We study how PWS-quality can be efficiently evaluated for two major query classes: (1) queries that examine the satisfiability of tuples independent of other tuples (e.g., range queries) and (2) queries that require the knowledge of the relative ranking of the tuples (e.g., MAX queries). We then propose a polynomial-time solution to achieve an optimal improvement in PWS-quality. Other fast heuristics are also examined.
Persistent Identifier: http://hdl.handle.net/10722/166461
ISSN

 

DC Field | Value | Language
dc.contributor.author | Cheng, RCK | en_US
dc.date.accessioned | 2012-09-20T08:36:32Z | -
dc.date.available | 2012-09-20T08:36:32Z | -
dc.date.issued | 2013 | en_US
dc.identifier.citation | Managing Quality of Probabilistic Databases. In Sadiq, S (Ed.), Handbook of Data Quality: Research and Practice, p. 271-291. Berlin; New York: Springer-Verlag, 2013 | -
dc.identifier.issn | 9783642362569 | -
dc.identifier.uri | http://hdl.handle.net/10722/166461 | -
dc.description.abstract | Uncertain or imprecise data are pervasive in applications like location-based services, sensor monitoring, and data collection and integration. For these applications, probabilistic databases can be used to store uncertain data, and querying facilities are provided to yield answers with statistical confidence. Given that a limited amount of resources is available to “clean” the database (e.g., by probing some sensor data values to get their latest values), we address the problem of choosing the set of uncertain objects to be cleaned, in order to achieve the best improvement in the quality of query answers. For this purpose, we present the PWS-quality metric, which is a universal measure that quantifies the ambiguity of query answers under the possible world semantics. We study how PWS-quality can be efficiently evaluated for two major query classes: (1) queries that examine the satisfiability of tuples independent of other tuples (e.g., range queries) and (2) queries that require the knowledge of the relative ranking of the tuples (e.g., MAX queries). We then propose a polynomial-time solution to achieve an optimal improvement in PWS-quality. Other fast heuristics are also examined. | -
dc.language | eng | en_US
dc.publisher | Springer-Verlag | en_US
dc.relation.ispartof | Handbook of Data Quality: Research and Practice | -
dc.title | Managing Quality of Probabilistic Databases | en_US
dc.type | Book_Chapter | en_US
dc.identifier.email | Cheng, RCK: ckcheng@cs.hku.hk | en_US
dc.identifier.authority | Cheng, RCK=rp00074 | en_US
dc.identifier.doi | 10.1007/978-3-642-36257-6_12 | -
dc.identifier.hkuros | 206199 | en_US
dc.identifier.hkuros | 224491 | -
dc.identifier.spage | 271 | -
dc.identifier.epage | 291 | -
dc.publisher.place | Berlin; New York | -
