Evaluating probability threshold k-nearest-neighbor queries over uncertain data

Cheng, R; Chen, L; Chen, J; Xie, X

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1145/1516360.1516438
Scopus: eid_2-s2.0-70349103656

Supplementary

Citations:
- Scopus: 0
Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Evaluating probability threshold k-nearest-neighbor queries over uncertain data

Title	Evaluating probability threshold k-nearest-neighbor queries over uncertain data
Authors	Cheng, R; Chen, L; Chen, J; Xie, X
Keywords	Biological managements; Candidate selection; Candidate sets; Efficient data structures; Emerging applications
Issue Date	2009
Publisher	Association for Computing Machinery.
Citation	The 12th International Conference on Extending Database Technology (EDBT 2009), St. Petersburg, Russia, 23-26 March 2009. In Proceedings of the 12th International Conference on Extending Database Technology, 2009, p. 672-683. DOI: http://dx.doi.org/10.1145/1516360.1516438
Abstract	In emerging applications such as location-based services, sensor monitoring and biological management systems, the values of the database items are naturally imprecise. For these uncertain databases, an important query is the Probabilistic k-Nearest-Neighbor Query (k-PNN), which computes the probabilities of sets of k objects for being the closest to a given query point. The evaluation of this query can be both computationally and I/O-expensive, since there is an exponentially large number of k-object sets, and numerical integration is required. Often a user may not be concerned about the exact probability values. For example, he may only need answers that have sufficiently high confidence. We thus propose the Probabilistic Threshold k-Nearest-Neighbor Query (T-k-PNN), which returns sets of k objects that satisfy the query with probabilities higher than some threshold T. Three steps are proposed to handle this query efficiently. In the first step, objects that cannot constitute an answer are filtered with the aid of a spatial index. The second step, called probabilistic candidate selection, significantly prunes the number of candidate sets to be examined. The remaining sets are sent for verification, which derives lower and upper bounds of the answer probabilities, so that it can be quickly decided whether a candidate set should be included in the answer. We also examine spatially-efficient data structures that support these methods. Our solution can be applied to uncertain data with arbitrary probability density functions. We have also performed extensive experiments to examine the effectiveness of our methods. Copyright 2009 ACM.
Persistent Identifier	http://hdl.handle.net/10722/61146
ISBN	9781605584225
References	References in Scopus
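
To make the query semantics in the abstract concrete, the following is a minimal sketch of a T-k-PNN evaluation, not the paper's algorithm. It assumes one-dimensional objects whose uncertain values are uniformly distributed over an interval, and it swaps in plain Monte Carlo sampling where the paper uses probabilistic candidate selection and bound-based verification. Only the filtering idea (discarding objects that can never be among the k nearest) follows the abstract, and even there a linear scan stands in for the spatial index. All names (`min_max_dist`, `filter_objects`, `t_k_pnn`) and the sample data are hypothetical.

```python
import random
from collections import Counter


def min_max_dist(interval, q):
    """Smallest and largest possible distance from q to a point in [lo, hi]."""
    lo, hi = interval
    d_min = 0.0 if lo <= q <= hi else min(abs(q - lo), abs(q - hi))
    d_max = max(abs(q - lo), abs(q - hi))
    return d_min, d_max


def filter_objects(objects, q, k):
    """Keep only objects that could be among the k nearest neighbours.

    An object is pruned when its minimum distance exceeds the k-th smallest
    maximum distance, because at least k other objects are then always closer.
    Assumes len(objects) >= k.
    """
    bounds = {name: min_max_dist(iv, q) for name, iv in objects.items()}
    kth_max = sorted(d_max for _, d_max in bounds.values())[k - 1]
    return {name: iv for name, iv in objects.items()
            if bounds[name][0] <= kth_max}


def t_k_pnn(objects, q, k, threshold, samples=20000, seed=0):
    """Estimate which k-object sets are the k-NN of q with probability > threshold.

    Assumption: Monte Carlo sampling of uniform intervals replaces the paper's
    candidate selection and verification steps.
    """
    rng = random.Random(seed)
    candidates = filter_objects(objects, q, k)
    counts = Counter()
    for _ in range(samples):
        # Draw one possible world: a concrete value for every candidate.
        dists = sorted((abs(rng.uniform(lo, hi) - q), name)
                       for name, (lo, hi) in candidates.items())
        counts[frozenset(name for _, name in dists[:k])] += 1
    return {s: c / samples for s, c in counts.items()
            if c / samples > threshold}


# Hypothetical data: four uncertain objects on a line, query point at 0.
objects = {"A": (0.0, 2.0), "B": (1.0, 3.0), "C": (2.5, 4.0), "D": (10.0, 12.0)}
answers = t_k_pnn(objects, q=0.0, k=2, threshold=0.3)
```

In this example, object D is removed by the filter because two other objects are always closer to the query point, so it never appears in a sampled set. Sampling only approximates the probabilities; the approach described in the abstract avoids computing exact values by bounding them instead.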

DC Field	Value	Language
dc.contributor.author	Cheng, R	en_HK
dc.contributor.author	Chen, L	en_HK
dc.contributor.author	Chen, J	en_HK
dc.contributor.author	Xie, X	en_HK
dc.date.accessioned	2010-07-13T03:31:54Z	-
dc.date.available	2010-07-13T03:31:54Z	-
dc.date.issued	2009	en_HK
dc.identifier.citation	The 12th International Conference on Extending Database Technology (EDBT 2009), St. Petersburg, Russia, 23-26 March 2009. In Proceedings of the 12th International Conference on Extending Database Technology, 2009, p. 672-683	en_HK
dc.identifier.isbn	9781605584225	-
dc.identifier.uri	http://hdl.handle.net/10722/61146	-
dc.description.abstract	In emerging applications such as location-based services, sensor monitoring and biological management systems, the values of the database items are naturally imprecise. For these uncertain databases, an important query is the Probabilistic k-Nearest-Neighbor Query (k-PNN), which computes the probabilities of sets of k objects for being the closest to a given query point. The evaluation of this query can be both computationally and I/O-expensive, since there is an exponentially large number of k-object sets, and numerical integration is required. Often a user may not be concerned about the exact probability values. For example, he may only need answers that have sufficiently high confidence. We thus propose the Probabilistic Threshold k-Nearest-Neighbor Query (T-k-PNN), which returns sets of k objects that satisfy the query with probabilities higher than some threshold T. Three steps are proposed to handle this query efficiently. In the first step, objects that cannot constitute an answer are filtered with the aid of a spatial index. The second step, called probabilistic candidate selection, significantly prunes the number of candidate sets to be examined. The remaining sets are sent for verification, which derives lower and upper bounds of the answer probabilities, so that it can be quickly decided whether a candidate set should be included in the answer. We also examine spatially-efficient data structures that support these methods. Our solution can be applied to uncertain data with arbitrary probability density functions. We have also performed extensive experiments to examine the effectiveness of our methods. Copyright 2009 ACM.	en_HK
dc.language	eng	en_HK
dc.publisher	Association for Computing Machinery.	-
dc.relation.ispartof	Proceedings of the 12th International Conference on Extending Database Technology	en_HK
dc.subject	Biological managements	-
dc.subject	Candidate selection	-
dc.subject	Candidate sets	-
dc.subject	Efficient data structures	-
dc.subject	Emerging applications	-
dc.title	Evaluating probability threshold k-nearest-neighbor queries over uncertain data	en_HK
dc.type	Conference_Paper	en_HK
dc.identifier.openurl	http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=9781605584225&volume=&spage=672&epage=683&date=2009&atitle=Evaluating+probability+threshold+k-nearest-neighbor+queries+over+uncertain+data	-
dc.identifier.email	Cheng, R:ckcheng@cs.hku.hk	en_HK
dc.identifier.authority	Cheng, R=rp00074	en_HK
dc.description.nature	link_to_OA_fulltext	-
dc.identifier.doi	10.1145/1516360.1516438	en_HK
dc.identifier.scopus	eid_2-s2.0-70349103656	en_HK
dc.identifier.hkuros	162401	en_HK
dc.relation.references	http://www.scopus.com/mlt/select.url?eid=2-s2.0-70349103656&selection=ref&src=s&origin=recordpage	en_HK
dc.identifier.spage	672	en_HK
dc.identifier.epage	683	en_HK
dc.description.other	The 12th International Conference on Extending Database Technology (EDBT 2009), St. Petersburg, Russia, 23-26 March 2009. In Proceedings of the 12th International Conference on Extending Database Technology, 2009, p. 672-683	-
dc.identifier.scopusauthorid	Cheng, R=7201955416	en_HK
dc.identifier.scopusauthorid	Chen, L=25652992200	en_HK
dc.identifier.scopusauthorid	Chen, J=36692766900	en_HK
dc.identifier.scopusauthorid	Xie, X=34881209700	en_HK
