
Conference Paper: Accelerating probabilistic frequent itemset mining: A model-based approach

Title: Accelerating probabilistic frequent itemset mining: A model-based approach
Authors: Wang, L; Cheng, R; Lee, SD; Cheung, DW
Keywords: Algorithms
Issue Date: 2010
Publisher: Association for Computing Machinery.
Citation: The 19th ACM International Conference on Information and Knowledge Management (CIKM 2010), Toronto, Canada, 26-30 October 2010. In Proceedings of the 19th ACM international conference on Information and knowledge management, 2010, p. 429-438
Abstract: Data uncertainty is inherent in emerging applications such as location-based services, sensor monitoring systems, and data integration. To handle a large amount of imprecise information, uncertain databases have been recently developed. In this paper, we study how to efficiently discover frequent itemsets from large uncertain databases, interpreted under the Possible World Semantics. This is technically challenging, since an uncertain database induces an exponential number of possible worlds. To tackle this problem, we propose a novel method to capture the itemset mining process as a Poisson binomial distribution. This model-based approach extracts frequent itemsets with a high degree of accuracy, and supports large databases. We apply our techniques to improve the performance of the algorithms for: (1) finding itemsets whose frequentness probabilities are larger than some threshold; and (2) mining itemsets with the k highest frequentness probabilities. Our approaches support both tuple and attribute uncertainty models, which are commonly used to represent uncertain databases. Extensive evaluation on real and synthetic datasets shows that our methods are highly accurate. Moreover, they are orders of magnitude faster than previous approaches. © 2010 ACM.
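The abstract's central idea can be illustrated with a small sketch. It is not the paper's algorithm, which this record does not give. It assumes each transaction contains the itemset independently with a known probability, so the itemset's support count follows a Poisson binomial distribution, and the frequentness probability is the chance that the support reaches a minimum support `minsup`. The function names, the dynamic-programming routine, and the choice of a Poisson approximation as the cheaper model are all assumptions made for illustration.

```python
import math


def support_pmf(probs):
    """Exact Poisson binomial PMF of the support count (O(n^2) dynamic program)."""
    pmf = [1.0]  # with zero transactions, the support is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # transaction does not contain the itemset
            nxt[k + 1] += mass * p       # transaction contains the itemset
        pmf = nxt
    return pmf


def frequentness_exact(probs, minsup):
    """P(support >= minsup) under the exact Poisson binomial distribution."""
    return sum(support_pmf(probs)[minsup:])


def frequentness_poisson(probs, minsup):
    """P(support >= minsup) under a Poisson model with mean sum(probs).

    Assumption: Poisson is used here as one possible approximation; it needs
    only the expected support, so it is linear in the number of transactions.
    """
    mu = sum(probs)
    term = math.exp(-mu)  # P(X = 0)
    below = 0.0
    for k in range(minsup):
        below += term
        term *= mu / (k + 1)  # P(X = k + 1) from P(X = k)
    return 1.0 - below


def top_k(itemset_probs, minsup, k, score=frequentness_exact):
    """Names of the k itemsets with the highest frequentness probability."""
    ranked = sorted(itemset_probs,
                    key=lambda name: score(itemset_probs[name], minsup),
                    reverse=True)
    return ranked[:k]
```

For threshold mining under these assumptions, an itemset would be reported when `frequentness_exact(probs, minsup)` (or its approximation) exceeds the chosen probability threshold; `top_k` covers the second task by ranking on the same score.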
Persistent Identifier: http://hdl.handle.net/10722/129566

 

DC Field | Value | Language
dc.contributor.author | Wang, L | en_HK
dc.contributor.author | Cheng, R | en_HK
dc.contributor.author | Lee, SD | en_HK
dc.contributor.author | Cheung, DW | en_HK
dc.date.accessioned | 2010-12-23T08:39:20Z | -
dc.date.available | 2010-12-23T08:39:20Z | -
dc.date.issued | 2010 | en_HK
dc.identifier.citation | The 19th ACM International Conference on Information and Knowledge Management (CIKM 2010), Toronto, Canada, 26-30 October 2010. In Proceedings of the 19th ACM international conference on Information and knowledge management, 2010, p. 429-438 | en_HK
dc.identifier.issn | 978-1-4503-0099-5 | -
dc.identifier.uri | http://hdl.handle.net/10722/129566 | -
dc.description.abstract | Data uncertainty is inherent in emerging applications such as location-based services, sensor monitoring systems, and data integration. To handle a large amount of imprecise information, uncertain databases have been recently developed. In this paper, we study how to efficiently discover frequent itemsets from large uncertain databases, interpreted under the Possible World Semantics. This is technically challenging, since an uncertain database induces an exponential number of possible worlds. To tackle this problem, we propose a novel method to capture the itemset mining process as a Poisson binomial distribution. This model-based approach extracts frequent itemsets with a high degree of accuracy, and supports large databases. We apply our techniques to improve the performance of the algorithms for: (1) finding itemsets whose frequentness probabilities are larger than some threshold; and (2) mining itemsets with the k highest frequentness probabilities. Our approaches support both tuple and attribute uncertainty models, which are commonly used to represent uncertain databases. Extensive evaluation on real and synthetic datasets shows that our methods are highly accurate. Moreover, they are orders of magnitude faster than previous approaches. © 2010 ACM. | en_HK
dc.language | eng | en_US
dc.publisher | Association for Computing Machinery. | -
dc.relation.ispartof | International Conference on Information and Knowledge Management, Proceedings | en_HK
dc.subject | Algorithms | en_HK
dc.title | Accelerating probabilistic frequent itemset mining: A model-based approach | en_HK
dc.type | Conference_Paper | en_HK
dc.identifier.email | Cheng, R:ckcheng@cs.hku.hk | en_HK
dc.identifier.email | Cheung, DW:dcheung@cs.hku.hk | en_HK
dc.identifier.authority | Cheng, R=rp00074 | en_HK
dc.identifier.authority | Cheung, DW=rp00101 | en_HK
dc.description.nature | link_to_OA_fulltext | -
dc.identifier.doi | 10.1145/1871437.1871494 | en_HK
dc.identifier.scopus | eid_2-s2.0-78651291608 | en_HK
dc.identifier.hkuros | 176457 | en_US
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-78651291608&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.spage | 429 | en_HK
dc.identifier.epage | 438 | en_HK
dc.publisher.place | United States | -
dc.description.other | The 19th ACM International Conference on Information and Knowledge Management (CIKM 2010), Toronto, Canada, 26-30 October 2010. In Proceedings of the 19th ACM international conference on Information and knowledge management, 2010, p. 429-438 | -
dc.identifier.scopusauthorid | Wang, L=36769568800 | en_HK
dc.identifier.scopusauthorid | Cheng, R=7201955416 | en_HK
dc.identifier.scopusauthorid | Lee, SD=7601400741 | en_HK
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK
