Article: A lattice-based approach for I/O efficient association rule mining

Title: A lattice-based approach for I/O efficient association rule mining
Authors: Loo, KK; Yip, CL; Kao, B; Cheung, D
Keywords: Apriori; Association rules; Data mining; FindLarge; Lattice; LGen
Issue Date: 2002
Publisher: Pergamon. The Journal's web site is located at http://www.elsevier.com/locate/is
Citation: Information Systems, 2002, v. 27 n. 1, p. 41-74
Abstract: Most algorithms for association rule mining are variants of the basic Apriori algorithm (Agrawal and Srikant, Fast algorithms for mining association rules in databases, in: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94), Santiago, Chile, 1994, pp. 487-499). One characteristic of these Apriori-based algorithms is that candidate itemsets are generated in rounds, with the size of the itemsets incremented by one per round. The number of database scans required by Apriori-based algorithms thus depends on the size of the biggest frequent itemsets. In this paper, we devise a more general candidate set generation algorithm, LGen, which generates candidate itemsets of multiple sizes during each database scan. We present an algorithm FindLarge which uses LGen to find frequent itemsets. We show that, given a reasonable set of suggested frequent itemsets, FindLarge can significantly reduce the number of I/O passes required. In the best cases, only two passes are sufficient to discover all the frequent itemsets irrespective of the size of the biggest ones. Two I/O-saving algorithms, namely DIC and Pincer-Search, are compared with FindLarge in a series of experiments. We discuss the conditions under which FindLarge significantly outperforms the others in terms of I/O efficiency. © 2002 Elsevier Science Ltd. All rights reserved.
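Illustrative note (not part of the original record): the round-by-round behaviour the abstract attributes to Apriori-based algorithms can be sketched in a few lines of Python. This is a minimal textbook-style sketch of level-wise Apriori, not the paper's LGen or FindLarge; the names `apriori`, `transactions`, and `min_support` are hypothetical, and the database is assumed to be a small in-memory list. It counts one pass per round to show how the number of scans tracks the size of the biggest frequent itemset.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise Apriori sketch (hypothetical helper, not LGen/FindLarge).

    Returns (frequent, passes): a dict mapping each frequent itemset to its
    support count, and the number of full scans over the transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    passes = 0
    frequent = {}
    # Round 1 candidates: every single item. The item vocabulary is collected
    # from the data here for brevity and is assumed to be known up front.
    candidates = {frozenset([item]) for t in transactions for item in t}
    k = 1
    while candidates:
        # One full scan of the database per round.
        passes += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if every (k-1)-item subset
        # was found frequent in the round just completed.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent, passes


if __name__ == "__main__":
    db = [set("abcd"), set("abcd"), set("abc"), set("ad")]
    found, scans = apriori(db, min_support=2)
    print(scans, max(len(s) for s in found))
```

In this toy database the largest frequent itemset has four items and the sketch needs four scans, one per itemset size. That dependence is what the abstract says LGen and FindLarge aim to reduce by generating candidates of multiple sizes in each scan.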
Persistent Identifier: http://hdl.handle.net/10722/89013
ISSN: 0306-4379
2023 Impact Factor: 3.0
2023 SCImago Journal Rankings: 1.201
ISI Accession Number ID: WOS:000174193000003

DC Field | Value | Language
dc.contributor.author | Loo, KK | en_HK
dc.contributor.author | Yip, CL | en_HK
dc.contributor.author | Kao, B | en_HK
dc.contributor.author | Cheung, D | en_HK
dc.date.accessioned | 2010-09-06T09:51:18Z | -
dc.date.available | 2010-09-06T09:51:18Z | -
dc.date.issued | 2002 | en_HK
dc.identifier.citation | Information Systems, 2002, v. 27 n. 1, p. 41-74 | en_HK
dc.identifier.issn | 0306-4379 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/89013 | -
dc.description.abstract | Most algorithms for association rule mining are variants of the basic Apriori algorithm (Agrawal and Srikant, Fast algorithms for mining association rules in databases, in: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94), Santiago, Chile, 1994, pp. 487-499). One characteristic of these Apriori-based algorithms is that candidate itemsets are generated in rounds, with the size of the itemsets incremented by one per round. The number of database scans required by Apriori-based algorithms thus depends on the size of the biggest frequent itemsets. In this paper, we devise a more general candidate set generation algorithm, LGen, which generates candidate itemsets of multiple sizes during each database scan. We present an algorithm FindLarge which uses LGen to find frequent itemsets. We show that, given a reasonable set of suggested frequent itemsets, FindLarge can significantly reduce the number of I/O passes required. In the best cases, only two passes are sufficient to discover all the frequent itemsets irrespective of the size of the biggest ones. Two I/O-saving algorithms, namely DIC and Pincer-Search, are compared with FindLarge in a series of experiments. We discuss the conditions under which FindLarge significantly outperforms the others in terms of I/O efficiency. © 2002 Elsevier Science Ltd. All rights reserved. | en_HK
dc.language | eng | en_HK
dc.publisher | Pergamon. The Journal's web site is located at http://www.elsevier.com/locate/is | en_HK
dc.relation.ispartof | Information Systems | en_HK
dc.subject | Apriori | en_HK
dc.subject | Association rules | en_HK
dc.subject | Data mining | en_HK
dc.subject | FindLarge | en_HK
dc.subject | Lattice | en_HK
dc.subject | LGen | en_HK
dc.title | A lattice-based approach for I/O efficient association rule mining | en_HK
dc.type | Article | en_HK
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0306-4379&volume=27&issue=1&spage=41&epage=74&date=2002&atitle=A+Lattice-Based+Approach+for+I/O+Efficient+Association+Rule+Mining | en_HK
dc.identifier.email | Yip, CL:clyip@cs.hku.hk | en_HK
dc.identifier.email | Kao, B:kao@cs.hku.hk | en_HK
dc.identifier.email | Cheung, D:dcheung@cs.hku.hk | en_HK
dc.identifier.authority | Yip, CL=rp00205 | en_HK
dc.identifier.authority | Kao, B=rp00123 | en_HK
dc.identifier.authority | Cheung, D=rp00101 | en_HK
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1016/S0306-4379(01)00046-1 | en_HK
dc.identifier.scopus | eid_2-s2.0-0036498208 | en_HK
dc.identifier.hkuros | 67151 | en_HK
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-0036498208&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.volume | 27 | en_HK
dc.identifier.issue | 1 | en_HK
dc.identifier.spage | 41 | en_HK
dc.identifier.epage | 74 | en_HK
dc.identifier.isi | WOS:000174193000003 | -
dc.publisher.place | United Kingdom | en_HK
dc.identifier.scopusauthorid | Loo, KK=36793892100 | en_HK
dc.identifier.scopusauthorid | Yip, CL=7101665547 | en_HK
dc.identifier.scopusauthorid | Kao, B=35221592600 | en_HK
dc.identifier.scopusauthorid | Cheung, D=34567902600 | en_HK
dc.identifier.issnl | 0306-4379 | -
