A lattice-based approach for I/O efficient association rule mining

Loo, KK; Yip, CL; Kao, B; Cheung, D

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1016/S0306-4379(01)00046-1
Scopus: eid_2-s2.0-0036498208
WOS: WOS:000174193000003
Supplementary

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Article: A lattice-based approach for I/O efficient association rule mining

Title	A lattice-based approach for I/O efficient association rule mining
Authors	Loo, KK; Yip, CL; Kao, B; Cheung, D
Keywords	Apriori; Association rules; Data mining; FindLarge; Lattice; LGen
Issue Date	2002
Publisher	Pergamon. The Journal's web site is located at http://www.elsevier.com/locate/is
Citation	Information Systems, 2002, v. 27 n. 1, p. 41-74. DOI: http://dx.doi.org/10.1016/S0306-4379(01)00046-1
Abstract	Most algorithms for association rule mining are variants of the basic Apriori algorithm (Agrawal and Srikant, Fast algorithms for mining association rules in databases, in: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94), Santiago, Chile, 1994, pp. 487-499). One characteristic of these Apriori-based algorithms is that candidate itemsets are generated in rounds, with the size of the itemsets incremented by one per round. The number of database scans required by Apriori-based algorithms thus depends on the size of the biggest frequent itemsets. In this paper, we devise a more general candidate set generation algorithm, LGen, which generates candidate itemsets of multiple sizes during each database scan. We present an algorithm FindLarge which uses LGen to find frequent itemsets. We show that, given a reasonable set of suggested frequent itemsets, FindLarge can significantly reduce the number of I/O passes required. In the best cases, only two passes are sufficient to discover all the frequent itemsets irrespective of the size of the biggest ones. Two I/O-saving algorithms, namely DIC and Pincer-Search, are compared with FindLarge in a series of experiments. We discuss the conditions under which FindLarge significantly outperforms the others in terms of I/O efficiency. © 2002 Elsevier Science Ltd. All rights reserved.
Persistent Identifier	http://hdl.handle.net/10722/89013
ISSN	0306-4379 (2023 Impact Factor: 3.0; 2023 SCImago Journal Rankings: 1.201)
ISI Accession Number ID	WOS:000174193000003
References	References in Scopus
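
The abstract notes that Apriori-based algorithms generate candidates in rounds, one itemset size per round, so the number of database scans grows with the size of the largest frequent itemset. The following is a minimal Python sketch of that baseline behaviour only; it is not LGen or FindLarge, whose details this record does not give. The function name `apriori`, its parameters, and the pass counter are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise frequent itemset mining (illustrative sketch).

    Returns (frequent, passes): a dict mapping each frequent itemset
    (a frozenset) to its support count, and the number of full scans
    of the transaction list that were needed.
    """
    transactions = [frozenset(t) for t in transactions]
    # Round 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    passes = 0
    k = 1
    while candidates:
        # One full scan of the database per round (per itemset size).
        passes += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and keep only
        # those whose k-subsets are all frequent (the Apriori pruning).
        k += 1
        candidates = set()
        for a in level:
            for b in level:
                union = a | b
                if len(union) == k and all(
                    frozenset(s) in level for s in combinations(union, k - 1)
                ):
                    candidates.add(union)
    return frequent, passes


if __name__ == "__main__":
    # Hypothetical toy database, assumed for illustration only.
    db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c"}]
    freq, passes = apriori(db, min_support=2)
    print(passes, sorted(sorted(s) for s in freq))
```

On this toy database the largest frequent itemset has three items and the sketch makes three scans, one per itemset size. This per-size scanning is the I/O cost that, according to the abstract, FindLarge aims to reduce by counting candidates of several sizes in the same scan.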

DC Field	Value	Language
dc.contributor.author	Loo, KK	en_HK
dc.contributor.author	Yip, CL	en_HK
dc.contributor.author	Kao, B	en_HK
dc.contributor.author	Cheung, D	en_HK
dc.date.accessioned	2010-09-06T09:51:18Z	-
dc.date.available	2010-09-06T09:51:18Z	-
dc.date.issued	2002	en_HK
dc.identifier.citation	Information Systems, 2002, v. 27 n. 1, p. 41-74	en_HK
dc.identifier.issn	0306-4379	en_HK
dc.identifier.uri	http://hdl.handle.net/10722/89013	-
dc.description.abstract	Most algorithms for association rule mining are variants of the basic Apriori algorithm (Agrawal and Srikant, Fast algorithms for mining association rules in databases, in: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94), Santiago, Chile, 1994, pp. 487-499). One characteristic of these Apriori-based algorithms is that candidate itemsets are generated in rounds, with the size of the itemsets incremented by one per round. The number of database scans required by Apriori-based algorithms thus depends on the size of the biggest frequent itemsets. In this paper, we devise a more general candidate set generation algorithm, LGen, which generates candidate itemsets of multiple sizes during each database scan. We present an algorithm FindLarge which uses LGen to find frequent itemsets. We show that, given a reasonable set of suggested frequent itemsets, FindLarge can significantly reduce the number of I/O passes required. In the best cases, only two passes are sufficient to discover all the frequent itemsets irrespective of the size of the biggest ones. Two I/O-saving algorithms, namely DIC and Pincer-Search, are compared with FindLarge in a series of experiments. We discuss the conditions under which FindLarge significantly outperforms the others in terms of I/O efficiency. © 2002 Elsevier Science Ltd. All rights reserved.	en_HK
dc.language	eng	en_HK
dc.publisher	Pergamon. The Journal's web site is located at http://www.elsevier.com/locate/is	en_HK
dc.relation.ispartof	Information Systems	en_HK
dc.subject	Apriori	en_HK
dc.subject	Association rules	en_HK
dc.subject	Data mining	en_HK
dc.subject	FindLarge	en_HK
dc.subject	Lattice	en_HK
dc.subject	LGen	en_HK
dc.title	A lattice-based approach for I/O efficient association rule mining	en_HK
dc.type	Article	en_HK
dc.identifier.openurl	http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0306-4379&volume=27&issue=1&spage=41&epage=74&date=2002&atitle=A+Lattice-Based+Approach+for+I/O+Efficient+Association+Rule+Mining	en_HK
dc.identifier.email	Yip, CL:clyip@cs.hku.hk	en_HK
dc.identifier.email	Kao, B:kao@cs.hku.hk	en_HK
dc.identifier.email	Cheung, D:dcheung@cs.hku.hk	en_HK
dc.identifier.authority	Yip, CL=rp00205	en_HK
dc.identifier.authority	Kao, B=rp00123	en_HK
dc.identifier.authority	Cheung, D=rp00101	en_HK
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1016/S0306-4379(01)00046-1	en_HK
dc.identifier.scopus	eid_2-s2.0-0036498208	en_HK
dc.identifier.hkuros	67151	en_HK
dc.relation.references	http://www.scopus.com/mlt/select.url?eid=2-s2.0-0036498208&selection=ref&src=s&origin=recordpage	en_HK
dc.identifier.volume	27	en_HK
dc.identifier.issue	1	en_HK
dc.identifier.spage	41	en_HK
dc.identifier.epage	74	en_HK
dc.identifier.isi	WOS:000174193000003	-
dc.publisher.place	United Kingdom	en_HK
dc.identifier.scopusauthorid	Loo, KK=36793892100	en_HK
dc.identifier.scopusauthorid	Yip, CL=7101665547	en_HK
dc.identifier.scopusauthorid	Kao, B=35221592600	en_HK
dc.identifier.scopusauthorid	Cheung, D=34567902600	en_HK
dc.identifier.issnl	0306-4379	-
