Conference Paper: Effect of Data Skewness in Parallel Mining of Association Rules

Title: Effect of Data Skewness in Parallel Mining of Association Rules
Authors: Cheung, DWL; Xiao, Y
Keywords: Association Rules; Data Mining; Data Skewness; Parallel Computing
Issue Date: 1998
Publisher: Springer
Citation: The 2nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-98), Melbourne, Australia, 15-17 April 1998. In Wu, X, Kotagiri, R and Korb, KB (Eds). Pacific-Asia Conference on Knowledge Discovery and Data Mining, p. 48-60. Berlin, Heidelberg: Springer, 1998.
Abstract: An efficient parallel algorithm, FPM (Fast Parallel Mining), for mining association rules on a shared-nothing parallel system has been proposed. It adopts the count distribution approach and incorporates two powerful candidate pruning techniques: distributed pruning and global pruning. It has a simple communication scheme that performs only one round of message exchange in each iteration. We found that the two pruning techniques are very sensitive to data skewness, which describes the degree of non-uniformity of the itemset distribution among the database partitions. Distributed pruning is very effective when data skewness is high. Global pruning is more effective than distributed pruning even when data skewness is mild. We have implemented the algorithm on an IBM SP2 parallel machine. The performance studies confirm our observation on the relationship between the effectiveness of the two pruning techniques and data skewness. They also show that FPM consistently outperforms CD (Count Distribution), a parallel version of the popular Apriori algorithm [2, 3]. Furthermore, FPM exhibits good speedup, scaleup and sizeup.
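
Illustration (not part of the original record): the count distribution approach named in the abstract can be sketched roughly as follows. This is a minimal single-process sketch, not the FPM implementation. It assumes that each node counts candidate itemsets on its own partition and that the local counts are then summed in one exchange step, and it does not implement distributed or global pruning. The function names `local_counts` and `count_distribution_pass`, the toy partitions, and the support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, candidates):
    """Count, on one partition, how many transactions contain each candidate."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:  # candidate itemset is a subset of the transaction
                counts[cand] += 1
    return counts


def count_distribution_pass(partitions, candidates, min_support):
    """One iteration: every node counts locally, then the counts are summed once."""
    # Local work: each list entry stands in for one node of the parallel system.
    per_node = [local_counts(p, candidates) for p in partitions]
    # Summation stands in for the single round of count exchange per iteration.
    global_counts = Counter()
    for counts in per_node:
        global_counts.update(counts)
    # Keep only candidates whose global support reaches the threshold.
    return {c: n for c, n in global_counts.items() if n >= min_support}


# Toy, deliberately skewed data: {a, b} lives mostly in partition 0,
# {c, d} mostly in partition 1 (assumed values, for illustration only).
partitions = [
    [["a", "b"], ["a", "b", "c"], ["a", "b"]],
    [["c", "d"], ["c", "d"], ["a", "c", "d"]],
]
candidates = [frozenset(pair) for pair in combinations("abcd", 2)]
frequent = count_distribution_pass(partitions, candidates, min_support=3)
```

In this toy data each frequent pair gets all of its support from a single partition, which is the kind of non-uniform itemset distribution the abstract calls data skewness.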
Persistent Identifier: http://hdl.handle.net/10722/93133
ISBN: 978-3-540-64383-8
ISSN: 0302-9743
2023 SCImago Journal Rankings: 0.606

 

DC Field | Value | Language
dc.contributor.author | Cheung, DWL | en_HK
dc.contributor.author | Xiao, Y | en_HK
dc.date.accessioned | 2010-09-25T14:51:53Z | -
dc.date.available | 2010-09-25T14:51:53Z | -
dc.date.issued | 1998 | en_HK
dc.identifier.citation | The 2nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-98), Melbourne, Australia, 15-17 April 1998. In Wu, X, Kotagiri, R and Korb, KB (Eds). Pacific-Asia Conference on Knowledge Discovery and Data Mining, p. 48-60. Berlin, Heidelberg: Springer, 1998 | -
dc.identifier.isbn | 978-3-540-64383-8 | -
dc.identifier.issn | 0302-9743 | -
dc.identifier.uri | http://hdl.handle.net/10722/93133 | -
dc.description.abstract | An efficient parallel algorithm FPM(Fast Parallel Mining) for mining association rules on a shared-nothing parallel system has been proposed. It adopts the count distribution approach and has incorporated two powerful candidate pruning techniques, i.e., distributed pruning and global pruning. It has a simple communication scheme which performs only one round of message exchange in each iteration. We found that the two pruning techniques are very sensitive to data skewness, which describes the degree of non-uniformity of the itemset distribution among the database partitions. Distributed pruning is very effective when data skewness is high. Global pruning is more effective than distributed pruning even for the mild data skewness case. We have implemented the algorithm on an IBM SP2 parallel machine. The performance studies confirm our observation on the relationship between the effectiveness of the two pruning techniques and data skewness. It has also shown that FPM outperforms CD (Count Distribution) consistently, which is a parallel version of the popular Apriori algorithm [2, 3]. Furthermore, FPM has nice parallelism of speedup, scaleup and sizeup. | -
dc.language | eng | en_HK
dc.publisher | Springer. | -
dc.relation.ispartof | Pacific-Asia Conference on Knowledge Discovery and Data Mining | en_HK
dc.subject | Association Rules | -
dc.subject | Data Mining | -
dc.subject | Data Skewness | -
dc.subject | Parallel Computing | -
dc.title | Effect of Data Skewness in Parallel Mining of Association Rules | en_HK
dc.type | Conference_Paper | en_HK
dc.identifier.email | Cheung, DWL: dcheung@cs.hku.hk | en_HK
dc.identifier.authority | Cheung, DWL=rp00101 | en_HK
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1007/3-540-64383-4_5 | -
dc.identifier.scopus | eid_2-s2.0-84958976005 | -
dc.identifier.hkuros | 31077 | en_HK
dc.identifier.eissn | 1611-3349 | -
dc.identifier.issnl | 0302-9743 | -
