Conference Paper: Effect of Data Skewness in Parallel Mining of Association Rules
Title | Effect of Data Skewness in Parallel Mining of Association Rules |
---|---|
Authors | Cheung, DWL; Xiao, Y |
Keywords | Association Rules Data Mining Data Skewness Parallel Computing |
Issue Date | 1998 |
Publisher | Springer |
Citation | The 2nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-98), Melbourne, Australia, 15-17 April 1998. In Wu, X, Kotagiri, R and Korb, KB (Eds). Pacific-Asia Conference on Knowledge Discovery and Data Mining, p. 48-60. Berlin, Heidelberg: Springer, 1998 |
Abstract | An efficient parallel algorithm, FPM (Fast Parallel Mining), has been proposed for mining association rules on a shared-nothing parallel system. It adopts the count distribution approach and incorporates two powerful candidate pruning techniques: distributed pruning and global pruning. Its communication scheme is simple, requiring only one round of message exchange per iteration. We found that the two pruning techniques are very sensitive to data skewness, which describes the degree of non-uniformity of the itemset distribution among the database partitions. Distributed pruning is very effective when data skewness is high. Global pruning is more effective than distributed pruning even when data skewness is mild. We have implemented the algorithm on an IBM SP2 parallel machine. The performance studies confirm our observation on the relationship between the effectiveness of the two pruning techniques and data skewness. They also show that FPM consistently outperforms CD (Count Distribution), a parallel version of the popular Apriori algorithm [2, 3]. Furthermore, FPM exhibits good parallelism in terms of speedup, scaleup and sizeup. |
Persistent Identifier | http://hdl.handle.net/10722/93133 |
ISBN | 978-3-540-64383-8 |
ISSN | 0302-9743 (2023 SCImago Journal Rankings: 0.606) |
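
The abstract names the count distribution approach and pruning that is sensitive to data skewness, but it does not spell out the rules themselves. The following is a minimal sketch, not the authors' implementation: it assumes each partition counts candidates locally and the counts are summed in a single exchange per iteration, and it assumes one plausible form of skew-aware pruning, namely an upper bound on a candidate's global count built from the per-partition counts of its subsets. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, candidates):
    """Count how many transactions in one partition contain each candidate."""
    counts = Counter()
    for transaction in partition:
        for cand in candidates:
            if cand <= transaction:
                counts[cand] += 1
    return counts


def count_distribution_pass(partitions, candidates, min_support):
    """One iteration: local counting, then one exchange (modelled as a sum)."""
    per_site = [local_counts(p, candidates) for p in partitions]
    total = Counter()
    for site in per_site:
        total.update(site)
    threshold = min_support * sum(len(p) for p in partitions)
    frequent = {c for c in candidates if total[c] >= threshold}
    return frequent, per_site


def support_upper_bound(candidate, per_site_prev):
    """Assumed pruning bound: at each site a candidate cannot occur more often
    than its rarest (k-1)-subset, so summing those minima bounds its global count."""
    subsets = [frozenset(s) for s in combinations(candidate, len(candidate) - 1)]
    return sum(min(site[s] for s in subsets) for site in per_site_prev)


# Toy, highly skewed data: items a, b live on one site and c, d on the other.
site_a = [{"a", "b"}, {"a", "b"}, {"a"}, {"a", "b"}]
site_b = [{"c", "d"}, {"c", "d"}, {"c"}, {"d"}]
singles = [frozenset([x]) for x in "abcd"]

frequent_1, site_counts_1 = count_distribution_pass([site_a, site_b], singles, 0.25)
bound_ac = support_upper_bound(frozenset("ac"), site_counts_1)  # 0
bound_ab = support_upper_bound(frozenset("ab"), site_counts_1)  # 3
```

With a threshold of 2 transactions (0.25 of 8), all four single items are globally frequent, so plain Apriori candidate generation would keep {a, c}. The per-site bound for {a, c} is 0, so it can be discarded before counting, while {a, b} (bound 3) survives. On uniformly distributed data the per-site minima would be close to the global counts and the bound would discard far fewer candidates, which is consistent with the sensitivity to skewness that the abstract reports.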
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Cheung, DWL | en_HK |
dc.contributor.author | Xiao, Y | en_HK |
dc.date.accessioned | 2010-09-25T14:51:53Z | - |
dc.date.available | 2010-09-25T14:51:53Z | - |
dc.date.issued | 1998 | en_HK |
dc.identifier.citation | The 2nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-98), Melbourne, Australia, 15-17 April 1998. In Wu, X, Kotagiri, R and Korb, KB (Eds). Pacific-Asia Conference on Knowledge Discovery and Data Mining, p. 48-60. Berlin, Heidelberg: Springer, 1998 | - |
dc.identifier.isbn | 978-3-540-64383-8 | - |
dc.identifier.issn | 0302-9743 | - |
dc.identifier.uri | http://hdl.handle.net/10722/93133 | - |
dc.description.abstract | An efficient parallel algorithm, FPM (Fast Parallel Mining), has been proposed for mining association rules on a shared-nothing parallel system. It adopts the count distribution approach and incorporates two powerful candidate pruning techniques: distributed pruning and global pruning. Its communication scheme is simple, requiring only one round of message exchange per iteration. We found that the two pruning techniques are very sensitive to data skewness, which describes the degree of non-uniformity of the itemset distribution among the database partitions. Distributed pruning is very effective when data skewness is high. Global pruning is more effective than distributed pruning even when data skewness is mild. We have implemented the algorithm on an IBM SP2 parallel machine. The performance studies confirm our observation on the relationship between the effectiveness of the two pruning techniques and data skewness. They also show that FPM consistently outperforms CD (Count Distribution), a parallel version of the popular Apriori algorithm [2, 3]. Furthermore, FPM exhibits good parallelism in terms of speedup, scaleup and sizeup. | - |
dc.language | eng | en_HK |
dc.publisher | Springer | - |
dc.relation.ispartof | Pacific-Asia Conference on Knowledge Discovery and Data Mining | en_HK |
dc.subject | Association Rules | - |
dc.subject | Data Mining | - |
dc.subject | Data Skewness | - |
dc.subject | Parallel Computing | - |
dc.title | Effect of Data Skewness in Parallel Mining of Association Rules | en_HK |
dc.type | Conference_Paper | en_HK |
dc.identifier.email | Cheung, DWL: dcheung@cs.hku.hk | en_HK |
dc.identifier.authority | Cheung, DWL=rp00101 | en_HK |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1007/3-540-64383-4_5 | - |
dc.identifier.scopus | eid_2-s2.0-84958976005 | - |
dc.identifier.hkuros | 31077 | en_HK |
dc.identifier.eissn | 1611-3349 | - |
dc.identifier.issnl | 0302-9743 | - |