Links for fulltext
(May Require Subscription)
- Scopus: eid_2-s2.0-27144466086
- WOS: WOS:000083963300004
Article: Effect of data distribution in parallel mining of associations
Title | Effect of data distribution in parallel mining of associations |
---|---|
Authors | Cheung, DW; Xiao, Y |
Keywords | Association rules; Data mining; Data skewness; Parallel computing; Parallel mining; Workload balance |
Issue Date | 1999 |
Publisher | Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=1384-5810 |
Citation | Data Mining And Knowledge Discovery, 1999, v. 3 n. 3, p. 291-314 |
Abstract | Association rule mining is an important new problem in data mining, with crucial applications in decision support and marketing strategy. We propose FPM, an efficient parallel algorithm for mining association rules on a distributed shared-nothing parallel system. Its efficiency is attributed to the incorporation of two powerful candidate set pruning techniques. The two techniques, distributed pruning and global pruning, are sensitive to two data distribution characteristics: data skewness and workload balance. The prunings are very effective when both the skewness and the balance are high. We have implemented FPM on an IBM SP2 parallel system. The performance studies show that FPM consistently outperforms CD, a parallel version of the representative Apriori algorithm (Agrawal and Srikant, 1994). The results also validate our observation on the effectiveness of the two pruning techniques with respect to the data distribution characteristics. Furthermore, they show that FPM has good scalability and parallelism, which can be tuned for different business applications. © 1999 Kluwer Academic Publishers. |
Persistent Identifier | http://hdl.handle.net/10722/118227 |
ISSN | 1384-5810 (2023 Impact Factor: 2.8; 2023 SCImago Journal Rankings: 1.813) |
ISI Accession Number ID | WOS:000083963300004 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Cheung, DW | en_HK |
dc.contributor.author | Xiao, Y | en_HK |
dc.date.accessioned | 2010-09-26T07:55:11Z | - |
dc.date.available | 2010-09-26T07:55:11Z | - |
dc.date.issued | 1999 | en_HK |
dc.identifier.citation | Data Mining And Knowledge Discovery, 1999, v. 3 n. 3, p. 291-314 | en_HK |
dc.identifier.issn | 1384-5810 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/118227 | - |
dc.description.abstract | Association rule mining is an important new problem in data mining, with crucial applications in decision support and marketing strategy. We propose FPM, an efficient parallel algorithm for mining association rules on a distributed shared-nothing parallel system. Its efficiency is attributed to the incorporation of two powerful candidate set pruning techniques. The two techniques, distributed pruning and global pruning, are sensitive to two data distribution characteristics: data skewness and workload balance. The prunings are very effective when both the skewness and the balance are high. We have implemented FPM on an IBM SP2 parallel system. The performance studies show that FPM consistently outperforms CD, a parallel version of the representative Apriori algorithm (Agrawal and Srikant, 1994). The results also validate our observation on the effectiveness of the two pruning techniques with respect to the data distribution characteristics. Furthermore, they show that FPM has good scalability and parallelism, which can be tuned for different business applications. © 1999 Kluwer Academic Publishers. | en_HK |
dc.language | eng | en_HK |
dc.publisher | Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=1384-5810 | en_HK |
dc.relation.ispartof | Data Mining and Knowledge Discovery | en_HK |
dc.subject | Association rules | en_HK |
dc.subject | Data mining | en_HK |
dc.subject | Data skewness | en_HK |
dc.subject | Parallel computing | en_HK |
dc.subject | Parallel mining | en_HK |
dc.subject | Workload balance | en_HK |
dc.title | Effect of data distribution in parallel mining of associations | en_HK |
dc.type | Article | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1384-5810&volume=&spage=&epage=&date=1999&atitle=Effect+of+Data+Distribution+in+Parallel+Mining+of+Associations | en_HK |
dc.identifier.email | Cheung, DW:dcheung@cs.hku.hk | en_HK |
dc.identifier.authority | Cheung, DW=rp00101 | en_HK |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.scopus | eid_2-s2.0-27144466086 | en_HK |
dc.identifier.hkuros | 47948 | en_HK |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-27144466086&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.volume | 3 | en_HK |
dc.identifier.issue | 3 | en_HK |
dc.identifier.spage | 291 | en_HK |
dc.identifier.epage | 314 | en_HK |
dc.identifier.isi | WOS:000083963300004 | - |
dc.publisher.place | United States | en_HK |
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK |
dc.identifier.scopusauthorid | Xiao, Y=22735880100 | en_HK |
dc.identifier.issnl | 1384-5810 | - |