Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1023/A:1015608814486
- Scopus: eid_2-s2.0-0036644801
- WOS: WOS:000175855700001
Article: Parallel mining of outliers in large database
Title | Parallel mining of outliers in large database |
---|---|
Authors | Hung, E; Cheung, DW |
Keywords | Data mining Outlier detection Parallel algorithm |
Issue Date | 2002 |
Publisher | Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=0926-8782 |
Citation | Distributed and Parallel Databases, 2002, v. 12 n. 1, p. 5-26 |
Abstract | Data mining is a new, important, and fast-growing database application. Outlier (exception) detection is one kind of data mining, which can be applied in a variety of areas such as monitoring credit card fraud and criminal activities in electronic commerce. With the ever-increasing size and number of attributes (dimensions) of databases, previously proposed detection methods for two dimensions are no longer applicable. The time complexity of the Nested-Loop (NL) algorithm (Knorr and Ng, in Proc. 24th VLDB, 1998) is linear in the dimensionality but quadratic in the dataset size, inducing an unacceptable cost for large datasets. A more efficient version (ENL) and its parallel version (PENL) are introduced. In theory, the performance improvement of PENL is linear in the number of processors, as shown in a performance comparison between ENL and PENL using the Bulk Synchronous Parallel (BSP) model. The improvement is further verified by experiments on an IBM 9076 SP2 parallel computer system. The results show that it is a very good choice to mine outliers in a low-cost cluster of workstations interconnected by a commodity communication network. |
Persistent Identifier | http://hdl.handle.net/10722/89005 |
ISSN | 0926-8782 (2023 Impact Factor: 1.5; 2023 SCImago Journal Rankings: 0.442) |
ISI Accession Number ID | WOS:000175855700001 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hung, E | en_HK |
dc.contributor.author | Cheung, DW | en_HK |
dc.date.accessioned | 2010-09-06T09:51:12Z | - |
dc.date.available | 2010-09-06T09:51:12Z | - |
dc.date.issued | 2002 | en_HK |
dc.identifier.citation | Distributed and Parallel Databases, 2002, v. 12 n. 1, p. 5-26 | en_HK |
dc.identifier.issn | 0926-8782 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/89005 | - |
dc.description.abstract | Data mining is a new, important, and fast-growing database application. Outlier (exception) detection is one kind of data mining, which can be applied in a variety of areas such as monitoring credit card fraud and criminal activities in electronic commerce. With the ever-increasing size and number of attributes (dimensions) of databases, previously proposed detection methods for two dimensions are no longer applicable. The time complexity of the Nested-Loop (NL) algorithm (Knorr and Ng, in Proc. 24th VLDB, 1998) is linear in the dimensionality but quadratic in the dataset size, inducing an unacceptable cost for large datasets. A more efficient version (ENL) and its parallel version (PENL) are introduced. In theory, the performance improvement of PENL is linear in the number of processors, as shown in a performance comparison between ENL and PENL using the Bulk Synchronous Parallel (BSP) model. The improvement is further verified by experiments on an IBM 9076 SP2 parallel computer system. The results show that it is a very good choice to mine outliers in a low-cost cluster of workstations interconnected by a commodity communication network. | en_HK |
dc.language | eng | en_HK |
dc.publisher | Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=0926-8782 | en_HK |
dc.relation.ispartof | Distributed and Parallel Databases | en_HK |
dc.subject | Data mining | en_HK |
dc.subject | Outlier detection | en_HK |
dc.subject | Parallel algorithm | en_HK |
dc.title | Parallel mining of outliers in large database | en_HK |
dc.type | Article | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0926-8782&volume=12&spage=5&epage=26&date=2002&atitle=Parallel+Mining+of+Outliers+in+Large+Database | en_HK |
dc.identifier.email | Cheung, DW:dcheung@cs.hku.hk | en_HK |
dc.identifier.authority | Cheung, DW=rp00101 | en_HK |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1023/A:1015608814486 | en_HK |
dc.identifier.scopus | eid_2-s2.0-0036644801 | en_HK |
dc.identifier.hkuros | 70917 | en_HK |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-0036644801&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.volume | 12 | en_HK |
dc.identifier.issue | 1 | en_HK |
dc.identifier.spage | 5 | en_HK |
dc.identifier.epage | 26 | en_HK |
dc.identifier.isi | WOS:000175855700001 | - |
dc.publisher.place | United States | en_HK |
dc.identifier.scopusauthorid | Hung, E=7004256336 | en_HK |
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK |
dc.identifier.issnl | 0926-8782 | - |
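
The abstract notes that a nested-loop approach to outlier detection costs time linear in the dimensionality but quadratic in the dataset size. As a rough illustration of where that cost comes from, the sketch below runs a brute-force pairwise comparison. It is not the paper's ENL or PENL algorithm. It assumes the distance-based definition usually attributed to Knorr and Ng, under which a point is an outlier if at least a fraction `p` of the other points lie farther than `D` from it. The function name, the choice to exclude the point itself from its own neighbour count, and the sample data are all hypothetical.

```python
import math


def nested_loop_outliers(points, p, D):
    """Return indices of distance-based outliers by brute-force comparison.

    Assumption: a point is an outlier when at most floor(n * (1 - p)) of the
    other points lie within distance D of it.
    """
    n = len(points)
    # Largest neighbour count a point may have and still be an outlier.
    max_neighbours = math.floor(n * (1 - p))
    outliers = []
    for i, a in enumerate(points):          # outer loop: n iterations
        neighbours = 0
        for j, b in enumerate(points):      # inner loop: up to n iterations
            if i == j:
                continue
            # Each distance costs time linear in the number of dimensions.
            if math.dist(a, b) <= D:
                neighbours += 1
                if neighbours > max_neighbours:
                    break                   # too many neighbours: not an outlier
        else:
            # Inner loop finished without exceeding the neighbour limit.
            outliers.append(i)
    return outliers


# Hypothetical data: four clustered points and one isolated point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(nested_loop_outliers(sample, p=0.5, D=2.0))  # [4]
```

Because each outer-loop iteration is independent of the others, the outer loop could in principle be partitioned across processors, which is consistent with the linear speedup the abstract claims for the parallel version.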