Article: Parallel mining of outliers in large database

Title: Parallel mining of outliers in large database
Authors: Hung, E; Cheung, DW
Keywords: Data mining; Outlier detection; Parallel algorithm
Issue Date: 2002
Publisher: Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=0926-8782
Citation: Distributed And Parallel Databases, 2002, v. 12 n. 1, p. 5-26
Abstract: Data mining is a new, important and fast-growing database application. Outlier (exception) detection is one kind of data mining, which can be applied in a variety of areas such as monitoring credit card fraud and criminal activities in electronic commerce. With the ever-increasing size and number of attributes (dimensions) of databases, previously proposed detection methods for two dimensions are no longer applicable. The time complexity of the Nested-Loop (NL) algorithm (Knorr and Ng, in Proc. 24th VLDB, 1998) is linear in the dimensionality but quadratic in the dataset size, inducing an unacceptable cost for large datasets. A more efficient version (ENL) and its parallel version (PENL) are introduced. In theory, the performance improvement of PENL is linear in the number of processors, as shown in a performance comparison between ENL and PENL using the Bulk Synchronous Parallel (BSP) model. The improvement is further verified by experiments on an IBM 9076 SP2 parallel computer system. The results show that it is a very good choice to mine outliers in a low-cost cluster of workstations interconnected by a commodity communication network.
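The cost profile mentioned in the abstract (linear in the dimensionality, quadratic in the dataset size) can be illustrated with a minimal Python sketch of a naive nested-loop detector. This is not the ENL or PENL algorithm from the article, whose details are not given in this record. It assumes a distance-based notion of outlier (a point is an outlier if at most `max_neighbors` other points lie within `radius` of it); the function name, the parameters and the sample data are all hypothetical.

```python
import math


def nested_loop_outliers(points, radius, max_neighbors, candidates=None):
    """Return indices of points having at most `max_neighbors` other points
    within `radius`. Hypothetical sketch, not the article's ENL/PENL.

    Each distance costs O(k) for k dimensions and every candidate is
    compared against up to N points, so a full run is O(k * N^2).
    """
    if candidates is None:
        candidates = range(len(points))
    outliers = []
    for i in candidates:
        neighbors = 0
        for j, q in enumerate(points):
            if j != i and math.dist(points[i], q) <= radius:
                neighbors += 1
                if neighbors > max_neighbors:
                    break  # too many neighbours: point i is not an outlier
        else:
            # Inner loop finished without a break: point i is an outlier.
            outliers.append(i)
    return outliers


# Made-up sample data: four clustered points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
full = nested_loop_outliers(points, 2.0, 0)
# The outer loop can be split into independent slices of candidates.
first = nested_loop_outliers(points, 2.0, 0, range(0, 3))
second = nested_loop_outliers(points, 2.0, 0, range(3, 5))
print(full, first + second)  # prints: [4] [4]
```

Because each candidate is tested independently, the outer loop can be divided among workers, which is one plausible reason a speedup close to the number of processors is achievable. How PENL actually partitions the data and communicates between processors is not described here.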
Persistent Identifier: http://hdl.handle.net/10722/89005
ISSN: 0926-8782
2015 Impact Factor: 0.8
2015 SCImago Journal Rankings: 0.593
ISI Accession Number ID: WOS:000175855700001

 

DC Field | Value | Language
dc.contributor.author | Hung, E | en_HK
dc.contributor.author | Cheung, DW | en_HK
dc.date.accessioned | 2010-09-06T09:51:12Z | -
dc.date.available | 2010-09-06T09:51:12Z | -
dc.date.issued | 2002 | en_HK
dc.identifier.citation | Distributed And Parallel Databases, 2002, v. 12 n. 1, p. 5-26 | en_HK
dc.identifier.issn | 0926-8782 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/89005 | -
dc.description.abstract | Data mining is a new, important and fast-growing database application. Outlier (exception) detection is one kind of data mining, which can be applied in a variety of areas such as monitoring credit card fraud and criminal activities in electronic commerce. With the ever-increasing size and number of attributes (dimensions) of databases, previously proposed detection methods for two dimensions are no longer applicable. The time complexity of the Nested-Loop (NL) algorithm (Knorr and Ng, in Proc. 24th VLDB, 1998) is linear in the dimensionality but quadratic in the dataset size, inducing an unacceptable cost for large datasets. A more efficient version (ENL) and its parallel version (PENL) are introduced. In theory, the performance improvement of PENL is linear in the number of processors, as shown in a performance comparison between ENL and PENL using the Bulk Synchronous Parallel (BSP) model. The improvement is further verified by experiments on an IBM 9076 SP2 parallel computer system. The results show that it is a very good choice to mine outliers in a low-cost cluster of workstations interconnected by a commodity communication network. | en_HK
dc.language | eng | en_HK
dc.publisher | Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=0926-8782 | en_HK
dc.relation.ispartof | Distributed and Parallel Databases | en_HK
dc.subject | Data mining | en_HK
dc.subject | Outlier detection | en_HK
dc.subject | Parallel algorithm | en_HK
dc.title | Parallel mining of outliers in large database | en_HK
dc.type | Article | en_HK
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0926-8782&volume=12&spage=5&epage=26&date=2002&atitle=Parallel+Mining+of+Outliers+in+Large+Database | en_HK
dc.identifier.email | Cheung, DW: dcheung@cs.hku.hk | en_HK
dc.identifier.authority | Cheung, DW=rp00101 | en_HK
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1023/A:1015608814486 | en_HK
dc.identifier.scopus | eid_2-s2.0-0036644801 | en_HK
dc.identifier.hkuros | 70917 | en_HK
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-0036644801&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.volume | 12 | en_HK
dc.identifier.issue | 1 | en_HK
dc.identifier.spage | 5 | en_HK
dc.identifier.epage | 26 | en_HK
dc.identifier.isi | WOS:000175855700001 | -
dc.publisher.place | United States | en_HK
dc.identifier.scopusauthorid | Hung, E=7004256336 | en_HK
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK
