
Conference Paper: On discovery of extremely low-dimensional clusters using semi-supervised projected clustering

Title: On discovery of extremely low-dimensional clusters using semi-supervised projected clustering
Authors: Yip, KY; Cheung, DW; Ng, MK
Keywords: Computers; Computer engineering
Issue Date: 2005
Publisher: IEEE, Computer Society.
Citation: Proceedings - International Conference On Data Engineering, 2005, p. 329-340
Abstract: Recent studies suggest that projected clusters with extremely low dimensionality exist in many real datasets. A number of projected clustering algorithms have been proposed in the past several years, but few can identify clusters with dimensionality lower than 10% of the total number of dimensions, which are commonly found in some real datasets such as gene expression profiles. In this paper we propose a new algorithm that can accurately identify projected clusters with relevant dimensions as few as 5% of the total number of dimensions. It makes use of a robust objective function that combines object clustering and dimension selection into a single optimization problem. The algorithm can also utilize domain knowledge in the form of labeled objects and labeled dimensions to improve its clustering accuracy. We believe this is the first semi-supervised projected clustering algorithm. Both theoretical analysis and experimental results show that by using a small amount of input knowledge, possibly covering only a portion of the underlying classes, the new algorithm can be further improved to accurately detect clusters with only 1% of the dimensions being relevant. The algorithm is also useful in getting a target set of clusters when there are multiple possible groupings of the objects. © 2005 IEEE.
Persistent Identifier: http://hdl.handle.net/10722/46596
ISSN: 1084-4627

 

DC Field | Value | Language
dc.contributor.author | Yip, KY | en_HK
dc.contributor.author | Cheung, DW | en_HK
dc.contributor.author | Ng, MK | en_HK
dc.date.accessioned | 2007-10-30T06:53:48Z | -
dc.date.available | 2007-10-30T06:53:48Z | -
dc.date.issued | 2005 | en_HK
dc.identifier.citation | Proceedings - International Conference On Data Engineering, 2005, p. 329-340 | en_HK
dc.identifier.issn | 1084-4627 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/46596 | -
dc.format.extent | 280893 bytes | -
dc.format.extent | 2254 bytes | -
dc.format.extent | 6619 bytes | -
dc.format.mimetype | application/pdf | -
dc.format.mimetype | text/plain | -
dc.format.mimetype | text/plain | -
dc.language | eng | en_HK
dc.publisher | IEEE, Computer Society. | en_HK
dc.relation.ispartof | Proceedings - International Conference on Data Engineering | en_HK
dc.rights | Creative Commons: Attribution 3.0 Hong Kong License | -
dc.rights | ©2005 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | en_HK
dc.subject | Computers | en_HK
dc.subject | Computer engineering | en_HK
dc.title | On discovery of extremely low-dimensional clusters using semi-supervised projected clustering | en_HK
dc.type | Conference_Paper | en_HK
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1084-4627&volume=&spage=329&epage=340&date=2005&atitle=On+discovery+of+extremely+low-dimensional+clusters+using+semi-supervised+projected+clustering | en_HK
dc.identifier.email | Cheung, DW: dcheung@cs.hku.hk | en_HK
dc.identifier.authority | Cheung, DW=rp00101 | en_HK
dc.description.nature | published_or_final_version | en_HK
dc.identifier.doi | 10.1109/ICDE.2005.96 | en_HK
dc.identifier.scopus | eid_2-s2.0-28444491389 | en_HK
dc.identifier.hkuros | 103214 | -
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-28444491389&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.spage | 329 | en_HK
dc.identifier.epage | 340 | en_HK
dc.publisher.place | United States | en_HK
dc.identifier.scopusauthorid | Yip, KY=7101909946 | en_HK
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK
dc.identifier.scopusauthorid | Ng, MK=7202076432 | en_HK
dc.identifier.citeulike | 2838608 | -
