Links for fulltext (may require subscription):
- Publisher Website: 10.1109/ICDE.2005.96
- Scopus: eid_2-s2.0-28444491389
Citations:
- Scopus: 0
- Appears in Collections:
Conference Paper: On discovery of extremely low-dimensional clusters using semi-supervised projected clustering
Title | On discovery of extremely low-dimensional clusters using semi-supervised projected clustering |
---|---|
Authors | Yip, KY; Cheung, DW; Ng, MK |
Keywords | Computers; Computer engineering |
Issue Date | 2005 |
Publisher | IEEE, Computer Society. |
Citation | Proceedings - International Conference On Data Engineering, 2005, p. 329-340 |
Abstract | Recent studies suggest that projected clusters with extremely low dimensionality exist in many real datasets. A number of projected clustering algorithms have been proposed in the past several years, but few can identify clusters with dimensionality lower than 10% of the total number of dimensions, which are commonly found in some real datasets such as gene expression profiles. In this paper we propose a new algorithm that can accurately identify projected clusters with relevant dimensions as few as 5% of the total number of dimensions. It makes use of a robust objective function that combines object clustering and dimension selection into a single optimization problem. The algorithm can also utilize domain knowledge in the form of labeled objects and labeled dimensions to improve its clustering accuracy. We believe this is the first semi-supervised projected clustering algorithm. Both theoretical analysis and experimental results show that by using a small amount of input knowledge, possibly covering only a portion of the underlying classes, the new algorithm can be further improved to accurately detect clusters with only 1% of the dimensions being relevant. The algorithm is also useful in getting a target set of clusters when there are multiple possible groupings of the objects. © 2005 IEEE. |
Persistent Identifier | http://hdl.handle.net/10722/46596 |
ISSN | 1084-4627 (2023 SCImago Journal Rankings: 1.306) |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yip, KY | en_HK |
dc.contributor.author | Cheung, DW | en_HK |
dc.contributor.author | Ng, MK | en_HK |
dc.date.accessioned | 2007-10-30T06:53:48Z | - |
dc.date.available | 2007-10-30T06:53:48Z | - |
dc.date.issued | 2005 | en_HK |
dc.identifier.citation | Proceedings - International Conference On Data Engineering, 2005, p. 329-340 | en_HK |
dc.identifier.issn | 1084-4627 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/46596 | - |
dc.description.abstract | Recent studies suggest that projected clusters with extremely low dimensionality exist in many real datasets. A number of projected clustering algorithms have been proposed in the past several years, but few can identify clusters with dimensionality lower than 10% of the total number of dimensions, which are commonly found in some real datasets such as gene expression profiles. In this paper we propose a new algorithm that can accurately identify projected clusters with relevant dimensions as few as 5% of the total number of dimensions. It makes use of a robust objective function that combines object clustering and dimension selection into a single optimization problem. The algorithm can also utilize domain knowledge in the form of labeled objects and labeled dimensions to improve its clustering accuracy. We believe this is the first semi-supervised projected clustering algorithm. Both theoretical analysis and experimental results show that by using a small amount of input knowledge, possibly covering only a portion of the underlying classes, the new algorithm can be further improved to accurately detect clusters with only 1% of the dimensions being relevant. The algorithm is also useful in getting a target set of clusters when there are multiple possible groupings of the objects. © 2005 IEEE. | en_HK |
dc.format.extent | 280893 bytes | - |
dc.format.extent | 2254 bytes | - |
dc.format.extent | 6619 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | text/plain | - |
dc.format.mimetype | text/plain | - |
dc.language | eng | en_HK |
dc.publisher | IEEE, Computer Society. | en_HK |
dc.relation.ispartof | Proceedings - International Conference on Data Engineering | en_HK |
dc.rights | ©2005 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | - |
dc.subject | Computers | en_HK |
dc.subject | Computer engineering | en_HK |
dc.title | On discovery of extremely low-dimensional clusters using semi-supervised projected clustering | en_HK |
dc.type | Conference_Paper | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1084-4627&volume=&spage=329&epage=340&date=2005&atitle=On+discovery+of+extremely+low-dimensional+clusters+using+semi-supervised+projected+clustering | en_HK |
dc.identifier.email | Cheung, DW:dcheung@cs.hku.hk | en_HK |
dc.identifier.authority | Cheung, DW=rp00101 | en_HK |
dc.description.nature | published_or_final_version | en_HK |
dc.identifier.doi | 10.1109/ICDE.2005.96 | en_HK |
dc.identifier.scopus | eid_2-s2.0-28444491389 | en_HK |
dc.identifier.hkuros | 103214 | - |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-28444491389&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.spage | 329 | en_HK |
dc.identifier.epage | 340 | en_HK |
dc.publisher.place | United States | en_HK |
dc.identifier.scopusauthorid | Yip, KY=7101909946 | en_HK |
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK |
dc.identifier.scopusauthorid | Ng, MK=7202076432 | en_HK |
dc.identifier.citeulike | 2838608 | - |
dc.identifier.issnl | 1084-4627 | - |