Conference Paper: Improved blog clustering through automated weighting of text blocks

Title: Improved blog clustering through automated weighting of text blocks
Authors: Li, HB; Ye, YM; Huang, JZ
Keywords: Blog; Auto-weighted; Clustering; Web mining
Issue Date: 2009
Publisher: IEEE. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000424
Citation: 2009 International Conference on Machine Learning and Cybernetics, Baoding, China, 12-15 July 2009, v. 3, p. 1586-1591
Abstract: This paper proposes a new clustering algorithm for blog data. Using the structural information of text blocks in blog data, we divide the features of blog data into three groups and extend the k-means clustering algorithm to automatically calculate a weight for each feature group during clustering. We introduce a new objective function with group weight variables and use the Lagrangian method to derive the formula for the group weights. This formula is added as a new step in the standard k-means iterative clustering process, so that the group weights are computed automatically from the distribution of the features. The new process guarantees convergence of the clustering process to a locally optimal solution. Experimental results show that the new algorithm performs better than k-means without group feature weighting on different blog data sets.
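The abstract does not state the objective function or the weight formula, so the following is only a minimal sketch of the general idea, under stated assumptions: the objective is taken to be the weighted within-cluster dispersion, with each group weight raised to an exponent beta > 1 and the weights constrained to sum to 1. Setting the Lagrangian's derivative to zero under that assumed objective gives w_g = 1 / sum_t (D_g / D_t)^(1/(beta-1)), where D_g is the within-cluster dispersion of feature group g. The function name, the parameter beta, and the layout of groups are hypothetical and are not taken from the paper.

```python
import numpy as np


def group_weighted_kmeans(X, groups, k, beta=2.0, n_iter=50, seed=0):
    """Sketch of k-means with one automatically computed weight per feature group.

    X      : (n_samples, n_features) array
    groups : list of feature-index lists, one per group (hypothetical layout)
    beta   : exponent on the group weights, must be > 1 (an assumption)
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Initialise centres from k distinct data points and start with equal weights.
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    weights = np.full(len(groups), 1.0 / len(groups))
    labels = None

    for _ in range(n_iter):
        # Step 1: assign each point to the nearest centre under the weighted distance.
        sq = (X[:, None, :] - centers[None, :, :]) ** 2  # shape (n, k, d)
        dist = np.zeros((n, k))
        for w, idx in zip(weights, groups):
            dist += (w ** beta) * sq[:, :, idx].sum(axis=2)
        new_labels = dist.argmin(axis=1)

        # Step 2: recompute each centre as its cluster mean (empty clusters keep theirs).
        for j in range(k):
            members = X[new_labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)

        # Step 3 (the added step): update group weights from per-group dispersion.
        resid = (X - centers[new_labels]) ** 2
        D = np.array([resid[:, idx].sum() for idx in groups])
        D = np.maximum(D, 1e-12)  # guard against division by zero
        ratio = (D[:, None] / D[None, :]) ** (1.0 / (beta - 1.0))
        weights = 1.0 / ratio.sum(axis=1)

        # Stop once the assignment no longer changes.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return new_labels, centers, weights
```

For blog data with three feature groups, groups would hold three index lists, one per text-block group. Under the assumed objective, each of the three steps minimises the objective with the other variables held fixed, so the objective never increases between iterations.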
Persistent Identifier: http://hdl.handle.net/10722/223758
ISBN: 9781424437023
ISI Accession Number ID: WOS:000281720400291

 

DC Field | Value | Language
dc.contributor.author | Li, HB | -
dc.contributor.author | Ye, YM | -
dc.contributor.author | Huang, JZ | -
dc.date.accessioned | 2016-03-14T07:43:53Z | -
dc.date.available | 2016-03-14T07:43:53Z | -
dc.date.issued | 2009 | -
dc.identifier.citation | 2009 International Conference on Machine Learning and Cybernetics, Baoding, China, 12-15 July 2009, v. 3, p. 1586-1591 | -
dc.identifier.isbn | 9781424437023 | -
dc.identifier.uri | http://hdl.handle.net/10722/223758 | -
dc.description.abstract | This paper proposes a new clustering algorithm for blog data. Using the structural information of text blocks in blog data, we divide the features of blog data into three groups and extend the k-means clustering algorithm to automatically calculate a weight for each feature group during clustering. We introduce a new objective function with group weight variables and use the Lagrangian method to derive the formula for the group weights. This formula is added as a new step in the standard k-means iterative clustering process, so that the group weights are computed automatically from the distribution of the features. The new process guarantees convergence of the clustering process to a locally optimal solution. Experimental results show that the new algorithm performs better than k-means without group feature weighting on different blog data sets. | -
dc.language | eng | -
dc.publisher | IEEE. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000424 | -
dc.relation.ispartof | Proceedings of the International Conference on Machine Learning and Cybernetics | -
dc.rights | ©2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | -
dc.subject | Blog | -
dc.subject | Auto-weighted | -
dc.subject | Clustering | -
dc.subject | Web mining | -
dc.title | Improved blog clustering through automated weighting of text blocks | -
dc.type | Conference_Paper | -
dc.identifier.email | Huang, JZ: jhuang@eti.hku.hk | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.1109/ICMLC.2009.5212352 | -
dc.identifier.scopus | eid_2-s2.0-70350741542 | -
dc.identifier.hkuros | 164902 | -
dc.identifier.volume | 3 | -
dc.identifier.spage | 1586 | -
dc.identifier.epage | 1591 | -
dc.identifier.isi | WOS:000281720400291 | -
dc.publisher.place | United States | -
