Article: A discriminative and semantic feature selection method for text categorization

Title: A discriminative and semantic feature selection method for text categorization
Authors: Zong, W; Wu, F; Chu, LK; Sculli, D
Issue Date: 2015
Citation: International Journal of Production Economics (In press), 2015
Abstract: Text categorization is an important and critical task in the current era of high volume data storage and handling. Feature selection is obviously one of the most important steps in text categorization. Traditional feature selection methods tend to only consider the correlation between features and categories, and have in the main ignored the semantic similarity between features and documents. To further explore this issue, this paper proposes a novel feature selection method that first selects features in documents with discriminative power and then computes the semantic similarity between features and documents. The proposed feature selection method is tested using a support vector machine (SVM) classifier upon two published datasets, viz. Reuters-21578 and 20-Newsgroups. The experimental results show that the proposed feature selection method generally outperforms the traditional feature selection methods for text categorization for both published datasets.
Persistent Identifier: http://hdl.handle.net/10722/208209

 

DC Field | Value | Language
dc.contributor.author | Zong, W | en_US
dc.contributor.author | Wu, F | en_US
dc.contributor.author | Chu, LK | en_US
dc.contributor.author | Sculli, D | en_US
dc.date.accessioned | 2015-02-23T08:07:57Z | -
dc.date.available | 2015-02-23T08:07:57Z | -
dc.date.issued | 2015 | en_US
dc.identifier.citation | International Journal of Production Economics (In press), 2015 | en_US
dc.identifier.uri | http://hdl.handle.net/10722/208209 | -
dc.description.abstract | Text categorization is an important and critical task in the current era of high volume data storage and handling. Feature selection is obviously one of the most important steps in text categorization. Traditional feature selection methods tend to only consider the correlation between features and categories, and have in the main ignored the semantic similarity between features and documents. To further explore this issue, this paper proposes a novel feature selection method that first selects features in documents with discriminative power and then computes the semantic similarity between features and documents. The proposed feature selection method is tested using a support vector machine (SVM) classifier upon two published datasets, viz. Reuters-21578 and 20-Newsgroups. The experimental results show that the proposed feature selection method generally outperforms the traditional feature selection methods for text categorization for both published datasets. | en_US
dc.language | eng | en_US
dc.relation.ispartof | International Journal of Production Economics | en_US
dc.title | A discriminative and semantic feature selection method for text categorization | en_US
dc.type | Article | en_US
dc.identifier.email | Chu, LK: lkchu@hkucc.hku.hk | en_US
dc.identifier.email | Sculli, D: hreidsc@hkucc.hku.hk | en_US
dc.identifier.authority | Chu, LK=rp00113 | en_US
dc.identifier.doi | 10.1016/j.ijpe.2014.12.035 | en_US
dc.identifier.hkuros | 242307 | en_US
