Links for fulltext (may require subscription)
- Publisher Website (DOI): 10.1109/ICMLC.2005.1526911
- Scopus: eid_2-s2.0-28444444555
Conference Paper: Chinese text chunking using lexicalized HMMs
Title | Chinese text chunking using lexicalized HMMs |
---|---|
Authors | Fu, GH; Xu, RF; Luke, KK; Lu, Q |
Keywords | Base phrase recognition; Base phrase structure; Lexicalized hidden Markov models (HMMs); Text chunking |
Issue Date | 2005 |
Publisher | IEEE. |
Citation | 2005 International Conference on Machine Learning and Cybernetics (ICMLC 2005), Guangzhou, China, 18-21 August 2005. In 2005 International Conference on Machine Learning and Cybernetics, 2005, p. 7-12 |
Abstract | This paper presents a lexicalized HMM-based approach to Chinese text chunking. To tackle the problem of unknown words, we formalize Chinese text chunking as a tagging task on a sequence of known words. To do this, we employ uniformly lexicalized HMMs and develop a lattice-based tagger to assign each known word a proper hybrid tag, which involves four types of information: word boundary, POS, chunk boundary and chunk type. In comparison with most previous approaches, our approach is able to integrate different features such as part-of-speech information, chunk-internal cues and contextual information for text chunking under the framework of HMMs. As a result, the performance of the system can be improved without losing its efficiency in training and tagging. Our preliminary experiments on the PolyU Shallow Treebank show that the use of the lexicalization technique can substantially improve the performance of an HMM-based chunking system. © 2005 IEEE. |
Persistent Identifier | http://hdl.handle.net/10722/54212 |
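
The abstract describes a hybrid tag that combines four kinds of information for each known word: word boundary, POS, chunk boundary and chunk type. The sketch below illustrates one way such a tag could be represented and how chunks could be read back from a tagged sequence. The `HybridTag` class, the B/I/E/S/O position codes, the `/`-separated string form and the `extract_chunks` helper are all assumptions made for illustration; this record does not give the paper's actual tag format or tagger.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridTag:
    """Hypothetical four-part tag; the field encoding is an assumption."""
    word_pos: str    # position of the known word inside a full word: B, I, E or S
    pos: str         # part-of-speech label, e.g. "n"
    chunk_pos: str   # position inside a chunk: B, I, E, S, or O for outside
    chunk_type: str  # chunk label, e.g. "NP"; "-" when outside any chunk

    def encode(self) -> str:
        # Join the four fields into one atomic tag string for an HMM tagger.
        return "/".join((self.word_pos, self.pos, self.chunk_pos, self.chunk_type))

    @staticmethod
    def decode(text: str) -> "HybridTag":
        # Split the atomic tag string back into its four fields.
        word_pos, pos, chunk_pos, chunk_type = text.split("/")
        return HybridTag(word_pos, pos, chunk_pos, chunk_type)


def extract_chunks(words, tags):
    """Group words into (chunk_type, [words]) spans using the chunk fields."""
    chunks, current = [], None
    for word, tag in zip(words, tags):
        if tag.chunk_pos in ("B", "S"):
            # A new chunk starts here.
            current = (tag.chunk_type, [word])
        elif tag.chunk_pos in ("I", "E") and current is not None:
            # Continue the chunk that is currently open.
            current[1].append(word)
        else:
            # Outside any chunk, or a continuation with no open chunk.
            current = None
            continue
        if tag.chunk_pos in ("E", "S"):
            # The chunk closes on this word.
            chunks.append(current)
            current = None
    return chunks


if __name__ == "__main__":
    words = ["w1", "w2", "w3", "w4"]
    tags = [HybridTag.decode(t) for t in
            ["S/a/B/NP", "S/n/E/NP", "S/v/O/-", "S/n/S/NP"]]
    print(extract_chunks(words, tags))
```

Treating the combined string as a single tag means a standard HMM tagger can predict all four kinds of information in one pass, which appears to be the motivation for the hybrid tag in the abstract.
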
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Fu, GH | en_HK |
dc.contributor.author | Xu, RF | en_HK |
dc.contributor.author | Luke, KK | en_HK |
dc.contributor.author | Lu, Q | en_HK |
dc.date.accessioned | 2009-04-03T07:39:49Z | - |
dc.date.available | 2009-04-03T07:39:49Z | - |
dc.date.issued | 2005 | en_HK |
dc.identifier.citation | 2005 International Conference on Machine Learning and Cybernetics (ICMLC 2005), Guangzhou, China, 18-21 August 2005. In 2005 International Conference on Machine Learning and Cybernetics, 2005, p. 7-12 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/54212 | - |
dc.description.abstract | This paper presents a lexicalized HMM-based approach to Chinese text chunking. To tackle the problem of unknown words, we formalize Chinese text chunking as a tagging task on a sequence of known words. To do this, we employ uniformly lexicalized HMMs and develop a lattice-based tagger to assign each known word a proper hybrid tag, which involves four types of information: word boundary, POS, chunk boundary and chunk type. In comparison with most previous approaches, our approach is able to integrate different features such as part-of-speech information, chunk-internal cues and contextual information for text chunking under the framework of HMMs. As a result, the performance of the system can be improved without losing its efficiency in training and tagging. Our preliminary experiments on the PolyU Shallow Treebank show that the use of the lexicalization technique can substantially improve the performance of an HMM-based chunking system. © 2005 IEEE. | en_HK |
dc.language | eng | en_HK |
dc.publisher | IEEE. | en_HK |
dc.relation.ispartof | 2005 International Conference on Machine Learning and Cybernetics | en_HK |
dc.rights | ©2005 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | - |
dc.subject | Base phrase recognition | en_HK |
dc.subject | Base phrase structure | en_HK |
dc.subject | Lexicalized hidden Markov models (HMMs) | en_HK |
dc.subject | Text chunking | en_HK |
dc.title | Chinese text chunking using lexicalized HMMs | en_HK |
dc.type | Conference_Paper | en_HK |
dc.identifier.email | Luke, KK:kkluke@hkusua.hku.hk | en_HK |
dc.identifier.authority | Luke, KK=rp01201 | en_HK |
dc.description.nature | published_or_final_version | en_HK |
dc.identifier.doi | 10.1109/ICMLC.2005.1526911 | - |
dc.identifier.scopus | eid_2-s2.0-28444444555 | en_HK |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-28444444555&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.spage | 7 | en_HK |
dc.identifier.epage | 12 | en_HK |
dc.identifier.scopusauthorid | Fu, GH=7202721096 | en_HK |
dc.identifier.scopusauthorid | Xu, RF=35520467000 | en_HK |
dc.identifier.scopusauthorid | Luke, KK=7003697439 | en_HK |
dc.identifier.scopusauthorid | Lu, Q=35242792400 | en_HK |