Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1109/ICMLC.2004.1382245
- Scopus: eid_2-s2.0-6344285863
Conference Paper: Chinese unknown word identification as known word tagging
Title | Chinese unknown word identification as known word tagging |
---|---|
Authors | Fu, GH; Luke, KK |
Keywords | Chinese word segmentation; Known word tagging; Lexicalized HMMs; Unknown word identification |
Issue Date | 2004 |
Publisher | IEEE. |
Citation | 2004 International Conference on Machine Learning and Cybernetics, Shanghai, China, 26-29 August 2004. In Proceedings of 2004 International Conference on Machine Learning and Cybernetics, 2004, v. 4, p. 2612-2617 |
Abstract | This paper presents a tagging approach to Chinese unknown word identification based on lexicalized hidden Markov models (LHMMs). In this work, Chinese unknown word identification is represented as a tagging task on a sequence of known words by introducing word-formation patterns and part-of-speech information. Based on the lexicalized HMMs, a statistical tagger is developed to assign each known word a tag that indicates its pattern in forming a word and the part of speech of the formed word. Experimental results on the Peking University corpus indicate that lexicalization and the introduction of part-of-speech information both help unknown word identification. An experiment on the SIGHAN-PK open test data also shows that our system can achieve state-of-the-art performance. |
Persistent Identifier | http://hdl.handle.net/10722/47018 |
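
The abstract describes tagging each known word with its word-formation pattern and the part of speech of the word it helps form. The record does not list the tag inventory, so the following is only a minimal sketch of the decoding step under assumed labels: the pattern names `S`, `B`, `M`, `E`, the function `merge_tagged_known_words`, and the sample sentence with its part-of-speech labels are all hypothetical and are not taken from the paper. The statistical tagger itself (the lexicalized HMM) is not shown.

```python
from typing import List, Tuple

# Hypothetical pattern labels (assumed for illustration, not from the paper):
#   S = the known word stands alone as a complete word
#   B / M / E = the known word begins / continues / ends a longer word
Tagged = Tuple[str, str, str]  # (known word, pattern, POS of the formed word)


def merge_tagged_known_words(tagged: List[Tagged]) -> List[Tuple[str, str]]:
    """Rebuild (word, POS) pairs from a pattern-tagged known-word sequence."""
    words: List[Tuple[str, str]] = []
    buffer: List[str] = []  # known words collected for the word being formed
    for token, pattern, pos in tagged:
        if pattern in ("S", "B"):
            if buffer:
                raise ValueError(f"'{token}' starts a word inside another word")
            if pattern == "S":
                words.append((token, pos))
            else:
                buffer = [token]
        elif pattern in ("M", "E"):
            if not buffer:
                raise ValueError(f"'{token}' continues a word that never began")
            buffer.append(token)
            if pattern == "E":
                # The formed word takes the POS carried by its tags.
                words.append(("".join(buffer), pos))
                buffer = []
        else:
            raise ValueError(f"unknown pattern label: {pattern}")
    if buffer:
        raise ValueError("sequence ended inside an unfinished word")
    return words


# Illustrative input: three known single-character words that together form
# a word absent from the lexicon, followed by two ordinary known words.
sample = [
    ("王", "B", "nr"),
    ("小", "M", "nr"),
    ("明", "E", "nr"),
    ("喜欢", "S", "v"),
    ("音乐", "S", "n"),
]
result = merge_tagged_known_words(sample)
```

Here `result` holds the merged sequence in which the three single-character known words have been joined into one word. In this framing, identifying an unknown word reduces to predicting the pattern and part-of-speech tag of each known word and then merging as above.
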
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Fu, GH | en_HK |
dc.contributor.author | Luke, KK | en_HK |
dc.date.accessioned | 2007-10-30T07:04:20Z | - |
dc.date.available | 2007-10-30T07:04:20Z | - |
dc.date.issued | 2004 | en_HK |
dc.identifier.citation | 2004 International Conference on Machine Learning and Cybernetics, Shanghai, China, 26-29 August 2004. In Proceedings of 2004 International Conference on Machine Learning and Cybernetics, 2004, v. 4, p. 2612-2617 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/47018 | - |
dc.description.abstract | This paper presents a tagging approach to Chinese unknown word identification based on lexicalized hidden Markov models (LHMMs). In this work, Chinese unknown word identification is represented as a tagging task on a sequence of known words by introducing word-formation patterns and part-of-speech information. Based on the lexicalized HMMs, a statistical tagger is developed to assign each known word a tag that indicates its pattern in forming a word and the part of speech of the formed word. Experimental results on the Peking University corpus indicate that lexicalization and the introduction of part-of-speech information both help unknown word identification. An experiment on the SIGHAN-PK open test data also shows that our system can achieve state-of-the-art performance. | en_HK |
dc.format.extent | 431821 bytes | - |
dc.format.extent | 2213 bytes | - |
dc.format.extent | 2608 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | text/plain | - |
dc.format.mimetype | text/plain | - |
dc.language | eng | en_HK |
dc.publisher | IEEE. | en_HK |
dc.relation.ispartof | Proceedings of 2004 International Conference on Machine Learning and Cybernetics | en_HK |
dc.rights | ©2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | - |
dc.subject | Chinese word segmentation | en_HK |
dc.subject | Known word tagging | en_HK |
dc.subject | Lexicalized HMMs | en_HK |
dc.subject | Unknown word identification | en_HK |
dc.title | Chinese unknown word identification as known word tagging | en_HK |
dc.type | Conference_Paper | en_HK |
dc.identifier.email | Luke, KK:kkluke@hkusua.hku.hk | en_HK |
dc.identifier.authority | Luke, KK=rp01201 | en_HK |
dc.description.nature | published_or_final_version | en_HK |
dc.identifier.doi | 10.1109/ICMLC.2004.1382245 | - |
dc.identifier.scopus | eid_2-s2.0-6344285863 | en_HK |
dc.identifier.hkuros | 103505 | - |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-6344285863&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.volume | 4 | en_HK |
dc.identifier.spage | 2612 | en_HK |
dc.identifier.epage | 2617 | en_HK |
dc.identifier.scopusauthorid | Fu, GH=7202721096 | en_HK |
dc.identifier.scopusauthorid | Luke, KK=7003697439 | en_HK |