Conference Paper: Ergodic multigram HMM integrating word segmentation and class tagging for Chinese language modeling
Title | Ergodic multigram HMM integrating word segmentation and class tagging for Chinese language modeling |
---|---|
Authors | Law, HHC; Chan, C |
Keywords | Engineering Electrical engineering |
Issue Date | 1996 |
Publisher | IEEE. |
Citation | IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, Atlanta, GA, USA, 7-10 May 1996, v. 1, p. 196-199 |
Abstract | A novel ergodic multigram hidden Markov model (HMM) is introduced that models sentence production as a doubly stochastic process: word classes are first produced according to a first-order Markov model, and single- or multi-character words are then generated independently from those word classes, with no word boundaries marked in the sentence. The model can be applied to languages without word boundary markers, such as Chinese. Given a lexicon containing the syntactic classes of each word, its applications include language modeling for recognizers and integrated word segmentation and class tagging. No pre-segmented or tagged corpus is needed for training, and segmentation and tagging are both trained in a single model. This paper presents the relevant algorithms for the model and reports experimental results on a Chinese news corpus. |
Persistent Identifier | http://hdl.handle.net/10722/45536 |
ISSN | 1520-6149 |
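
The abstract describes a two-stage generative process: classes follow a first-order Markov chain, and each class emits a word of one or more characters. As a rough illustration only, the sketch below decodes the most likely joint segmentation and class tagging of an unsegmented string with a Viterbi-style search over word end positions. This record does not reproduce the paper's algorithms, so this is a generic sketch and not the authors' method; the function name `viterbi_segment_tag`, the start symbol `<s>`, the `max_len` limit, the class labels, and every probability in `TRANS` and `EMIT` are hypothetical values invented for the example.

```python
import math

# Hypothetical toy model: class transition probabilities P(class | previous class)
TRANS = {
    "<s>": {"N": 0.5, "V": 0.5},
    "N": {"N": 0.4, "V": 0.6},
    "V": {"N": 0.8, "V": 0.2},
}

# Hypothetical toy lexicon: word emission probabilities P(word | class)
EMIT = {
    "N": {"研究": 0.3, "研究生": 0.2, "生命": 0.4, "命": 0.1},
    "V": {"研究": 0.9, "生": 0.1},
}


def viterbi_segment_tag(sentence, trans, emit, start="<s>", max_len=4):
    """Return the most likely [(word, class), ...] parse, or None if none exists."""
    # best[t][c] = (log-probability, backpointer) of the best parse of
    # sentence[:t] whose last word belongs to class c.
    best = [dict() for _ in range(len(sentence) + 1)]
    best[0][start] = (0.0, None)

    for t in range(1, len(sentence) + 1):
        # Try every candidate word that ends at position t.
        for length in range(1, min(max_len, t) + 1):
            word = sentence[t - length:t]
            for prev_cls, (prev_lp, _) in best[t - length].items():
                for cls, p_trans in trans.get(prev_cls, {}).items():
                    p_emit = emit.get(cls, {}).get(word, 0.0)
                    if p_trans <= 0.0 or p_emit <= 0.0:
                        continue
                    lp = prev_lp + math.log(p_trans) + math.log(p_emit)
                    if cls not in best[t] or lp > best[t][cls][0]:
                        best[t][cls] = (lp, (t - length, prev_cls, word))

    if not best[-1]:
        return None  # no segmentation is covered by the lexicon

    # Follow backpointers from the best final class to recover the parse.
    cls = max(best[-1], key=lambda c: best[-1][c][0])
    t = len(sentence)
    path = []
    while t > 0:
        _, (prev_t, prev_cls, word) = best[t][cls]
        path.append((word, cls))
        t, cls = prev_t, prev_cls
    path.reverse()
    return path


print(viterbi_segment_tag("研究生命", TRANS, EMIT))
```

With these toy numbers the search prefers splitting the string into 研究 (tagged V) and 生命 (tagged N) over the competing split 研究生 + 命, because segmentation and tagging are scored jointly in one pass. Training the probabilities from unsegmented text, which the abstract also mentions, is not covered by this sketch.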
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Law, HHC | en_HK |
dc.contributor.author | Chan, C | en_HK |
dc.date.accessioned | 2007-10-30T06:28:42Z | - |
dc.date.available | 2007-10-30T06:28:42Z | - |
dc.date.issued | 1996 | en_HK |
dc.identifier.citation | IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, Atlanta, GA, USA, 7-10 May 1996, v. 1, p. 196-199 | en_HK |
dc.identifier.issn | 1520-6149 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/45536 | - |
dc.description.abstract | A novel ergodic multigram hidden Markov model (HMM) is introduced that models sentence production as a doubly stochastic process: word classes are first produced according to a first-order Markov model, and single- or multi-character words are then generated independently from those word classes, with no word boundaries marked in the sentence. The model can be applied to languages without word boundary markers, such as Chinese. Given a lexicon containing the syntactic classes of each word, its applications include language modeling for recognizers and integrated word segmentation and class tagging. No pre-segmented or tagged corpus is needed for training, and segmentation and tagging are both trained in a single model. This paper presents the relevant algorithms for the model and reports experimental results on a Chinese news corpus. | en_HK |
dc.format.extent | 396352 bytes | - |
dc.format.extent | 3669 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | text/plain | - |
dc.language | eng | en_HK |
dc.publisher | IEEE. | en_HK |
dc.rights | ©1996 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | - |
dc.subject | Engineering | en_HK |
dc.subject | Electrical engineering | en_HK |
dc.title | Ergodic multigram HMM integrating word segmentation and class tagging for Chinese language modeling | en_HK |
dc.type | Conference_Paper | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1520-6149&volume=1&spage=196&epage=199&date=1996&atitle=Ergodic+multigram+HMM+integrating+word+segmentation+and+classtagging+for+Chinese+language+modeling | en_HK |
dc.description.nature | published_or_final_version | en_HK |
dc.identifier.doi | 10.1109/ICASSP.1996.540324 | en_HK |
dc.identifier.hkuros | 10594 | - |
dc.identifier.issnl | 1520-6149 | - |