
Conference Paper: A continuous Putonghua recognizer

Title: A continuous Putonghua recognizer
Authors: Wong, PK; Chan, C
Issue Date: 1997
Publisher: IEEE.
Citation: The 13th International Conference on Digital Signal Processing, Santorini, Greece, 2-4 July 1997, v. 2, p. 889-892
Abstract: A multi-speaker continuous Putonghua recognizer has been developed, composed of 20 speaker-dependent recognizers as sub-systems. Each sub-system is a network of hidden Markov models (HMMnet) that models triphones as the fundamental speech units. Over 3 GB of training speech data have been collected from twenty native Putonghua speakers reading carefully designed texts intended to include all phone-to-phone transitions in Putonghua. For each unknown input utterance, a Viterbi path search over the HMMnet yields the best speech unit sequence, which is then passed to a language model for post-processing. The most suitable word sequence is determined by means of the bigram statistics of 470 word classes covering a vocabulary of over 80,000 words. An enrollment process is required for each new user: the most suitable of the 20 speaker-dependent sub-systems is selected according to their recognition performance on a small quantity of speech data collected from the user.
Persistent Identifier: http://hdl.handle.net/10722/45600
ISBN: 0-7803-4137-6
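
The enrollment step described in the abstract (choosing one of the speaker-dependent sub-systems by its recognition performance on a small sample from the new user) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how performance is measured, so the sketch assumes a unit-level error rate based on edit distance. The names `Recognizer`, `error_rate`, and `select_subsystem` are hypothetical, and each sub-system is stood in for by a plain callable.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in: a sub-system maps an utterance (any object)
# to the sequence of units it recognized.
Recognizer = Callable[[object], Sequence[str]]
# Enrollment data: (utterance, reference transcription) pairs.
Enrollment = List[Tuple[object, Sequence[str]]]


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between a reference and a hypothesis sequence."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(recognizer: Recognizer, enrollment: Enrollment) -> float:
    """Assumed metric: total edit errors divided by total reference units."""
    errors = sum(edit_distance(ref, recognizer(utt)) for utt, ref in enrollment)
    total = sum(len(ref) for _, ref in enrollment)
    return errors / total if total else 0.0


def select_subsystem(recognizers: Sequence[Recognizer],
                     enrollment: Enrollment) -> int:
    """Return the index of the sub-system with the lowest error rate."""
    rates = [error_rate(r, enrollment) for r in recognizers]
    return min(range(len(rates)), key=rates.__getitem__)


# Toy usage with two fake sub-systems and two enrollment utterances.
enrollment = [("u1", ["ni", "hao"]), ("u2", ["xie", "xie"])]
good = lambda utt: {"u1": ["ni", "hao"], "u2": ["xie", "xie"]}[utt]
poor = lambda utt: {"u1": ["ni", "gao"], "u2": ["xie"]}[utt]
best = select_subsystem([poor, good], enrollment)
```

In the toy usage, `best` is the index of the sub-system that reproduces the reference transcriptions most closely. A real system would run each of the 20 trained sub-systems on the user's recorded speech in place of the lookup tables used here.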

 

DC Field | Value | Language
dc.contributor.author | Wong, PK | en_HK
dc.contributor.author | Chan, C | en_HK
dc.date.accessioned | 2007-10-30T06:30:02Z | -
dc.date.available | 2007-10-30T06:30:02Z | -
dc.date.issued | 1997 | en_HK
dc.identifier.citation | The 13th International Conference on Digital Signal Processing, Santorini, Greece, 2-4 July 1997, v. 2, p. 889-892 | en_HK
dc.identifier.isbn | 0-7803-4137-6 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/45600 | -
dc.description.abstract | A multi-speaker continuous Putonghua recognizer has been developed, composed of 20 speaker-dependent recognizers as sub-systems. Each sub-system is a network of hidden Markov models (HMMnet) that models triphones as the fundamental speech units. Over 3 GB of training speech data have been collected from twenty native Putonghua speakers reading carefully designed texts intended to include all phone-to-phone transitions in Putonghua. For each unknown input utterance, a Viterbi path search over the HMMnet yields the best speech unit sequence, which is then passed to a language model for post-processing. The most suitable word sequence is determined by means of the bigram statistics of 470 word classes covering a vocabulary of over 80,000 words. An enrollment process is required for each new user: the most suitable of the 20 speaker-dependent sub-systems is selected according to their recognition performance on a small quantity of speech data collected from the user. | en_HK
dc.format.extent | 437347 bytes | -
dc.format.extent | 3669 bytes | -
dc.format.mimetype | application/pdf | -
dc.format.mimetype | text/plain | -
dc.language | eng | en_HK
dc.publisher | IEEE. | en_HK
dc.rights | ©1997 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | en_HK
dc.rights | Creative Commons: Attribution 3.0 Hong Kong License | -
dc.title | A continuous Putonghua recognizer | en_HK
dc.type | Conference_Paper | en_HK
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0-7803-4137-6&volume=2&spage=889&epage=892&date=1997&atitle=A+continuous+Putonghua+recognizer | en_HK
dc.description.nature | published_or_final_version | en_HK
dc.identifier.doi | 10.1109/ICDSP.1997.628503 | en_HK
dc.identifier.hkuros | 38209 | -
