Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1016/S0167-6393(01)00011-5
- Scopus: eid_2-s2.0-0036642777
- WOS: WOS:000176314600003
Article: Use of voicing features in HMM-based speech recognition
Title | Use of voicing features in HMM-based speech recognition |
---|---|
Authors | Thomson, DL; Chengalvarayan, R |
Keywords | Autocorrelation Function; Cepstral Mean Subtraction; Discriminative Training; Hidden Markov Models; Hierarchical Signal Bias Removal; Jitter; Periodicity; Speech Recognition Features; Voiced And Unvoiced Speech; Voicing |
Issue Date | 2002 |
Publisher | Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/specom |
Citation | Speech Communication, 2002, v. 37 n. 3-4, p. 197-211 |
Abstract | We investigate speech recognition features related to voicing functions that indicate whether the vocal folds are vibrating. We describe two voicing features, periodicity and jitter, and demonstrate that they are powerful voicing discriminators. The periodicity and jitter features and their first and second time derivatives are appended to a standard 38-dimensional feature vector comprising the first and second time derivatives of the frame energy and the cepstral coefficients with their first and second time derivatives. HMM-based connected-digit (CD) and large-vocabulary (LV) recognition experiments comparing the traditional and extended feature sets show that voicing features and spectral information are complementary and that improved speech recognition performance is obtained by combining the two sources of information. We further conclude that the difference in performance with and without voicing becomes more significant when minimum string error (MSE) training is used than when maximum likelihood (ML) training is used. © 2002 Elsevier Science B.V. All rights reserved. |
Persistent Identifier | http://hdl.handle.net/10722/178772 |
ISSN | 0167-6393 (2023 Impact Factor: 2.4; 2023 SCImago Journal Rankings: 0.769) |
ISI Accession Number ID | WOS:000176314600003 |
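
The abstract describes appending voicing features and their first and second time derivatives to a 38-dimensional feature vector. The following is a minimal Python sketch of that idea, not the authors' implementation: this record does not give the paper's definitions, so the periodicity measure here (peak of the normalized autocorrelation over a lag range) is an assumption suggested only by the keyword "Autocorrelation Function". The function names, the lag range, and the synthetic test signals are hypothetical, and jitter is not sketched.

```python
import math
import random


def periodicity(frame, min_lag, max_lag):
    """Return (peak value, lag) of the normalized autocorrelation.

    Assumed definition for illustration only; the paper's exact
    periodicity feature is not given in this record.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0, min_lag
    best_value, best_lag = -1.0, min_lag
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, normalized by the frame energy.
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        value = r / energy
        if value > best_value:
            best_value, best_lag = value, lag
    return best_value, best_lag


def extended_dimension(base_dim=38, n_voicing=2):
    """Each voicing feature adds itself plus its first and second derivatives."""
    return base_dim + 3 * n_voicing


# Hypothetical test signals: a 100 Hz sinusoid sampled at 8 kHz stands in
# for voiced speech, and uniform noise stands in for unvoiced speech.
sample_rate = 8000
voiced = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(400)]
rng = random.Random(0)
unvoiced = [rng.uniform(-1.0, 1.0) for _ in range(400)]

voiced_value, voiced_lag = periodicity(voiced, 20, 160)
unvoiced_value, _ = periodicity(unvoiced, 20, 160)

print(voiced_lag, round(voiced_value, 3), round(unvoiced_value, 3))
print(extended_dimension())
```

Under these assumptions the sinusoid peaks at a lag of one pitch period (80 samples) with a much higher value than the noise frame, and the extended vector has 38 + 2 × 3 = 44 dimensions.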
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Thomson, DL | en_US |
dc.contributor.author | Chengalvarayan, R | en_US |
dc.date.accessioned | 2012-12-19T09:49:39Z | - |
dc.date.available | 2012-12-19T09:49:39Z | - |
dc.date.issued | 2002 | en_US |
dc.identifier.citation | Speech Communication, 2002, v. 37 n. 3-4, p. 197-211 | en_US |
dc.identifier.issn | 0167-6393 | en_US |
dc.identifier.uri | http://hdl.handle.net/10722/178772 | - |
dc.description.abstract | We investigate speech recognition features related to voicing functions that indicate whether the vocal folds are vibrating. We describe two voicing features, periodicity and jitter, and demonstrate that they are powerful voicing discriminators. The periodicity and jitter features and their first and second time derivatives are appended to a standard 38-dimensional feature vector comprising the first and second time derivatives of the frame energy and the cepstral coefficients with their first and second time derivatives. HMM-based connected-digit (CD) and large-vocabulary (LV) recognition experiments comparing the traditional and extended feature sets show that voicing features and spectral information are complementary and that improved speech recognition performance is obtained by combining the two sources of information. We further conclude that the difference in performance with and without voicing becomes more significant when minimum string error (MSE) training is used than when maximum likelihood (ML) training is used. © 2002 Elsevier Science B.V. All rights reserved. | en_US |
dc.language | eng | en_US |
dc.publisher | Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/specom | en_US |
dc.relation.ispartof | Speech Communication | en_US |
dc.subject | Autocorrelation Function | en_US |
dc.subject | Cepstral Mean Subtraction | en_US |
dc.subject | Discriminative Training | en_US |
dc.subject | Hidden Markov Models | en_US |
dc.subject | Hierarchical Signal Bias Removal | en_US |
dc.subject | Jitter | en_US |
dc.subject | Periodicity | en_US |
dc.subject | Speech Recognition Features | en_US |
dc.subject | Voiced And Unvoiced Speech | en_US |
dc.subject | Voicing | en_US |
dc.title | Use of voicing features in HMM-based speech recognition | en_US |
dc.type | Article | en_US |
dc.identifier.email | Thomson, DL: dthomson@hku.hk | en_US |
dc.identifier.authority | Thomson, DL=rp00788 | en_US |
dc.description.nature | link_to_subscribed_fulltext | en_US |
dc.identifier.doi | 10.1016/S0167-6393(01)00011-5 | en_US |
dc.identifier.scopus | eid_2-s2.0-0036642777 | en_US |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-0036642777&selection=ref&src=s&origin=recordpage | en_US |
dc.identifier.volume | 37 | en_US |
dc.identifier.issue | 3-4 | en_US |
dc.identifier.spage | 197 | en_US |
dc.identifier.epage | 211 | en_US |
dc.identifier.isi | WOS:000176314600003 | - |
dc.publisher.place | Netherlands | en_US |
dc.identifier.scopusauthorid | Thomson, DL=7202586830 | en_US |
dc.identifier.scopusauthorid | Chengalvarayan, R=6701843465 | en_US |
dc.identifier.issnl | 0167-6393 | - |