Links for fulltext
(May Require Subscription)
- Scopus: eid_2-s2.0-0022888608
- WOS: WOS:A1986F104700004
Article: Analysis and recognition of isolated putonghua vowels by Karhunen-Loève transformation techniques
Title | Analysis and recognition of isolated putonghua vowels by Karhunen-Loève transformation techniques |
---|---|
Authors | Chan, LCM; Cheung, YS |
Keywords | Chinese Phonetic Alphabets (CPA); closed test; critical-band filtering; Euclidean distance; final; initial; Karhunen-Loève transformation (KLT); Mahalanobis distance; modified Mahalanobis distance; open test; Putonghua; semi-open test; sequential open test; tone |
Issue Date | 1986 |
Publisher | Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/specom |
Citation | Speech Communication, 1986, v. 5 n. 3-4, p. 299-330 |
Abstract | A systematic study on a speaker-independent vowel recognition model has been performed. Karhunen-Loève Transformation (KLT), or Principal Component Analysis, technique was applied subsequent to a spectral analysis of the speech signal by 18 non-overlapping critical-band filters. Four experiments have been conducted using selected segments of 8 isolated Putonghua (Mandarin) vowels, spoken twice in 5 tones by 38 females and 13 males. The first experiment uses the same speech sample in training and testing to evaluate the effects of KLT, speaker normalization, distance metric and number of vowel classes. A modified Mahalanobis distance coupled with a 7-class condition was found to give the best performance. In the next experiment, one sample was used to train the model, and another trial of the same speech, spoken by the same group of speakers, was used to test it. It was found that, in general, a sex-specific and tone-specific procedure could be avoided without significant loss in performance. The third experiment repeatedly trained the model with 50 speakers and tested it with the remaining one until all 51 speakers had been tested. Under this stringent condition, an average recognition rate of 88.2% was achieved using only 4 classificatory dimensions. In the last experiment, all segments of a vowel were labelled using the most stringent conditions. The model was confirmed to perform well for one male and one female speaker selected at random. Also, the vowel that had caused the greatest confusion was found to be well recognized when treated as an allophone of another vowel. Finally, the possibility of extending the present technique to diphthong recognition is discussed together with some preliminary results. © 1986. |
Persistent Identifier | http://hdl.handle.net/10722/147778 |
ISSN | 0167-6393 (2023 Impact Factor: 2.4; 2023 SCImago Journal Rankings: 0.769) |
ISI Accession Number ID | WOS:A1986F104700004 |
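
The pipeline summarized in the abstract (KLT applied to 18 critical-band filter outputs, followed by distance-based classification in a few dimensions) might be sketched roughly as below. This is an illustrative reconstruction, not the authors' implementation: the function names `fit_klt`, `project`, `fit_classes`, and `classify` are hypothetical, the input is assumed to be an array with one 18-value spectral vector per row, and the plain Mahalanobis distance is used because the record does not define the paper's "modified" variant. The small ridge term added to each class covariance is also an assumption, included only to keep the matrix invertible.

```python
import numpy as np

def fit_klt(X, n_components):
    """Estimate a KLT (PCA) basis from rows of X; returns (mean, basis)."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    return mean, eigvecs[:, order]

def project(X, mean, basis):
    """Project rows of X onto the retained KLT axes."""
    return (X - mean) @ basis

def fit_classes(Z, labels, ridge=1e-6):
    """Per-class mean and inverse covariance in the reduced space."""
    stats = {}
    for c in np.unique(labels):
        Zc = Z[labels == c]
        cov = np.cov(Zc, rowvar=False) + ridge * np.eye(Z.shape[1])  # ridge: assumption
        stats[c] = (Zc.mean(axis=0), np.linalg.inv(cov))
    return stats

def classify(z, stats):
    """Assign z to the class with the smallest squared Mahalanobis distance."""
    def dist2(c):
        mu, inv_cov = stats[c]
        diff = z - mu
        return diff @ inv_cov @ diff
    return min(stats, key=dist2)
```

With training vectors `X` and vowel labels `y`, one would call `fit_klt(X, 4)` to match the 4 classificatory dimensions mentioned in the abstract, project the data, fit the class statistics, and classify each projected test vector.
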
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chan, LCM | en_HK |
dc.contributor.author | Cheung, YS | en_HK |
dc.date.accessioned | 2012-05-29T06:09:10Z | - |
dc.date.available | 2012-05-29T06:09:10Z | - |
dc.date.issued | 1986 | en_HK |
dc.identifier.citation | Speech Communication, 1986, v. 5 n. 3-4, p. 299-330 | en_HK |
dc.identifier.issn | 0167-6393 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/147778 | - |
dc.description.abstract | A systematic study on a speaker-independent vowel recognition model has been performed. Karhunen-Loève Transformation (KLT), or Principal Component Analysis, technique was applied subsequent to a spectral analysis of the speech signal by 18 non-overlapping critical-band filters. Four experiments have been conducted using selected segments of 8 isolated Putonghua (Mandarin) vowels, spoken twice in 5 tones by 38 females and 13 males. The first experiment uses the same speech sample in training and testing to evaluate the effects of KLT, speaker normalization, distance metric and number of vowel classes. A modified Mahalanobis distance coupled with a 7-class condition was found to give the best performance. In the next experiment, one sample was used to train the model, and another trial of the same speech, spoken by the same group of speakers, was used to test it. It was found that, in general, a sex-specific and tone-specific procedure could be avoided without significant loss in performance. The third experiment repeatedly trained the model with 50 speakers and tested it with the remaining one until all 51 speakers had been tested. Under this stringent condition, an average recognition rate of 88.2% was achieved using only 4 classificatory dimensions. In the last experiment, all segments of a vowel were labelled using the most stringent conditions. The model was confirmed to perform well for one male and one female speaker selected at random. Also, the vowel that had caused the greatest confusion was found to be well recognized when treated as an allophone of another vowel. Finally, the possibility of extending the present technique to diphthong recognition is discussed together with some preliminary results. © 1986. | en_HK |
dc.language | eng | en_US |
dc.publisher | Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/specom | en_HK |
dc.relation.ispartof | Speech Communication | en_HK |
dc.subject | Chinese Phonetic Alphabets (CPA) | en_HK |
dc.subject | closed test | en_HK |
dc.subject | critical-band filtering | en_HK |
dc.subject | Euclidean distance | en_HK |
dc.subject | final | en_HK |
dc.subject | initial | en_HK |
dc.subject | Karhunen-Loève transformation (KLT) | en_HK |
dc.subject | Mahalanobis distance | en_HK |
dc.subject | modified Mahalanobis distance | en_HK |
dc.subject | open test | en_HK |
dc.subject | Putonghua | en_HK |
dc.subject | semi-open test | en_HK |
dc.subject | sequential open test | en_HK |
dc.subject | tone | en_HK |
dc.title | Analysis and recognition of isolated putonghua vowels by Karhunen-Loève transformation techniques | en_HK |
dc.type | Article | en_HK |
dc.identifier.email | Cheung, YS:paul.cheung@hku.hk | en_HK |
dc.identifier.authority | Cheung, YS=rp00077 | en_HK |
dc.description.nature | link_to_subscribed_fulltext | en_US |
dc.identifier.scopus | eid_2-s2.0-0022888608 | en_HK |
dc.identifier.volume | 5 | en_HK |
dc.identifier.issue | 3-4 | en_HK |
dc.identifier.spage | 299 | en_HK |
dc.identifier.epage | 330 | en_HK |
dc.identifier.isi | WOS:A1986F104700004 | - |
dc.publisher.place | Netherlands | en_HK |
dc.identifier.scopusauthorid | Chan, LCM=55223130400 | en_HK |
dc.identifier.scopusauthorid | Cheung, YS=7202595335 | en_HK |
dc.identifier.issnl | 0167-6393 | - |
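
The third experiment in the abstract trains on 50 speakers and tests on the remaining one, repeating until all 51 have been held out. A minimal sketch of that evaluation loop follows. It is not the authors' code: the names `nearest_mean_predict` and `leave_one_speaker_out` are hypothetical, and a Euclidean nearest-class-mean classifier stands in for the paper's model purely to keep the example short.

```python
import numpy as np

def nearest_mean_predict(X_train, y_train, X_test):
    """Label each test row with the class whose training mean is closest (Euclidean)."""
    classes = np.unique(y_train)
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_test[:, None, :] - means[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

def leave_one_speaker_out(X, y, speakers):
    """Hold out each speaker in turn; return the mean per-speaker recognition rate."""
    rates = []
    for s in np.unique(speakers):
        held = speakers == s
        pred = nearest_mean_predict(X[~held], y[~held], X[held])
        rates.append(np.mean(pred == y[held]))
    return float(np.mean(rates))
```

Here `X` holds one feature vector per row, `y` the vowel labels, and `speakers` a speaker identifier per row. Averaging per-speaker rates, rather than pooling all test samples, is a design choice made for this sketch; the record does not say which of the two the paper reports.
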