Analysis and recognition of isolated putonghua vowels by Karhunen-Loève transformation techniques

Chan, LCM; Cheung, YS

Scopus: eid_2-s2.0-0022888608
WOS: WOS:A1986F104700004

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Pathology: Journal/Magazine Articles

Article: Analysis and recognition of isolated putonghua vowels by Karhunen-Loève transformation techniques

Title	Analysis and recognition of isolated putonghua vowels by Karhunen-Loève transformation techniques
Authors	Chan, LCM; Cheung, YS
Keywords	Chinese Phonetic Alphabets (CPA); closed test; critical-band filtering; Euclidean distance; final; initial; Karhunen-Loève transformation (KLT); Mahalanobis distance; modified Mahalanobis distance; open test; Putonghua; semi-open test; sequential open test; tone
Issue Date	1986
Publisher	Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/specom
Citation	Speech Communication, 1986, v. 5 n. 3-4, p. 299-330
Abstract	A systematic study on a speaker-independent vowel recognition model has been performed. Karhunen-Loève Transformation (KLT), or Principal Component Analysis, technique was applied subsequent to a spectral analysis of the speech signal by 18 non-overlapping critical-band filters. Four experiments have been conducted using selected segments of 8 isolated Putonghua (Mandarin) vowels, spoken twice in 5 tones by 38 females and 13 males. The first experiment uses the same speech sample in training and testing to evaluate the effects of KLT, speaker normalization, distance metric and number of vowel classes. A modified Mahalanobis distance coupled with a 7-class condition was found to give the best performance. In the next experiment, one sample was used to train the model, and another trial of the same speech, spoken by the same group of speakers, was used to test it. It was found that, in general, a sex-specific and tone-specific procedure could be avoided without significant loss in performance. The third experiment repeatedly trained the model with 50 speakers and tested it with the remaining one until all 51 speakers had been tested. Under this stringent condition, an average recognition rate of 88.2% was achieved using only 4 classificatory dimensions. In the last experiment, all segments of a vowel were labelled using the most stringent conditions. The model was confirmed to perform well for one male and one female speaker selected at random. Also, the vowel that had caused the greatest confusion was found to be well recognized when treated as an allophone of another vowel. Finally, the possibility of extending the present technique to diphthong recognition is discussed together with some preliminary results. © 1986.
Persistent Identifier	http://hdl.handle.net/10722/147778
ISSN	0167-6393 (2023 Impact Factor: 2.4; 2023 SCImago Journal Rankings: 0.769)
ISI Accession Number ID	WOS:A1986F104700004
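
The pipeline outlined in the abstract (18 band energies per segment, KLT down to 4 dimensions, distance-based classification, and a leave-one-speaker-out open test) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The data are synthetic, all function and variable names are hypothetical, and a standard Mahalanobis distance with a pooled within-class covariance stands in for the paper's "modified Mahalanobis distance", whose definition the abstract does not give.

```python
import numpy as np

def fit_klt(X, n_components):
    """Estimate the KLT (PCA) basis: data mean and top eigenvectors of the covariance."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    return mean, eigvecs[:, order]

def project(X, mean, axes):
    """Map band-energy vectors onto the retained KLT axes."""
    return (X - mean) @ axes

def fit_classes(Z, labels):
    """Class means plus the inverse pooled within-class covariance (an assumption)."""
    classes = np.unique(labels)
    means = np.array([Z[labels == c].mean(axis=0) for c in classes])
    centered = Z - means[np.searchsorted(classes, labels)]
    pooled = centered.T @ centered / (len(Z) - len(classes))
    return classes, means, np.linalg.inv(pooled)

def classify(Z, classes, means, inv_cov):
    """Assign each sample to the class with the smallest squared Mahalanobis distance."""
    diff = Z[:, None, :] - means[None, :, :]
    d2 = np.einsum("nkd,de,nke->nk", diff, inv_cov, diff)
    return classes[np.argmin(d2, axis=1)]

def leave_one_speaker_out(X, labels, speakers, n_components=4):
    """Train on all speakers but one, test on the held-out speaker, repeat for each."""
    correct = 0
    for s in np.unique(speakers):
        train = speakers != s
        test = ~train
        mean, axes = fit_klt(X[train], n_components)
        model = fit_classes(project(X[train], mean, axes), labels[train])
        pred = classify(project(X[test], mean, axes), *model)
        correct += int(np.sum(pred == labels[test]))
    return correct / len(X)

# Synthetic stand-in data: 6 speakers, 3 vowel classes, 2 tokens each, 18 bands.
rng = np.random.default_rng(0)
n_speakers, n_vowels, n_tokens, n_bands = 6, 3, 2, 18
centers = rng.normal(scale=5.0, size=(n_vowels, n_bands))
X, labels, speakers = [], [], []
for spk in range(n_speakers):
    for v in range(n_vowels):
        for _ in range(n_tokens):
            X.append(centers[v] + rng.normal(size=n_bands))
            labels.append(v)
            speakers.append(spk)
X, labels, speakers = np.array(X), np.array(labels), np.array(speakers)

accuracy = leave_one_speaker_out(X, labels, speakers, n_components=4)
print(f"leave-one-speaker-out accuracy: {accuracy:.3f}")
```

Refitting the KLT basis inside each fold keeps the held-out speaker out of both the projection and the class statistics, which is what makes the test "open" in this sketch. The synthetic classes are deliberately well separated, so the printed accuracy says nothing about the figures reported in the paper.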

DC Field	Value	Language
dc.contributor.author	Chan, LCM	en_HK
dc.contributor.author	Cheung, YS	en_HK
dc.date.accessioned	2012-05-29T06:09:10Z	-
dc.date.available	2012-05-29T06:09:10Z	-
dc.date.issued	1986	en_HK
dc.identifier.citation	Speech Communication, 1986, v. 5 n. 3-4, p. 299-330	en_HK
dc.identifier.issn	0167-6393	en_HK
dc.identifier.uri	http://hdl.handle.net/10722/147778	-
dc.description.abstract	A systematic study on a speaker-independent vowel recognition model has been performed. Karhunen-Loève Transformation (KLT), or Principal Component Analysis, technique was applied subsequent to a spectral analysis of the speech signal by 18 non-overlapping critical-band filters. Four experiments have been conducted using selected segments of 8 isolated Putonghua (Mandarin) vowels, spoken twice in 5 tones by 38 females and 13 males. The first experiment uses the same speech sample in training and testing to evaluate the effects of KLT, speaker normalization, distance metric and number of vowel classes. A modified Mahalanobis distance coupled with a 7-class condition was found to give the best performance. In the next experiment, one sample was used to train the model, and another trial of the same speech, spoken by the same group of speakers, was used to test it. It was found that, in general, a sex-specific and tone-specific procedure could be avoided without significant loss in performance. The third experiment repeatedly trained the model with 50 speakers and tested it with the remaining one until all 51 speakers had been tested. Under this stringent condition, an average recognition rate of 88.2% was achieved using only 4 classificatory dimensions. In the last experiment, all segments of a vowel were labelled using the most stringent conditions. The model was confirmed to perform well for one male and one female speaker selected at random. Also, the vowel that had caused the greatest confusion was found to be well recognized when treated as an allophone of another vowel. Finally, the possibility of extending the present technique to diphthong recognition is discussed together with some preliminary results. © 1986.	en_HK
dc.language	eng	en_US
dc.publisher	Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/specom	en_HK
dc.relation.ispartof	Speech Communication	en_HK
dc.subject	Chinese Phonetic Alphabets (CPA)	en_HK
dc.subject	closed test	en_HK
dc.subject	critical-band filtering	en_HK
dc.subject	Euclidean distance	en_HK
dc.subject	final	en_HK
dc.subject	initial	en_HK
dc.subject	Karhunen-Loève transformation (KLT)	en_HK
dc.subject	Mahalanobis distance	en_HK
dc.subject	modified Mahalanobis distance	en_HK
dc.subject	open test	en_HK
dc.subject	Putonghua	en_HK
dc.subject	semi-open test	en_HK
dc.subject	sequential open test	en_HK
dc.subject	tone	en_HK
dc.title	Analysis and recognition of isolated putonghua vowels by Karhunen-Loève transformation techniques	en_HK
dc.type	Article	en_HK
dc.identifier.email	Cheung, YS:paul.cheung@hku.hk	en_HK
dc.identifier.authority	Cheung, YS=rp00077	en_HK
dc.description.nature	link_to_subscribed_fulltext	en_US
dc.identifier.scopus	eid_2-s2.0-0022888608	en_HK
dc.identifier.volume	5	en_HK
dc.identifier.issue	3-4	en_HK
dc.identifier.spage	299	en_HK
dc.identifier.epage	330	en_HK
dc.identifier.isi	WOS:A1986F104700004	-
dc.publisher.place	Netherlands	en_HK
dc.identifier.scopusauthorid	Chan, LCM=55223130400	en_HK
dc.identifier.scopusauthorid	Cheung, YS=7202595335	en_HK
dc.identifier.issnl	0167-6393	-
