Perception of synthesized voice quality in connected speech by Cantonese speakers

Yiu, EML; Murdoch, B; Hird, K; Lau, P

Publisher Website: 10.1121/1.1500753
Scopus: eid_2-s2.0-0036711748
PMID: 12243157
WOS: WOS:000177996400032

Citations:
- Scopus: 0
- Web of Science: 0
- PubMed Central: 0
Appears in Collections:
- Division of Speech & Hearing Sciences: Journal/Magazine Articles

Title	Perception of synthesized voice quality in connected speech by Cantonese speakers
Authors	Yiu, EML; Murdoch, B; Hird, K; Lau, P
Keywords	Physics; Sound
Issue Date	2002
Publisher	Acoustical Society of America. The Journal's web site is located at http://asa.aip.org/jasa.html
Citation	Journal of the Acoustical Society of America, 2002, v. 112 n. 3, p. 1091-1101. DOI: http://dx.doi.org/10.1121/1.1500753
Abstract	Perceptual voice analysis is a subjective process. However, despite reports of varying degrees of intrajudge and interjudge reliability, it is widely used in clinical voice evaluation. One of the ways to improve the reliability of this procedure is to provide judges with signals as external standards so that comparison can be made in relation to these "anchor" signals. The present study used a Klatt speech synthesizer to create a set of speech signals with varying degree of three different voice qualities based on a Cantonese sentence. The primary objective of the study was to determine whether different abnormal voice qualities could be synthesized using the "built-in" synthesis parameters using a perceptual study. The second objective was to determine the relationship between acoustic characteristics of the synthesized signals and perceptual judgment. Twenty Cantonese-speaking speech pathologists with at least three years of clinical experience in perceptual voice evaluation were asked to undertake two tasks. The first was to decide whether the voice quality of the synthesized signals was normal or not. The second was to decide whether the abnormal signals should be described as rough, breathy, or vocal fry. The results showed that signals generated with a small degree of aspiration noise were perceived as breathiness while signals with a small degree of flutter or double pulsing were perceived as roughness. When the flutter or double pulsing increased further, tremor and vocal fry, rather than roughness, were perceived. Furthermore, the amount of aspiration noise, flutter, or double pulsing required for male voice stimuli was different from that required for the female voice stimuli with a similar level of perceptual breathiness and roughness. These findings showed that changes in perceived vocal quality could be achieved by systematic modifications of synthesis parameters. This opens up the possibility of using synthesized voice signals as external standards or "anchors" to improve the reliability of clinical perceptual voice evaluation. © 2002 Acoustical Society of America.
Persistent Identifier	http://hdl.handle.net/10722/45331
ISSN	0001-4966 (2023 Impact Factor: 2.1; 2023 SCImago Journal Rankings: 0.687)
ISI Accession Number ID	WOS:000177996400032

DC Field	Value	Language
dc.contributor.author	Yiu, EML	en_HK
dc.contributor.author	Murdoch, B	en_HK
dc.contributor.author	Hird, K	en_HK
dc.contributor.author	Lau, P	en_HK
dc.date.accessioned	2007-10-30T06:23:02Z	-
dc.date.available	2007-10-30T06:23:02Z	-
dc.date.issued	2002	en_HK
dc.identifier.citation	Journal of the Acoustical Society of America, 2002, v. 112 n. 3, p. 1091-1101	-
dc.identifier.issn	0001-4966	en_HK
dc.identifier.uri	http://hdl.handle.net/10722/45331	-
dc.description.abstract	Perceptual voice analysis is a subjective process. However, despite reports of varying degrees of intrajudge and interjudge reliability, it is widely used in clinical voice evaluation. One of the ways to improve the reliability of this procedure is to provide judges with signals as external standards so that comparison can be made in relation to these "anchor" signals. The present study used a Klatt speech synthesizer to create a set of speech signals with varying degree of three different voice qualities based on a Cantonese sentence. The primary objective of the study was to determine whether different abnormal voice qualities could be synthesized using the "built-in" synthesis parameters using a perceptual study. The second objective was to determine the relationship between acoustic characteristics of the synthesized signals and perceptual judgment. Twenty Cantonese-speaking speech pathologists with at least three years of clinical experience in perceptual voice evaluation were asked to undertake two tasks. The first was to decide whether the voice quality of the synthesized signals was normal or not. The second was to decide whether the abnormal signals should be described as rough, breathy, or vocal fry. The results showed that signals generated with a small degree of aspiration noise were perceived as breathiness while signals with a small degree of flutter or double pulsing were perceived as roughness. When the flutter or double pulsing increased further, tremor and vocal fry, rather than roughness, were perceived. Furthermore, the amount of aspiration noise, flutter, or double pulsing required for male voice stimuli was different from that required for the female voice stimuli with a similar level of perceptual breathiness and roughness. These findings showed that changes in perceived vocal quality could be achieved by systematic modifications of synthesis parameters. This opens up the possibility of using synthesized voice signals as external standards or "anchors" to improve the reliability of clinical perceptual voice evaluation. © 2002 Acoustical Society of America.	en_HK
dc.format.extent	170240 bytes	-
dc.format.extent	2411 bytes	-
dc.format.mimetype	application/pdf	-
dc.format.mimetype	text/plain	-
dc.language	eng	en_HK
dc.publisher	Acoustical Society of America. The Journal's web site is located at http://asa.aip.org/jasa.html	en_HK
dc.relation.ispartof	Journal of the Acoustical Society of America	en_HK
dc.rights	Copyright 2002 Acoustical Society of America. This article may be downloaded for personal use only. Any other use requires prior permission of the author and the Acoustical Society of America. The following article appeared in Journal of the Acoustical Society of America, 2002, v. 112 n. 3, p. 1091-1101 and may be found at https://doi.org/10.1121/1.1500753	-
dc.subject	Physics	en_HK
dc.subject	Sound	en_HK
dc.title	Perception of synthesized voice quality in connected speech by Cantonese speakers	en_HK
dc.type	Article	en_HK
dc.identifier.openurl	http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0001-4966&volume=112&issue=3&spage=1091&epage=1101&date=2002&atitle=Perception+of+synthesized+voice+quality+in+connected+speech+by+Cantonese+speakers	en_HK
dc.identifier.email	Yiu, EML: eyiu@hku.hk	en_HK
dc.identifier.authority	Yiu, EML=rp00981	en_HK
dc.description.nature	published_or_final_version	en_HK
dc.identifier.doi	10.1121/1.1500753	en_HK
dc.identifier.pmid	12243157	en_HK
dc.identifier.scopus	eid_2-s2.0-0036711748	en_HK
dc.identifier.hkuros	78478	-
dc.relation.references	http://www.scopus.com/mlt/select.url?eid=2-s2.0-0036711748&selection=ref&src=s&origin=recordpage	en_HK
dc.identifier.volume	112	en_HK
dc.identifier.issue	3	-
dc.identifier.spage	1091	en_HK
dc.identifier.epage	1101	en_HK
dc.identifier.isi	WOS:000177996400032	-
dc.publisher.place	United States	en_HK
dc.identifier.scopusauthorid	Yiu, EML=7003337895	en_HK
dc.identifier.scopusauthorid	Murdoch, B=7005161745	en_HK
dc.identifier.scopusauthorid	Hird, K=6701518192	en_HK
dc.identifier.scopusauthorid	Lau, P=23768195700	en_HK
dc.identifier.issnl	0001-4966	-
