Conference Paper: Deep Neural Networks for Voice Quality Assessment Based on the GRBAS Scale

Title: Deep Neural Networks for Voice Quality Assessment Based on the GRBAS Scale
Authors: Xie, S; Yan, N; Yu, P; Ng, ML; Wang, L; Ji, Z
Issue Date: 2016
Publisher: International Speech Communication Association (ISCA).
Citation: Proceedings of the 17th INTERSPEECH conference 2016, San Francisco, USA, 8-12 September 2016, p. 2656-2660
Abstract: In the field of voice therapy, perceptual evaluation is widely used by expert listeners as a way to evaluate pathological and normal voice quality. This approach is understandably subjective, as it is subject to listeners’ bias, and high inter- and intra-listener variability can be found. As such, research on automatic assessment of pathological voices using a combination of subjective and objective analyses has emerged. The present study aimed to develop a complementary automatic assessment system for voice quality based on the well-known GRBAS scale by using a battery of multidimensional acoustical measures through Deep Neural Networks. A 44-dimensional set of parameters, including Mel-frequency Cepstral Coefficients, Smoothed Cepstral Peak Prominence and Long-Term Average Spectrum, was adopted. In addition, the state-of-the-art automatic assessment system based on Modulation Spectrum (MS) features and GMM classifiers was used as a comparison system. The classification results using the proposed method revealed a moderate correlation with subjective GRBAS scores of dysphonic severity, and yielded better performance than the MS-GMM system, with a best accuracy of around 81.53%. The findings indicate that such an assessment system can be used as an appropriate evaluation tool in determining the presence and severity of voice disorders.
Description: Poster Presentation - Session: Learning, Education and Different Speech - no. Sun-P-7-3-3, paper ID 986
Persistent Identifier: http://hdl.handle.net/10722/260889
ISSN: 1990-9772
ISI Accession Number ID: WOS:000409394401236

 

DC Field | Value | Language
dc.contributor.author | Xie, S | -
dc.contributor.author | Yan, N | -
dc.contributor.author | Yu, P | -
dc.contributor.author | Ng, ML | -
dc.contributor.author | Wang, L | -
dc.contributor.author | Ji, Z | -
dc.date.accessioned | 2018-09-14T08:49:04Z | -
dc.date.available | 2018-09-14T08:49:04Z | -
dc.date.issued | 2016 | -
dc.identifier.citation | Proceedings of the 17th INTERSPEECH conference 2016, San Francisco, USA, 8-12 September 2016, p. 2656-2660 | -
dc.identifier.issn | 1990-9772 | -
dc.identifier.uri | http://hdl.handle.net/10722/260889 | -
dc.description | Poster Presentation - Session: Learning, Education and Different Speech - no. Sun-P-7-3-3, paper ID 986 | -
dc.description.abstract | In the field of voice therapy, perceptual evaluation is widely used by expert listeners as a way to evaluate pathological and normal voice quality. This approach is understandably subjective, as it is subject to listeners’ bias, and high inter- and intra-listener variability can be found. As such, research on automatic assessment of pathological voices using a combination of subjective and objective analyses has emerged. The present study aimed to develop a complementary automatic assessment system for voice quality based on the well-known GRBAS scale by using a battery of multidimensional acoustical measures through Deep Neural Networks. A 44-dimensional set of parameters, including Mel-frequency Cepstral Coefficients, Smoothed Cepstral Peak Prominence and Long-Term Average Spectrum, was adopted. In addition, the state-of-the-art automatic assessment system based on Modulation Spectrum (MS) features and GMM classifiers was used as a comparison system. The classification results using the proposed method revealed a moderate correlation with subjective GRBAS scores of dysphonic severity, and yielded better performance than the MS-GMM system, with a best accuracy of around 81.53%. The findings indicate that such an assessment system can be used as an appropriate evaluation tool in determining the presence and severity of voice disorders. | -
dc.language | eng | -
dc.publisher | International Speech Communication Association (ISCA). | -
dc.relation.ispartof | Interspeech Conference Proceedings | -
dc.title | Deep Neural Networks for Voice Quality Assessment Based on the GRBAS Scale | -
dc.type | Conference_Paper | -
dc.identifier.email | Ng, ML: manwa@hku.hk | -
dc.identifier.authority | Ng, ML=rp00942 | -
dc.identifier.doi | 10.21437/Interspeech.2016-986 | -
dc.identifier.hkuros | 290499 | -
dc.identifier.spage | 2656 | -
dc.identifier.epage | 2660 | -
dc.identifier.isi | WOS:000409394401236 | -
dc.publisher.place | United States | -
