Article: End-to-End Neural Segmental Models for Speech Recognition

Title: End-to-End Neural Segmental Models for Speech Recognition
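Authors: Tang, Hao; Lu, Liang; Kong, Lingpeng; Gimpel, Kevin; Livescu, Karen; Dyer, Chris; Smith, Noah A.; Renals, Steve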
Keywords: Connectionist temporal classification; end-to-end training; segmental models; multitask training
Issue Date: 2017
Citation: IEEE Journal on Selected Topics in Signal Processing, 2017, v. 11, n. 8, p. 1254-1264
Abstract: Segmental models are an alternative to frame-based models for sequence prediction, where hypothesized path weights are based on entire segment scores rather than a single frame at a time. Neural segmental models are segmental models that use neural network-based weight functions. Neural segmental models have achieved competitive results for speech recognition, and their end-to-end training has been explored in several studies. In this work, we review neural segmental models, which can be viewed as consisting of a neural network-based acoustic encoder and a finite-state transducer decoder. We study end-to-end segmental models with different weight functions, including ones based on frame-level neural classifiers and on segmental recurrent neural networks. We study how reducing the search space size impacts performance under different weight functions. We also compare several loss functions for end-to-end training. Finally, we explore training approaches, including multistage versus end-to-end training and multitask training that combines segmental and frame-level losses.
Persistent Identifier: http://hdl.handle.net/10722/296158
ISSN: 1932-4553
2023 Impact Factor: 8.7
2023 SCImago Journal Rankings: 3.818
ISI Accession Number ID: WOS:000416226000003
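
The abstract's central idea, that a path is scored by whole segments rather than frame by frame, can be illustrated with a small dynamic program. The following is a minimal sketch under stated assumptions, not the authors' implementation: the paper describes a neural acoustic encoder with a finite-state transducer decoder, whereas this example uses a plain Viterbi-style search over segment boundaries. The names `segmental_viterbi`, `frame_sum_weight`, and `segment_penalty` are hypothetical, and the toy frame scores stand in for the output of a frame-level classifier.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A segment is (start frame, end frame exclusive, label).
Segment = Tuple[int, int, str]
WeightFn = Callable[[str, int, int], float]


def segmental_viterbi(
    num_frames: int,
    labels: Sequence[str],
    max_dur: int,
    weight: WeightFn,
) -> Tuple[float, List[Segment]]:
    """Return the best total weight and segmentation of frames [0, num_frames)."""
    neg_inf = float("-inf")
    best: List[float] = [neg_inf] * (num_frames + 1)
    back: List[Optional[Segment]] = [None] * (num_frames + 1)
    best[0] = 0.0

    for end in range(1, num_frames + 1):
        # Only segments of at most max_dur frames are considered.
        for start in range(max(0, end - max_dur), end):
            if best[start] == neg_inf:
                continue
            for label in labels:
                score = best[start] + weight(label, start, end)
                if score > best[end]:
                    best[end] = score
                    back[end] = (start, end, label)

    # Follow back-pointers from the last frame to recover the segments.
    path: List[Segment] = []
    t = num_frames
    while t > 0:
        seg = back[t]
        if seg is None:
            raise ValueError("no valid segmentation found")
        path.append(seg)
        t = seg[0]
    path.reverse()
    return best[num_frames], path


def frame_sum_weight(
    frame_scores: Sequence[Dict[str, float]],
    segment_penalty: float = 0.5,  # hypothetical constant, discourages tiny segments
) -> WeightFn:
    """Build a segment weight as the sum of per-frame scores minus a penalty."""
    def weight(label: str, start: int, end: int) -> float:
        return sum(frame_scores[t][label] for t in range(start, end)) - segment_penalty
    return weight


# Toy frame-level scores: the first two frames favor "a", the last two favor "b".
toy_scores = [
    {"a": 0.0, "b": -2.0},
    {"a": 0.0, "b": -2.0},
    {"a": -2.0, "b": 0.0},
    {"a": -2.0, "b": 0.0},
]
score, segments = segmental_viterbi(4, ["a", "b"], 3, frame_sum_weight(toy_scores))
print(score, segments)
```

Lowering `max_dur` shrinks the set of candidate segments, which is one simple way to reduce the search space; the weight function can be swapped for any callable with the same signature.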

 

DC Field | Value | Language
dc.contributor.author | Tang, Hao | -
dc.contributor.author | Lu, Liang | -
dc.contributor.author | Kong, Lingpeng | -
dc.contributor.author | Gimpel, Kevin | -
dc.contributor.author | Livescu, Karen | -
dc.contributor.author | Dyer, Chris | -
dc.contributor.author | Smith, Noah A. | -
dc.contributor.author | Renals, Steve | -
dc.date.accessioned | 2021-02-11T04:52:57Z | -
dc.date.available | 2021-02-11T04:52:57Z | -
dc.date.issued | 2017 | -
dc.identifier.citation | IEEE Journal on Selected Topics in Signal Processing, 2017, v. 11, n. 8, p. 1254-1264 | -
dc.identifier.issn | 1932-4553 | -
dc.identifier.uri | http://hdl.handle.net/10722/296158 | -
dc.language | eng | -
dc.relation.ispartof | IEEE Journal on Selected Topics in Signal Processing | -
dc.subject | Connectionist temporal classification | -
dc.subject | end-to-end training | -
dc.subject | segmental models | -
dc.subject | multitask training | -
dc.title | End-to-End Neural Segmental Models for Speech Recognition | -
dc.type | Article | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1109/JSTSP.2017.2752462 | -
dc.identifier.scopus | eid_2-s2.0-85030313902 | -
dc.identifier.volume | 11 | -
dc.identifier.issue | 8 | -
dc.identifier.spage | 1254 | -
dc.identifier.epage | 1264 | -
dc.identifier.isi | WOS:000416226000003 | -
dc.identifier.issnl | 1932-4553 | -
