Pretraining the noisy channel model for task-oriented dialogue

Liu, Qi; Yu, Lei; Rimell, Laura; Blunsom, Phil

Publisher Website: 10.1162/tacl_a_00390
Scopus: eid_2-s2.0-85117644184
WOS: WOS:000751952200040

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Title	Pretraining the noisy channel model for task-oriented dialogue
Authors	Liu, Qi; Yu, Lei; Rimell, Laura; Blunsom, Phil
Issue Date	2021
Citation	Transactions of the Association for Computational Linguistics, 2021, v. 9, p. 657-674. DOI: http://dx.doi.org/10.1162/tacl_a_00390
Abstract	Direct decoding for task-oriented dialogue is known to suffer from the explaining-away effect, manifested in models that prefer short and generic responses. Here we argue for the use of Bayes’ theorem to factorize the dialogue task into two models, the distribution of the context given the response, and the prior for the response itself. This approach, an instantiation of the noisy channel model, both mitigates the explaining-away effect and allows the principled incorporation of large pretrained models for the response prior. We present extensive experiments showing that a noisy channel model decodes better responses compared to direct decoding and that a two-stage pre-training strategy, employing both open-domain and task-oriented dialogue data, improves over randomly initialized models.
Persistent Identifier	http://hdl.handle.net/10722/321967
ISI Accession Number ID	WOS:000751952200040
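
The factorization described in the abstract can be written out as a short derivation. The symbols are assumptions made here for illustration and do not come from this record: x denotes the dialogue context and y the response. The align environment assumes the amsmath package.

```latex
\begin{align}
  p(y \mid x) &= \frac{p(x \mid y)\, p(y)}{p(x)} \\
  \hat{y} &= \arg\max_{y} \, p(y \mid x)
           = \arg\max_{y} \, p(x \mid y)\, p(y)
\end{align}
```

Because p(x) does not depend on y, it drops out of the argmax. The term p(x | y) is the distribution of the context given the response, and p(y) is the response prior, which is the component the abstract says can be supplied by a large pretrained model.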

DC Field	Value	Language
dc.contributor.author	Liu, Qi	-
dc.contributor.author	Yu, Lei	-
dc.contributor.author	Rimell, Laura	-
dc.contributor.author	Blunsom, Phil	-
dc.date.accessioned	2022-11-03T02:22:41Z	-
dc.date.available	2022-11-03T02:22:41Z	-
dc.date.issued	2021	-
dc.identifier.citation	Transactions of the Association for Computational Linguistics, 2021, v. 9, p. 657-674	-
dc.identifier.uri	http://hdl.handle.net/10722/321967	-
dc.description.abstract	Direct decoding for task-oriented dialogue is known to suffer from the explaining-away effect, manifested in models that prefer short and generic responses. Here we argue for the use of Bayes’ theorem to factorize the dialogue task into two models, the distribution of the context given the response, and the prior for the response itself. This approach, an instantiation of the noisy channel model, both mitigates the explaining-away effect and allows the principled incorporation of large pretrained models for the response prior. We present extensive experiments showing that a noisy channel model decodes better responses compared to direct decoding and that a two-stage pre-training strategy, employing both open-domain and task-oriented dialogue data, improves over randomly initialized models.	-
dc.language	eng	-
dc.relation.ispartof	Transactions of the Association for Computational Linguistics	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.title	Pretraining the noisy channel model for task-oriented dialogue	-
dc.type	Article	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.1162/tacl_a_00390	-
dc.identifier.scopus	eid_2-s2.0-85117644184	-
dc.identifier.volume	9	-
dc.identifier.spage	657	-
dc.identifier.epage	674	-
dc.identifier.eissn	2307-387X	-
dc.identifier.isi	WOS:000751952200040	-
