Knowledge transfer for improved neural machine translation

Chen, Yun; 陈云

DOI: 10.5353/th_991044058178103414

Appears in Collections:
- HKU Theses Online
- Electrical & Electronic Engineering: Theses

postgraduate thesis: Knowledge transfer for improved neural machine translation

Title	Knowledge transfer for improved neural machine translation
Authors	Chen, Yun 陈云
Advisors	Li, VOK
Issue Date	2018
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Chen, Y. [陈云]. (2018). Knowledge transfer for improved neural machine translation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	Being able to communicate seamlessly across human languages has long been associated with the general success of artificial intelligence. Despite great success in the field of neural machine translation (NMT) since its invention in 2014 (Sutskever et al., 2014; Bahdanau et al., 2015), translation quality and speed have not yet satisfied users, especially for resource-scarce language pairs and on-device real-time applications (Koehn and Knowles, 2017). Standard NMT builds a translation model from source to target with maximum likelihood estimation (MLE) on parallel source-target corpora and then uses approximate decoding algorithms such as greedy decoding or beam search to translate source sentences at inference time. Training and decoding proceed independently, without interaction with other NMT models. This thesis proposes new training and decoding strategies that improve neural machine translation through knowledge transfer from other NMT models, covering the following topics:
• Improving zero-resource neural machine translation (ZNMT): We propose two pivot-based approaches to tackle this problem: a) a teacher-student framework which transfers the knowledge from a high-resource model (teacher) to a zero-resource model (student) by training the student under the supervision of the teacher; b) a multi-agent communication game in which the zero-resource model learns by playing the game with the high-resource model.
• Improving decoding efficiency for NMT: We propose a novel training strategy for training an actor-augmented decoder to optimize greedy decoding by transferring knowledge of a trained translation model.
• Improving training strategy for NMT: We propose Born Again Networks (BANs) for training NMT. In a manner reminiscent of Minsky’s Sequence of Teaching Selves (Minsky, 1991), we train a sequence of models of identical capacity, each improving on its predecessor by transferring knowledge from the previous model.
We compare our approaches with state-of-the-art NMT systems on different datasets, such as IWSLT, Europarl and WMT, and language pairs, such as German-English, Finnish-English, French-English, and Spanish-English. Extensive experiments demonstrate that all of our proposed methods can achieve their individual goals by effectively transferring the knowledge from auxiliary models or tasks to the target NMT model.
Degree	Doctor of Philosophy
Subject	Machine translating
Dept/Program	Electrical and Electronic Engineering
Persistent Identifier	http://hdl.handle.net/10722/265389
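
The abstract contrasts two approximate decoding algorithms, greedy decoding and beam search. The following is a minimal illustrative sketch of that difference only, not code from the thesis: the toy probability table TOY_MODEL, the fixed output length, and the function names are all hypothetical assumptions made for this example.

```python
import math

# Hypothetical toy "model": next-token probabilities given the prefix so far.
# These numbers are invented for illustration and do not come from the thesis.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def next_log_probs(prefix):
    """Return log-probabilities of each possible next token for a prefix."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[tuple(prefix)].items()}


def greedy_decode(length=2):
    """Pick the single most probable token at every step."""
    prefix, score = [], 0.0
    for _ in range(length):
        log_probs = next_log_probs(prefix)
        tok = max(log_probs, key=log_probs.get)
        prefix.append(tok)
        score += log_probs[tok]
    return prefix, score


def beam_search(beam_size=2, length=2):
    """Keep the beam_size best partial hypotheses at every step."""
    beams = [([], 0.0)]
    for _ in range(length):
        candidates = [
            (prefix + [tok], score + lp)
            for prefix, score in beams
            for tok, lp in next_log_probs(prefix).items()
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0]


if __name__ == "__main__":
    print("greedy:", greedy_decode())
    print("beam:  ", beam_search(beam_size=2))
```

In this toy table, greedy decoding commits to the locally best first token and ends with a lower-probability sequence than beam search with two hypotheses, which is the kind of gap the actor-augmented decoder described in the abstract is meant to narrow while keeping the cost of greedy decoding.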

DC Field	Value	Language
dc.contributor.advisor	Li, VOK	-
dc.contributor.author	Chen, Yun	-
dc.contributor.author	陈云	-
dc.date.accessioned	2018-11-29T06:22:32Z	-
dc.date.available	2018-11-29T06:22:32Z	-
dc.date.issued	2018	-
dc.identifier.citation	Chen, Y. [陈云]. (2018). Knowledge transfer for improved neural machine translation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/265389	-
dc.description.abstract	Being able to communicate seamlessly across human languages has long been associated with the general success of artificial intelligence. Despite great success in the field of neural machine translation (NMT) since its invention in 2014 (Sutskever et al., 2014; Bahdanau et al., 2015), translation quality and speed have not yet satisfied users, especially for resource-scarce language pairs and on-device real-time applications (Koehn and Knowles, 2017). Standard NMT builds a translation model from source to target with maximum likelihood estimation (MLE) on parallel source-target corpora and then uses approximate decoding algorithms such as greedy decoding or beam search to translate source sentences at inference time. Training and decoding proceed independently, without interaction with other NMT models. This thesis proposes new training and decoding strategies that improve neural machine translation through knowledge transfer from other NMT models, covering the following topics: • Improving zero-resource neural machine translation (ZNMT): We propose two pivot-based approaches to tackle this problem: a) a teacher-student framework which transfers the knowledge from a high-resource model (teacher) to a zero-resource model (student) by training the student under the supervision of the teacher; b) a multi-agent communication game in which the zero-resource model learns by playing the game with the high-resource model. • Improving decoding efficiency for NMT: We propose a novel training strategy for training an actor-augmented decoder to optimize greedy decoding by transferring knowledge of a trained translation model. • Improving training strategy for NMT: We propose Born Again Networks (BANs) for training NMT. In a manner reminiscent of Minsky’s Sequence of Teaching Selves (Minsky, 1991), we train a sequence of models of identical capacity, each improving on its predecessor by transferring knowledge from the previous model. We compare our approaches with state-of-the-art NMT systems on different datasets, such as IWSLT, Europarl and WMT, and language pairs, such as German-English, Finnish-English, French-English, and Spanish-English. Extensive experiments demonstrate that all of our proposed methods can achieve their individual goals by effectively transferring the knowledge from auxiliary models or tasks to the target NMT model.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Machine translating	-
dc.title	Knowledge transfer for improved neural machine translation	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Electrical and Electronic Engineering	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_991044058178103414	-
dc.date.hkucongregation	2018	-
dc.identifier.mmsid	991044058178103414	-
