Appears in Collections: postgraduate thesis: Neural machine translation with a unified framework of transferable models
Field | Value
---|---
Title | Neural machine translation with a unified framework of transferable models
Authors | Wang, Yong (王永)
Advisors | Li, VOK; Huang, K
Issue Date | 2020
Publisher | The University of Hong Kong (Pokfulam, Hong Kong)
Citation | Wang, Y. [王永]. (2020). Neural machine translation with a unified framework of transferable models. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract | Teaching machines to communicate seamlessly across human languages is a longstanding challenge in artificial intelligence. Neural machine translation (NMT), which employs the encoder-decoder framework, has received increasing interest and achieved remarkable success in recent years. Unlike conventional pipelined statistical machine translation (SMT) with many separate components, NMT is conceptually simple and empirically powerful, owing to its end-to-end nature and strong flexibility during training and inference. This endows NMT with a particularly desirable ability: constructing translation models within a unified framework for practical scenarios, including multi-lingual, multi-domain, and extremely low-resource settings. In this thesis, we focus on constructing NMT systems with an effective unified framework, which has the benefits of 1) sharing knowledge effectively across languages and domains, and 2) practical deployment with fewer parameters in production. We present empirically effective and customized solutions for improving NMT systems with a unified framework. Extensive experiments on a variety of datasets demonstrate the effectiveness and universality of the proposed approaches. Contributions of this thesis include: 1) quantitatively analyzing the issues of zero-shot translation and successfully closing the performance gap between zero-shot translation and pivot-based translation; 2) proposing to explicitly transform domain knowledge for a multi-domain NMT model and achieving state-of-the-art performance in multi-domain NMT research; 3) proposing the large margin principle for the meta-learning algorithm and successfully pioneering the application of meta-learning to extremely low-resource translation in multi-lingual NMT; 4) analyzing, from two aspects, why an NMT system with a unified framework enables effective knowledge sharing and transfer: the model capacity of neural networks and the existence of redundant parameters in NMT systems. We also discuss several promising research directions: 1) tackling the problems that arise in real scenarios for low-resource language pairs; 2) incorporating the prevailing pre-training strategy into NMT systems; 3) closing the gap between parallel decoding and auto-regressive decoding; 4) removing the inductive bias of the decoder. We believe these research directions hold great potential for future intelligent technologies.
Degree | Doctor of Philosophy
Subject | Neural networks (Computer science)
Dept/Program | Electrical and Electronic Engineering
Persistent Identifier | http://hdl.handle.net/10722/286006
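
The abstract above describes a single unified model that serves many translation directions, including unseen (zero-shot) language pairs. As a rough, illustrative sketch only (not code from the thesis), the snippet below shows the widely used target-language-token trick for multilingual NMT, in which every source sentence is prefixed with a token naming the desired output language; the language pairs and sentences are hypothetical toy data.

```python
# Minimal sketch (an assumption, not the thesis's implementation) of a unified
# multilingual NMT setup. One shared encoder-decoder model is trained on source
# sentences prefixed with a target-language token; directions never seen during
# training (e.g. fr->de below) can then be requested at inference time, which is
# what "zero-shot translation" refers to.

from typing import List, Tuple

def tag_source(src: str, tgt_lang: str) -> str:
    """Prepend a target-language token so one model can handle all directions."""
    return f"<2{tgt_lang}> {src}"

# Hypothetical training data: only en<->fr and en<->de parallel sentences exist.
train_pairs: List[Tuple[str, str, str, str]] = [
    ("en", "fr", "hello world", "bonjour le monde"),
    ("fr", "en", "bonjour le monde", "hello world"),
    ("en", "de", "hello world", "hallo welt"),
    ("de", "en", "hallo welt", "hello world"),
]

# Every example becomes (tagged source, target); a single model is trained on all of them.
training_examples = [(tag_source(src, tgt_lang), tgt) for (_, tgt_lang, src, tgt) in train_pairs]

# Zero-shot inference: ask the shared model for a direction it never saw in training.
zero_shot_input = tag_source("bonjour le monde", "de")   # fr->de was never trained
print(zero_shot_input)                                   # "<2de> bonjour le monde"
```

Pivot-based translation, the baseline mentioned in contribution 1, would instead translate fr->en in one pass and en->de in a second pass through an intermediate (pivot) language.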
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Li, VOK | - |
dc.contributor.advisor | Huang, K | - |
dc.contributor.author | Wang, Yong | - |
dc.contributor.author | 王永 | - |
dc.date.accessioned | 2020-08-25T08:43:53Z | - |
dc.date.available | 2020-08-25T08:43:53Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Wang, Y. [王永]. (2020). Neural machine translation with a unified framework of transferable models. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/286006 | - |
dc.description.abstract | Teaching machines to communicate seamlessly across human languages is a longstanding challenge in artificial intelligence. Neural machine translation (NMT), which employs the encoder-decoder framework, has received increasing interest and achieved remarkable success in recent years. Unlike conventional pipelined statistical machine translation (SMT) with many separate components, NMT is conceptually simple and empirically powerful, owing to its end-to-end nature and strong flexibility during training and inference. This endows NMT with a particularly desirable ability: constructing translation models within a unified framework for practical scenarios, including multi-lingual, multi-domain, and extremely low-resource settings. In this thesis, we focus on constructing NMT systems with an effective unified framework, which has the benefits of 1) sharing knowledge effectively across languages and domains, and 2) practical deployment with fewer parameters in production. We present empirically effective and customized solutions for improving NMT systems with a unified framework. Extensive experiments on a variety of datasets demonstrate the effectiveness and universality of the proposed approaches. Contributions of this thesis include: 1) quantitatively analyzing the issues of zero-shot translation and successfully closing the performance gap between zero-shot translation and pivot-based translation; 2) proposing to explicitly transform domain knowledge for a multi-domain NMT model and achieving state-of-the-art performance in multi-domain NMT research; 3) proposing the large margin principle for the meta-learning algorithm and successfully pioneering the application of meta-learning to extremely low-resource translation in multi-lingual NMT; 4) analyzing, from two aspects, why an NMT system with a unified framework enables effective knowledge sharing and transfer: the model capacity of neural networks and the existence of redundant parameters in NMT systems. We also discuss several promising research directions: 1) tackling the problems that arise in real scenarios for low-resource language pairs; 2) incorporating the prevailing pre-training strategy into NMT systems; 3) closing the gap between parallel decoding and auto-regressive decoding; 4) removing the inductive bias of the decoder. We believe these research directions hold great potential for future intelligent technologies. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Neural networks (Computer science) | - |
dc.title | Neural machine translation with a unified framework of transferable models | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2020 | - |
dc.identifier.mmsid | 991044264455403414 | - |
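
Contribution 3 in the abstract applies meta-learning to extremely low-resource translation. The sketch below is a generic first-order MAML-style loop, offered only as illustration; it does not implement the thesis's large-margin principle, and the toy linear model, loss, and `task_batch` sampler are placeholders standing in for a full NMT model and per-language-pair data.

```python
# Generic first-order meta-learning (MAML-style) sketch; an assumption-laden
# illustration, not the thesis's algorithm. The shared initialization is
# meta-trained so that a few inner-loop steps on a new (low-resource) language
# pair adapt it quickly.

import copy
import torch
import torch.nn as nn

model = nn.Linear(8, 8)                           # toy stand-in for a full NMT model
meta_opt = torch.optim.SGD(model.parameters(), lr=1e-2)
inner_lr = 1e-1

def task_batch():
    """Placeholder: sample a small batch for one language pair (toy random data)."""
    x = torch.randn(4, 8)
    return x, x                                   # toy inputs/targets

for step in range(100):
    meta_opt.zero_grad()
    for _ in range(4):                            # a few sampled "tasks" (language pairs)
        fast = copy.deepcopy(model)               # task-specific copy for the inner loop
        inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)

        xs, ys = task_batch()                     # support set: one inner adaptation step
        inner_opt.zero_grad()
        nn.functional.mse_loss(fast(xs), ys).backward()
        inner_opt.step()

        xq, yq = task_batch()                     # query set: evaluate the adapted copy
        inner_opt.zero_grad()
        nn.functional.mse_loss(fast(xq), yq).backward()

        # First-order approximation: accumulate the query gradients of the adapted
        # copy onto the shared initialization, which the meta-optimizer then updates.
        for p, fp in zip(model.parameters(), fast.parameters()):
            p.grad = fp.grad.clone() if p.grad is None else p.grad + fp.grad
    meta_opt.step()
```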