Appears in Collections: postgraduate thesis: Neural machine translation with a unified framework of transferable models
Field | Value
---|---
Title | Neural machine translation with a unified framework of transferable models
Authors | Wang, Yong (王永)
Advisors | Li, VOK; Huang, K
Issue Date | 2020
Publisher | The University of Hong Kong (Pokfulam, Hong Kong)
Citation | Wang, Y. [王永]. (2020). Neural machine translation with a unified framework of transferable models. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract | Teaching machines to communicate seamlessly across human languages is a longstanding challenge in artificial intelligence. Neural machine translation (NMT), which employs the encoder-decoder framework, has received increasing interest and achieved remarkable success in recent years. Unlike conventional pipelined statistical machine translation (SMT) with many separate components, NMT is conceptually simple and empirically powerful, owing to its end-to-end nature and strong flexibility during training and inference. This endows NMT with a particularly desirable ability: constructing translation models within a unified framework for practical scenarios, including multi-lingual, multi-domain, and extremely low-resource settings. In this thesis, we focus on constructing NMT systems with an effective unified framework, which has the benefits of 1) sharing knowledge effectively across languages and domains, and 2) practical deployment with fewer parameters in production. We present empirically effective and customized solutions for improving NMT systems with a unified framework. Extensive experiments on a variety of datasets demonstrate the effectiveness and universality of the proposed approaches. Contributions of this thesis include: 1) quantitatively analyzing the issues of zero-shot translation and successfully closing the performance gap between zero-shot translation and pivot-based translation; 2) proposing to explicitly transform domain knowledge for a multi-domain NMT model and achieving state-of-the-art performance in multi-domain NMT research; 3) proposing the large margin principle for the meta-learning algorithm and successfully pioneering the application of meta-learning to extremely low-resource translation in multi-lingual NMT; 4) analyzing, from two aspects, why an NMT system with a unified framework enables effective knowledge sharing and transfer: the model capacity of neural networks and the existence of redundant parameters in NMT systems. We also discuss several promising research directions: 1) tackling the problems that arise in real scenarios for low-resource language pairs; 2) incorporating the prevailing pre-training strategy into NMT systems; 3) closing the gap between parallel decoding and auto-regressive decoding; 4) removing the inductive bias of the decoder. We believe these research directions hold great potential for future intelligent technologies.
Degree | Doctor of Philosophy
Subject | Neural networks (Computer science)
Dept/Program | Electrical and Electronic Engineering
Persistent Identifier | http://hdl.handle.net/10722/286006
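
The abstract above describes a single unified model that serves many translation directions, including unseen (zero-shot) language pairs. As a rough, illustrative sketch only (not code from the thesis), the snippet below shows the widely used target-language-token trick for multilingual NMT, in which every source sentence is prefixed with a token naming the desired output language; the language pairs and sentences are hypothetical toy data.

```python
# Minimal sketch (an assumption, not the thesis's implementation) of a unified
# multilingual NMT setup. One shared encoder-decoder model is trained on source
# sentences prefixed with a target-language token; directions never seen during
# training (e.g. fr->de below) can then be requested at inference time, which is
# what "zero-shot translation" refers to.

from typing import List, Tuple

def tag_source(src: str, tgt_lang: str) -> str:
    """Prepend a target-language token so one model can handle all directions."""
    return f"<2{tgt_lang}> {src}"

# Hypothetical training data: only en<->fr and en<->de parallel sentences exist.
train_pairs: List[Tuple[str, str, str, str]] = [
    ("en", "fr", "hello world", "bonjour le monde"),
    ("fr", "en", "bonjour le monde", "hello world"),
    ("en", "de", "hello world", "hallo welt"),
    ("de", "en", "hallo welt", "hello world"),
]

# Every example becomes (tagged source, target); a single model is trained on all of them.
training_examples = [(tag_source(src, tgt_lang), tgt) for (_, tgt_lang, src, tgt) in train_pairs]

# Zero-shot inference: ask the shared model for a direction it never saw in training.
zero_shot_input = tag_source("bonjour le monde", "de")   # fr->de was never trained
print(zero_shot_input)                                   # "<2de> bonjour le monde"
```

Pivot-based translation, the baseline mentioned in contribution 1, would instead translate fr->en in one pass and en->de in a second pass through an intermediate (pivot) language.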
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Li, VOK | - |
dc.contributor.advisor | Huang, K | - |
dc.contributor.author | Wang, Yong | - |
dc.contributor.author | 王永 | - |
dc.date.accessioned | 2020-08-25T08:43:53Z | - |
dc.date.available | 2020-08-25T08:43:53Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Wang, Y. [王永]. (2020). Neural machine translation with a unified framework of transferable models. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/286006 | - |
dc.description.abstract | Teaching machines to communicate seamlessly across human languages is a longstanding challenge in artificial intelligence. Neural machine translation (NMT), which employs the encoder-decoder framework, has received increasing interest and achieved remarkable success in recent years. Unlike conventional pipelined statistical machine translation (SMT) with many separate components, NMT is conceptually simple and empirically powerful, owing to its end-to-end nature and strong flexibility during training and inference. This endows NMT with a particularly desirable ability: constructing translation models within a unified framework for practical scenarios, including multi-lingual, multi-domain, and extremely low-resource settings. In this thesis, we focus on constructing NMT systems with an effective unified framework, which has the benefits of 1) sharing knowledge effectively across languages and domains, and 2) practical deployment with fewer parameters in production. We present empirically effective and customized solutions for improving NMT systems with a unified framework. Extensive experiments on a variety of datasets demonstrate the effectiveness and universality of the proposed approaches. Contributions of this thesis include: 1) quantitatively analyzing the issues of zero-shot translation and successfully closing the performance gap between zero-shot translation and pivot-based translation; 2) proposing to explicitly transform domain knowledge for a multi-domain NMT model and achieving state-of-the-art performance in multi-domain NMT research; 3) proposing the large margin principle for the meta-learning algorithm and successfully pioneering the application of meta-learning to extremely low-resource translation in multi-lingual NMT; 4) analyzing, from two aspects, why an NMT system with a unified framework enables effective knowledge sharing and transfer: the model capacity of neural networks and the existence of redundant parameters in NMT systems. We also discuss several promising research directions: 1) tackling the problems that arise in real scenarios for low-resource language pairs; 2) incorporating the prevailing pre-training strategy into NMT systems; 3) closing the gap between parallel decoding and auto-regressive decoding; 4) removing the inductive bias of the decoder. We believe these research directions hold great potential for future intelligent technologies. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Neural networks (Computer science) | - |
dc.title | Neural machine translation with a unified framework of transferable models | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2020 | - |
dc.identifier.mmsid | 991044264455403414 | - |
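
Contribution 3 in the abstract applies meta-learning to extremely low-resource translation. The sketch below is a generic first-order MAML-style loop, offered only as illustration; it does not implement the thesis's large-margin principle, and the toy linear model, loss, and `task_batch` sampler are placeholders standing in for a full NMT model and per-language-pair data.

```python
# Generic first-order meta-learning (MAML-style) sketch; an assumption-laden
# illustration, not the thesis's algorithm. The shared initialization is
# meta-trained so that a few inner-loop steps on a new (low-resource) language
# pair adapt it quickly.

import copy
import torch
import torch.nn as nn

model = nn.Linear(8, 8)                           # toy stand-in for a full NMT model
meta_opt = torch.optim.SGD(model.parameters(), lr=1e-2)
inner_lr = 1e-1

def task_batch():
    """Placeholder: sample a small batch for one language pair (toy random data)."""
    x = torch.randn(4, 8)
    return x, x                                   # toy inputs/targets

for step in range(100):
    meta_opt.zero_grad()
    for _ in range(4):                            # a few sampled "tasks" (language pairs)
        fast = copy.deepcopy(model)               # task-specific copy for the inner loop
        inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)

        xs, ys = task_batch()                     # support set: one inner adaptation step
        inner_opt.zero_grad()
        nn.functional.mse_loss(fast(xs), ys).backward()
        inner_opt.step()

        xq, yq = task_batch()                     # query set: evaluate the adapted copy
        inner_opt.zero_grad()
        nn.functional.mse_loss(fast(xq), yq).backward()

        # First-order approximation: accumulate the query gradients of the adapted
        # copy onto the shared initialization, which the meta-optimizer then updates.
        for p, fp in zip(model.parameters(), fast.parameters()):
            p.grad = fp.grad.clone() if p.grad is None else p.grad + fp.grad
    meta_opt.step()
```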