postgraduate thesis: Enhanced neural machine translation with external resources
Field | Value
---|---|
Title | Enhanced neural machine translation with external resources
Authors | Chen, Guanhua (陳冠華)
Advisors | Pan, J; Wang, WP
Issue Date | 2022
Publisher | The University of Hong Kong (Pokfulam, Hong Kong)
Citation | Chen, G. [陳冠華]. (2022). Enhanced neural machine translation with external resources. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract | Neural machine translation (NMT) is the task of translating a source-language sentence into a target language with neural networks. NMT has achieved superior performance to conventional machine translation approaches and has become the dominant commercial translation technology owing to its strong performance and its advantages in training and deployment. However, NMT still faces several challenges. First, due to its end-to-end nature, it is hard to incorporate lexical constraints, i.e., pre-specified translation fragments, into NMT; such constraints are useful for domain adaptation and interactive NMT. Second, as a data-driven approach, adequate training of an NMT model requires large-scale parallel datasets, so for low-resource language pairs where parallel sentences are limited, NMT performance degrades significantly. In this thesis, we aim to enhance NMT with external resources such as lexical constraints and a multilingual pretrained encoder. We discuss the following topics: (1) Incorporating lexical constraints into NMT. We propose two lexically constrained NMT methods, one based on data augmentation and one based on a novel decoding algorithm. The former trains the NMT model on augmented training data that uses sampled target phrases as constraints, while the latter modifies the decoding algorithm using more accurate word alignment. (2) Incorporating a multilingual pretrained encoder into NMT. We first propose SixT, a zero-shot multilingual NMT model that incorporates a multilingual pretrained encoder with a position-disentangled encoder and a capacity-enhanced decoder; SixT is trained with a novel two-stage transferability-enhanced training framework. We then extend SixT to multilingual fine-tuning and propose SixT+. SixT+ is not only a multilingual NMT model but can also serve as a pretrained model for downstream cross-lingual text generation tasks, such as unsupervised machine translation for extremely low-resource languages and zero-shot cross-lingual abstractive summarization. We compare the proposed models against baselines on various datasets and language pairs in extensive experiments. The results show that our approaches significantly outperform the baselines, demonstrating that we effectively incorporate external resources to improve neural machine translation. Moreover, we conduct a series of in-depth analyses to better understand the proposed methods. All code for these works is publicly available.
Degree | Doctor of Philosophy
Subject | Neural networks (Computer science); Machine translating
Dept/Program | Computer Science
Persistent Identifier | http://hdl.handle.net/10722/318423
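The data-augmentation approach to lexical constraints described in the abstract can be illustrated with a minimal sketch: sample a contiguous phrase from the target side of a training pair and append it to the source sentence between marker tokens, so the model learns to copy pre-specified fragments into its output. The `<sep>`/`<eoc>` markers, the phrase-sampling policy, and the function name are illustrative assumptions, not the thesis's exact recipe.

```python
import random

def augment_with_constraint(src_tokens, tgt_tokens, max_len=3, rng=random):
    """Append a sampled target phrase to the source as a soft lexical constraint.

    The <sep> / <eoc> marker tokens are hypothetical placeholders; the actual
    tagging scheme used in the thesis may differ.
    """
    if not tgt_tokens:
        return src_tokens
    # Sample a contiguous target phrase of length 1..max_len.
    length = rng.randint(1, min(max_len, len(tgt_tokens)))
    start = rng.randint(0, len(tgt_tokens) - length)
    constraint = tgt_tokens[start:start + length]
    # Training input becomes: source <sep> constraint <eoc>; the target side
    # is unchanged, so the model learns to realize the constraint in its output.
    return src_tokens + ["<sep>"] + constraint + ["<eoc>"]

src = "wir wollen das übersetzen".split()
tgt = "we want to translate that".split()
print(augment_with_constraint(src, tgt))
```

At inference time, the same markers would carry a user-supplied phrase instead of a sampled one, which is what makes the augmentation useful for domain adaptation and interactive translation.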
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Pan, J | - |
dc.contributor.advisor | Wang, WP | - |
dc.contributor.author | Chen, Guanhua | - |
dc.contributor.author | 陳冠華 | - |
dc.date.accessioned | 2022-10-10T08:18:57Z | - |
dc.date.available | 2022-10-10T08:18:57Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | Chen, G. [陳冠華]. (2022). Enhanced neural machine translation with external resources. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/318423 | - |
dc.description.abstract | Neural machine translation (NMT) is the task of translating a source-language sentence into a target language with neural networks. NMT has achieved superior performance to conventional machine translation approaches and has become the dominant commercial translation technology owing to its strong performance and its advantages in training and deployment. However, NMT still faces several challenges. First, due to its end-to-end nature, it is hard to incorporate lexical constraints, i.e., pre-specified translation fragments, into NMT; such constraints are useful for domain adaptation and interactive NMT. Second, as a data-driven approach, adequate training of an NMT model requires large-scale parallel datasets, so for low-resource language pairs where parallel sentences are limited, NMT performance degrades significantly. In this thesis, we aim to enhance NMT with external resources such as lexical constraints and a multilingual pretrained encoder. We discuss the following topics: (1) Incorporating lexical constraints into NMT. We propose two lexically constrained NMT methods, one based on data augmentation and one based on a novel decoding algorithm. The former trains the NMT model on augmented training data that uses sampled target phrases as constraints, while the latter modifies the decoding algorithm using more accurate word alignment. (2) Incorporating a multilingual pretrained encoder into NMT. We first propose SixT, a zero-shot multilingual NMT model that incorporates a multilingual pretrained encoder with a position-disentangled encoder and a capacity-enhanced decoder; SixT is trained with a novel two-stage transferability-enhanced training framework. We then extend SixT to multilingual fine-tuning and propose SixT+. SixT+ is not only a multilingual NMT model but can also serve as a pretrained model for downstream cross-lingual text generation tasks, such as unsupervised machine translation for extremely low-resource languages and zero-shot cross-lingual abstractive summarization. We compare the proposed models against baselines on various datasets and language pairs in extensive experiments. The results show that our approaches significantly outperform the baselines, demonstrating that we effectively incorporate external resources to improve neural machine translation. Moreover, we conduct a series of in-depth analyses to better understand the proposed methods. All code for these works is publicly available. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Neural networks (Computer science) | - |
dc.subject.lcsh | Machine translating | - |
dc.title | Enhanced neural machine translation with external resources | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2022 | - |
dc.identifier.mmsid | 991044600200403414 | - |
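The two-stage transferability-enhanced training framework behind SixT, as described in the abstract, can be sketched at a high level: stage one keeps the multilingual pretrained encoder frozen while the newly initialized decoder side is trained, and stage two unfreezes the encoder for joint fine-tuning. The parameter groups and the freezing policy below are hypothetical illustrations, not the exact SixT schedule.

```python
def training_plan(stage):
    """Return which parameter groups are trainable in each training stage.

    The group names and the freezing policy are illustrative assumptions;
    see the thesis for the actual SixT training schedule.
    """
    groups = ["encoder.pretrained", "decoder", "output_projection"]
    if stage == 1:
        # Stage 1: freeze the pretrained encoder; train only the new
        # decoder-side parameters so they adapt to the encoder's representations.
        frozen = {"encoder.pretrained"}
    elif stage == 2:
        # Stage 2: unfreeze everything for joint fine-tuning.
        frozen = set()
    else:
        raise ValueError("stage must be 1 or 2")
    return {g: g not in frozen for g in groups}

print(training_plan(1))
print(training_plan(2))
```

In a real implementation these flags would be mapped onto the `requires_grad` attributes of the corresponding parameter tensors before each stage begins.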