Conference Paper: Universal Neural Machine Translation for Extremely Low Resource Languages

Title: Universal Neural Machine Translation for Extremely Low Resource Languages
Authors: Gu, J; Hassan, H; Devlin, J; Li, VOK
Issue Date: 2018
Publisher: Association for Computational Linguistics
Citation: The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018), New Orleans, Louisiana, 1-6 June 2018. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), p. 344-354
Abstract: In this paper, we propose a new universal machine translation approach focusing on languages with a limited amount of parallel data. Our proposed approach utilizes a transfer-learning approach to share lexical and sentence-level representations across multiple source languages into one target language. The lexical part is shared through a Universal Lexical Representation to support multi-lingual word-level sharing. The sentence-level sharing is represented by a model of experts from all source languages that share the source encoders with all other languages. This enables the low-resource language to utilize the lexical and sentence representations of the higher-resource languages. Our approach is able to achieve 23 BLEU on Romanian-English WMT2016 using a tiny parallel corpus of 6k sentences, compared to the 18 BLEU of a strong baseline system which uses multi-lingual training and back-translation. Furthermore, we show that the proposed approach can achieve almost 20 BLEU on the same dataset through fine-tuning a pre-trained multi-lingual system in a zero-shot setting.
Description: Oral: Machine Translation 1
Persistent Identifier: http://hdl.handle.net/10722/261951
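The Universal Lexical Representation described in the abstract shares word-level representations across languages by letting each source word attend over a small shared inventory of "universal token" embeddings, so that a low-resource language's words are expressed as mixtures of embeddings learned from higher-resource languages. The sketch below is a minimal, hypothetical illustration of that idea; the function name, dimensions, and temperature value are my own assumptions, not taken from the paper.

```python
import math

def universal_lexical_representation(query, universal_keys, universal_vals,
                                     temperature=0.05):
    """Sketch of a ULR-style lookup: attention over shared universal tokens.

    query          -- embedding of a source word, projected into the shared space
    universal_keys -- one key vector per universal token
    universal_vals -- one value vector per universal token (mixed into the output)
    """
    # Scaled dot-product score between the query and each universal key.
    scores = [sum(q * k for q, k in zip(query, key)) / temperature
              for key in universal_keys]
    # Numerically stable softmax over the universal tokens.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # The word's representation is the attention-weighted mixture of values.
    dim = len(universal_vals[0])
    return [sum(w * val[j] for w, val in zip(weights, universal_vals))
            for j in range(dim)]
```

With a low temperature, a query that closely matches one key recovers (almost exactly) that token's value vector, while ambiguous queries blend several universal tokens, which is what allows rare words in a low-resource language to borrow representations learned elsewhere.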

 

DC Field: Value
dc.contributor.author: Gu, J
dc.contributor.author: Hassan, H
dc.contributor.author: Devlin, J
dc.contributor.author: Li, VOK
dc.date.accessioned: 2018-09-28T04:50:51Z
dc.date.available: 2018-09-28T04:50:51Z
dc.date.issued: 2018
dc.identifier.citation: The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018), New Orleans, Louisiana, 1-6 June 2018. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), p. 344-354
dc.identifier.uri: http://hdl.handle.net/10722/261951
dc.description: Oral: Machine Translation 1
dc.description.abstract: In this paper, we propose a new universal machine translation approach focusing on languages with a limited amount of parallel data. Our proposed approach utilizes a transfer-learning approach to share lexical and sentence-level representations across multiple source languages into one target language. The lexical part is shared through a Universal Lexical Representation to support multi-lingual word-level sharing. The sentence-level sharing is represented by a model of experts from all source languages that share the source encoders with all other languages. This enables the low-resource language to utilize the lexical and sentence representations of the higher-resource languages. Our approach is able to achieve 23 BLEU on Romanian-English WMT2016 using a tiny parallel corpus of 6k sentences, compared to the 18 BLEU of a strong baseline system which uses multi-lingual training and back-translation. Furthermore, we show that the proposed approach can achieve almost 20 BLEU on the same dataset through fine-tuning a pre-trained multi-lingual system in a zero-shot setting.
dc.language: eng
dc.publisher: Association for Computational Linguistics
dc.relation.ispartof: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
dc.title: Universal Neural Machine Translation for Extremely Low Resource Languages
dc.type: Conference_Paper
dc.identifier.email: Li, VOK: vli@eee.hku.hk
dc.identifier.authority: Li, VOK=rp00150
dc.description.nature: link_to_OA_fulltext
dc.identifier.doi: 10.18653/v1/N18-1032
dc.identifier.hkuros: 292168
dc.identifier.hkuros: 306542
dc.identifier.volume: 1
dc.identifier.spage: 344
dc.identifier.epage: 354
dc.publisher.place: New Orleans, Louisiana
