Field | Value |
---|---|
Title | Interpreting and analyzing neural networks for NLP : a knowledge management perspective |
Authors | Wu, Zhiyong (吳志勇) |
Issue Date | 2021 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Wu, Z. [吳志勇]. (2021). Interpreting and analyzing neural networks for NLP : a knowledge management perspective. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Language technology has become pervasive in everyday life, with applications like Google Search, Amazon Alexa, and Apple Siri. The success of such Natural Language Processing (NLP) systems is built on advances in deep neural networks, but it comes at the expense of models becoming less interpretable; such models are often perceived as "black boxes". The lack of model transparency not only hinders future research efforts to advance the field, but also limits these models' utility in applications with strict requirements on reliability, ethics, and legitimacy.
Opening up the black box of neural NLP models has attracted attention from different NLP sub-fields. This thesis offers a novel perspective and investigates methods for analyzing how NLP models cope with real-world knowledge, since the ability to connect textual symbols with human knowledge and to exploit that knowledge in decision making is key to NLP models' success.
Toward this goal, we first develop perturbed masking, a parameter-free method for analyzing and interpreting BERT, a widely used pre-trained language model. The analysis quantifies the linguistic knowledge BERT has captured and sheds light on its remarkable success in many NLP tasks. Second, we present a directly interpretable Multimodal Machine Translation (MMT) model, which includes an interpretable component that quantifies its reliance on visual knowledge. Our findings stress the importance of interpretability in MMT and suggest potential directions for improvement. Third, we explore how interpretability can lead to better model design, demonstrating this with an application to knowledge-based question answering. |
Degree | Doctor of Philosophy |
Subject | Neural networks (Computer science); Natural language processing (Computer science) |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/310288 |
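As context for the abstract above: perturbed masking, as commonly described, probes BERT without introducing any extra parameters by masking a token, then additionally masking a second token, and measuring how much the first token's contextual representation changes. The following is a minimal illustrative sketch of that two-pass idea, not the thesis code; it assumes the HuggingFace transformers library, and the helper names (`masked_repr`, `impact`) are hypothetical.

```python
# Sketch of a parameter-free, perturbed-masking-style probe (assumptions noted above).
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def masked_repr(token_ids, mask_positions, target_pos):
    """BERT's hidden vector for `target_pos` after replacing `mask_positions` with [MASK]."""
    ids = token_ids.clone()
    ids[0, mask_positions] = tokenizer.mask_token_id
    with torch.no_grad():
        hidden = model(input_ids=ids).last_hidden_state  # (1, seq_len, hidden_dim)
    return hidden[0, target_pos]

def impact(token_ids, i, j):
    """How much token i's representation shifts when token j is also masked out."""
    h_i = masked_repr(token_ids, [i], i)       # first pass: mask x_i only
    h_ij = masked_repr(token_ids, [i, j], i)   # second pass: mask x_i and x_j
    return torch.dist(h_i, h_ij).item()        # Euclidean distance as the impact score

sentence = "Interpretability matters for neural NLP models."
token_ids = tokenizer(sentence, return_tensors="pt").input_ids
n = token_ids.size(1)
# Pairwise impact matrix; special tokens ([CLS]/[SEP]) are included here for brevity.
impact_matrix = [[impact(token_ids, i, j) for j in range(n)] for i in range(n)]
```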
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Wu, Zhiyong | - |
dc.contributor.author | 吳志勇 | - |
dc.date.accessioned | 2022-01-29T16:16:04Z | - |
dc.date.available | 2022-01-29T16:16:04Z | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | Wu, Z. [吳志勇]. (2021). Interpreting and analyzing neural networks for NLP : a knowledge management perspective. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/310288 | - |
dc.description.abstract | Language technology has become pervasive in everyday life, with applications like Google Search, Amazon Alexa, and Apple Siri. The success of such Natural Language Processing (NLP) systems is built on advances in deep neural networks, but it comes at the expense of models becoming less interpretable; such models are often perceived as "black boxes". The lack of model transparency not only hinders future research efforts to advance the field, but also limits these models' utility in applications with strict requirements on reliability, ethics, and legitimacy. Opening up the black box of neural NLP models has attracted attention from different NLP sub-fields. This thesis offers a novel perspective and investigates methods for analyzing how NLP models cope with real-world knowledge, since the ability to connect textual symbols with human knowledge and to exploit that knowledge in decision making is key to NLP models' success. Toward this goal, we first develop perturbed masking, a parameter-free method for analyzing and interpreting BERT, a widely used pre-trained language model. The analysis quantifies the linguistic knowledge BERT has captured and sheds light on its remarkable success in many NLP tasks. Second, we present a directly interpretable Multimodal Machine Translation (MMT) model, which includes an interpretable component that quantifies its reliance on visual knowledge. Our findings stress the importance of interpretability in MMT and suggest potential directions for improvement. Third, we explore how interpretability can lead to better model design, demonstrating this with an application to knowledge-based question answering. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Neural networks (Computer science) | - |
dc.subject.lcsh | Natural language processing (Computer science) | - |
dc.title | Interpreting and analyzing neural networks for NLP : a knowledge management perspective | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2022 | - |
dc.identifier.mmsid | 991044467224703414 | - |