What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models

Published: 2019-07-17 Volume: 33 Page: 6309-6317
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title: AAAI

Author:

Dalvi Fahim, Durrani Nadir, Sajjad Hassan, Belinkov Yonatan, Bau Anthony, Glass James

Abstract

Despite the remarkable evolution of deep neural networks in natural language processing (NLP), their interpretability remains a challenge. Previous work largely focused on what these models learn at the representation level. We break this analysis down further and study individual dimensions (neurons) in the vector representation learned by end-to-end neural models in NLP tasks. We propose two methods: Linguistic Correlation Analysis, based on a supervised method to extract the most relevant neurons with respect to an extrinsic task, and Cross-model Correlation Analysis, an unsupervised method to extract salient neurons w.r.t. the model itself. We evaluate the effectiveness of our techniques by ablating the identified neurons and reevaluating the network’s performance for two tasks: neural machine translation (NMT) and neural language modeling (NLM). We further present a comprehensive analysis of neurons with the aim to address the following questions: i) how localized or distributed are different linguistic properties in the models? ii) are certain neurons exclusive to some properties and not others? iii) is the information more or less distributed in NMT vs. NLM? and iv) how important are the neurons identified through the linguistic correlation method to the overall task? Our code is publicly available as part of the NeuroX toolkit (Dalvi et al. 2019a). This paper is a non-archived version of the paper published at AAAI (Dalvi et al. 2019b).
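The select-then-ablate evaluation described in the abstract can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the abstract does not specify the supervised model, so a plain logistic-regression probe is used here, neurons are ranked by absolute probe weight, and ablation is done by zeroing activations. The toy data and all function names are hypothetical and are not part of the NeuroX toolkit.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_toy_activations(num_pairs=20):
    """Toy 4-neuron activations: neuron 2 encodes the label, the others do not."""
    X, y = [], []
    for k in range(num_pairs):
        for label in (0, 1):
            signal = 1.0 if label == 1 else -1.0
            # Neurons 0, 1 and 3 depend only on k, so they are identical
            # for both labels and carry no information about the property.
            X.append([((k * 7) % 5 - 2) * 0.1,
                      ((k * 3) % 7 - 3) * 0.1,
                      signal,
                      ((k * 11) % 3 - 1) * 0.1])
            y.append(label)
    return X, y

def train_linear_probe(X, y, epochs=200, lr=0.5):
    """Fit a logistic-regression probe with full-batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def rank_neurons(w):
    """Neuron indices ordered from largest to smallest absolute probe weight."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)

def ablate(X, neuron_ids):
    """Return a copy of the activations with the given neurons set to zero."""
    drop = set(neuron_ids)
    return [[0.0 if j in drop else v for j, v in enumerate(row)] for row in X]

def accuracy(w, b, X, y):
    correct = 0
    for xi, yi in zip(X, y):
        pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
        correct += int(pred == yi)
    return correct / len(y)

X, y = make_toy_activations()
w, b = train_linear_probe(X, y)
ranking = rank_neurons(w)

acc_full = accuracy(w, b, X, y)
acc_top_ablated = accuracy(w, b, ablate(X, ranking[:1]), y)
acc_bottom_ablated = accuracy(w, b, ablate(X, ranking[-1:]), y)
print(ranking[0], acc_full, acc_top_ablated, acc_bottom_ablated)
```

In this toy setup, removing the top-ranked neuron drops the probe to chance while removing the bottom-ranked neuron leaves accuracy unchanged, which is the kind of contrast an ablation study looks for. Real model activations are far less clean, so the gap would typically be smaller and spread across many neurons.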

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 21 articles.

1. CR-CAM: Generating explanations for deep neural networks by contrasting and ranking features;Pattern Recognition;2024-05

2. Multimodal Large Language Models in Healthcare: Applications, Challenges, and Future Outlook (Preprint);Journal of Medical Internet Research;2024-04-13

3. The Impact of Activation Patterns in the Explainability of Large Language Models – A Survey of recent advances;Anais da XIX Escola Regional de Banco de Dados (ERBD 2024);2024-04-10

4. Explainability for Large Language Models: A Survey;ACM Transactions on Intelligent Systems and Technology;2024-02-22

5. What do end-to-end speech models learn about speaker, language and channel information? A layer-wise and neuron-level analysis;Computer Speech & Language;2024-01