From large language models to small logic programs: building global explanations from disagreeing local post-hoc explainers

Published:2024-07-08 Issue:2 Volume:38 Page:
ISSN:1387-2532
Container-title:Autonomous Agents and Multi-Agent Systems
language:en
Short-container-title:Auton Agent Multi-Agent Syst

Author:

Agiollo Andrea, Siebert Luciano Cavalcante, Murukannaiah Pradeep K., Omicini Andrea

Abstract

The expressive power and effectiveness of large language models (LLMs) is going to increasingly push intelligent agents towards sub-symbolic models for natural language processing (NLP) tasks in human–agent interaction. However, LLMs are characterised by a performance vs. transparency trade-off that hinders their applicability to such sensitive scenarios. This is the main reason behind many approaches focusing on local post-hoc explanations, recently proposed by the XAI community in the NLP realm. However, to the best of our knowledge, a thorough comparison among available explainability techniques is currently missing, as well as approaches for constructing global post-hoc explanations leveraging the local information. This is why we propose a novel framework for comparing state-of-the-art local post-hoc explanation mechanisms and for extracting logic programs surrogating LLMs. Our experiments, over a wide variety of text classification tasks, show how most local post-hoc explainers are loosely correlated, highlighting substantial discrepancies in their results. By relying on the proposed novel framework, we also show how it is possible to extract faithful and efficient global explanations for the original LLM over multiple tasks, enabling explainable and resource-friendly AI techniques.

Funder

EXPECTATION

FAIR—Future Artificial Intelligence Research

ENGINES — ENGineering INtElligent Systems around intelligent agent technologies

Alma Mater Studiorum - Università di Bologna

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s10458-024-09663-8.pdf

References (61 articles)

1. Zhang, L., Wang, S., & Liu, B. (2018). Deep learning for sentiment analysis: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. https://doi.org/10.1002/widm.1253

2. Hao, T., Li, X., He, Y., Wang, F. L., & Qu, Y. (2022). Recent progress in leveraging deep learning methods for question answering. Neural Computing and Applications, 34(4), 2765–2783. https://doi.org/10.1007/s00521-021-06748-3

3. Otter, D. W., Medina, J. R., & Kalita, J. K. (2021). A survey of the usages of deep learning for natural language processing. IEEE Transactions on Neural Networks and Learning Systems, 32(2), 604–624. https://doi.org/10.1109/TNNLS.2020.2979670

4. Warstadt, A., Singh, A., & Bowman, S. R. (2019). Neural network acceptability judgments. Transactions of the Association for Computational Linguistics, 7, 625–641. https://doi.org/10.1162/tacl_a_00290

5. Stahlberg, F. (2020). Neural machine translation: A review. Journal of Artificial Intelligence Research, 69, 343–418. https://doi.org/10.1613/jair.1.12007