Evaluating Deep Learning Techniques for Natural Language Inference
Published: 2023-02-16
Volume: 13
Issue: 4
Page: 2577
ISSN: 2076-3417
Container-title: Applied Sciences
Short-container-title: Applied Sciences
Language: en
Author:
Eleftheriadis Petros (1), Perikos Isidoros (1,2), Hatzilygeroudis Ioannis (1)
Affiliation:
1. Computer Engineering and Informatics Department, University of Patras, 26504 Patras, Greece
2. Computer Technology Institute and Press “Diophantus”, 26504 Patras, Greece
Abstract
Natural language inference (NLI) is one of the most important natural language understanding (NLU) tasks. NLI expresses the ability to infer information during spoken or written communication. The NLI task concerns determining the entailment relation of a pair of sentences, called the premise and the hypothesis. If the premise entails the hypothesis, the pair is labeled as an “entailment”. If the hypothesis contradicts the premise, the pair is labeled as a “contradiction”, and if there is not enough information to infer a relationship, the pair is labeled as “neutral”. In this paper, we present experimental results of using modern deep learning (DL) models, such as the pre-trained transformer BERT, as well as additional models that rely on LSTM networks, for the NLI task. We compare five DL models (and variations of them) on eight widely used NLI datasets. We trained each model and fine-tuned its hyperparameters to achieve the best performance on each dataset, obtaining state-of-the-art results in some cases. Next, we examined the inference ability of the models on the BreakingNLI dataset, which evaluates a model’s ability to recognize lexical inferences. Finally, we tested the generalization power of our models across all the NLI datasets. In the first part of our experimentation, the results indicate the performance advantage of the pre-trained transformers BERT, RoBERTa, and ALBERT over the other deep learning models. This advantage became more evident when the models were tested on the BreakingNLI dataset. We also observed a pattern of improved performance when larger models are used. However, ALBERT achieved quite remarkable performance given that it has 18 times fewer parameters.
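The three-way labeling scheme described in the abstract can be sketched as a minimal decision step: a model (e.g. a fine-tuned BERT) produces one score per class for a (premise, hypothesis) pair, and the highest-scoring class becomes the label. The sketch below is illustrative only; the scores, sentence pair, and helper function are assumptions, not taken from the paper.

```python
# Minimal sketch of the three-way NLI decision described above.
# A real model would produce the class scores; here they are
# hard-coded purely for illustration.

LABELS = ("entailment", "contradiction", "neutral")

def nli_label(scores):
    """Map one score per class to the predicted NLI label (argmax)."""
    best = max(range(len(LABELS)), key=lambda i: scores[i])
    return LABELS[best]

# Hypothetical scores for the pair:
#   premise:    "A man is playing a guitar on stage."
#   hypothesis: "A person is performing music."
scores = [4.1, -2.3, 0.7]  # entailment, contradiction, neutral
print(nli_label(scores))   # -> entailment
```

In practice the scores come from a classification head on top of the transformer's pooled output, trained with cross-entropy over these three classes.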
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
References: 42 articles.
1. MacCartney, B., and Manning, C.D. (2009, January 7–9). An Extended Model of Natural Logic. Proceedings of the Eighth International Conference on Computational Semantics, Tilburg, The Netherlands.
2. de Marneffe, M.-C., Pado, S., and Manning, C.D. (2009, January 6). Multi-word expressions in textual inference: Much ado about nothing? Proceedings of the 2009 Workshop on Applied Textual Inference, ACL-IJCNLP 2009, Suntec, Singapore.
3. Quiñonero-Candela, J., Dagan, I., Magnini, B., and d’Alché-Buc, F. (2006). Machine Learning Challenges: Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Textual Entailment, Proceedings of the First PASCAL Machine Learning Challenges Workshop, MLCW 2005, Southampton, UK, 11–13 April 2005, Springer.
4. Bowman, S.R., Angeli, G., Potts, C., and Manning, C. (2015). A large annotated corpus for learning natural language inference. arXiv.
5. Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., and Bowman, S. (2018). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. arXiv.
Cited by: 4 articles.