Evaluating Deep Learning Techniques for Natural Language Inference
Published: 2023-02-16
Volume: 13
Issue: 4
Page: 2577
ISSN: 2076-3417
Container-title: Applied Sciences
Short-container-title: Applied Sciences
Language: en
Author:
Eleftheriadis Petros (1), Perikos Isidoros (1,2), Hatzilygeroudis Ioannis (1)
Affiliation:
1. Computer Engineering and Informatics Department, University of Patras, 26504 Patras, Greece
2. Computer Technology Institute and Press “Diophantus”, 26504 Patras, Greece
Abstract
Natural language inference (NLI) is one of the most important natural language understanding (NLU) tasks. NLI expresses the ability to infer information during spoken or written communication. The NLI task concerns determining the entailment relation of a pair of sentences, called the premise and the hypothesis. If the premise entails the hypothesis, the pair is labeled as an “entailment”. If the hypothesis contradicts the premise, the pair is labeled as a “contradiction”, and if there is not enough information to infer a relationship, the pair is labeled as “neutral”. In this paper, we present experimental results of using modern deep learning (DL) models, such as the pre-trained transformer BERT, as well as additional models that rely on LSTM networks, for the NLI task. We compare five DL models (and variations of them) on eight widely used NLI datasets. We trained each model and fine-tuned its hyperparameters to achieve the best performance on each dataset, obtaining state-of-the-art results in some cases. Next, we examined the inference ability of the models on the BreakingNLI dataset, which evaluates a model’s ability to recognize lexical inferences. Finally, we tested the generalization power of our models across all the NLI datasets. In the first part of our experimentation, the results indicate the performance advantage of the pre-trained transformers BERT, RoBERTa, and ALBERT over the other deep learning models. This advantage became more evident when the models were tested on the BreakingNLI dataset. We also observed a pattern of improved performance when larger models are used. However, ALBERT achieved quite remarkable performance given that it has 18 times fewer parameters.
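The three-way labeling scheme described in the abstract can be sketched as a minimal decision step: a model (e.g. a fine-tuned BERT) produces one score per class for a (premise, hypothesis) pair, and the highest-scoring class becomes the label. The sketch below is illustrative only; the scores, sentence pair, and helper function are assumptions, not taken from the paper.

```python
# Minimal sketch of the three-way NLI decision described above.
# A real model would produce the class scores; here they are
# hard-coded purely for illustration.

LABELS = ("entailment", "contradiction", "neutral")

def nli_label(scores):
    """Map one score per class to the predicted NLI label (argmax)."""
    best = max(range(len(LABELS)), key=lambda i: scores[i])
    return LABELS[best]

# Hypothetical scores for the pair:
#   premise:    "A man is playing a guitar on stage."
#   hypothesis: "A person is performing music."
scores = [4.1, -2.3, 0.7]  # entailment, contradiction, neutral
print(nli_label(scores))   # -> entailment
```

In practice the scores come from a classification head on top of the transformer's pooled output, trained with cross-entropy over these three classes.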
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
References: 42 articles.
1. MacCartney, B., and Manning, C.D. (2009, January 7–9). An Extended Model of Natural Logic. Proceedings of the Eighth International Conference on Computational Semantics, Tilburg, The Netherlands.
2. de Marneffe, M.-C., Pado, S., and Manning, C.D. (2009, January 6). Multi-word expressions in textual inference: Much ado about nothing? Proceedings of the 2009 Workshop on Applied Textual Inference, ACL-IJCNLP 2009, Suntec, Singapore.
3. Quiñonero-Candela, J., Dagan, I., Magnini, B., and d’Alché-Buc, F. (2006). Machine Learning Challenges: Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Textual Entailment, Proceedings of the First PASCAL Machine Learning Challenges Workshop, MLCW 2005, Southampton, UK, 11–13 April 2005, Springer.
4. Bowman, S.R., Angeli, G., Potts, C., and Manning, C. (2015). A large annotated corpus for learning natural language inference. arXiv.
5. Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., and Bowman, S. (2018). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. arXiv.
Cited by: 4 articles.