SciTaiL: A Textual Entailment Dataset from Science Question Answering

Published:2018-04-27 Issue:1 Volume:32
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Khot Tushar, Sabharwal Ashish, Clark Peter

Abstract

We present a new dataset and model for textual entailment, derived from treating multiple-choice question-answering as an entailment problem. SciTail is the first entailment set that is created solely from natural sentences that already exist independently "in the wild" rather than sentences authored specifically for the entailment task. Different from existing entailment datasets, we create hypotheses from science questions and the corresponding answer candidates, and premises from relevant web sentences retrieved from a large corpus. These sentences are often linguistically challenging. This, combined with the high lexical similarity of premise and hypothesis for both entailed and non-entailed pairs, makes this new entailment task particularly difficult. The resulting challenge is evidenced by state-of-the-art textual entailment systems achieving mediocre performance on SciTail, especially in comparison to a simple majority class baseline. As a step forward, we demonstrate that one can improve accuracy on SciTail by 5% using a new neural model that exploits linguistic structure.
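The framing described in the abstract (hypotheses built from a question plus an answer candidate, premises taken from retrieved sentences) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `EntailmentPair` type, the `make_hypothesis` and `build_pairs` helpers, the `___` blank convention, and the label strings are hypothetical, and the naive string substitution is not the authors' procedure, which the abstract does not describe. The `majority_class_accuracy` function shows the kind of majority class baseline the abstract compares against.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class EntailmentPair:
    premise: str     # a sentence retrieved from a corpus
    hypothesis: str  # question + answer candidate rewritten as a statement


def make_hypothesis(question_stem: str, answer: str) -> str:
    """Fill the '___' blank of a cloze-style stem (illustrative assumption)."""
    return question_stem.replace("___", answer)


def build_pairs(question_stem, candidates, retrieved):
    """Pair every candidate hypothesis with every retrieved premise."""
    pairs = []
    for answer in candidates:
        hypothesis = make_hypothesis(question_stem, answer)
        for premise in retrieved:
            pairs.append(EntailmentPair(premise, hypothesis))
    return pairs


def majority_class_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


pairs = build_pairs(
    "___ is the process by which plants make food.",
    ["Photosynthesis", "Respiration"],
    ["Plants make their food through photosynthesis."],
)
for pair in pairs:
    print(pair.hypothesis, "<-", pair.premise)
```

Both hypotheses in this toy example share most of their words with the premise, although only the first is supported by it, which is the kind of lexical overlap the abstract identifies as a source of difficulty.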

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 55 articles.

1. How rationals boost textual entailment modeling: Insights from large language models;Computers and Electrical Engineering;2024-10

2. Incorporating external knowledge for text matching model;Computer Speech & Language;2024-08

3. Explanation based Bias Decoupling Regularization for Natural Language Inference;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

4. LMCK: pre-trained language models enhanced with contextual knowledge for Vietnamese natural language inference;Multimedia Tools and Applications;2024-06-22

5. A Survey of Text-Matching Techniques;Information;2024-06-05