QASC: A Dataset for Question Answering via Sentence Composition

Published: 2020-04-03 Issue: 05 Volume: 34 Page: 8082-8090
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title: AAAI

Author:

Khot Tushar, Clark Peter, Guerquin Michal, Jansen Peter, Sabharwal Ashish

Abstract

Composing knowledge from multiple pieces of texts is a key challenge in multi-hop question answering. We present a multi-hop reasoning dataset, Question Answering via Sentence Composition (QASC), that requires retrieving facts from a large corpus and composing them to answer a multiple-choice question. QASC is the first dataset to offer two desirable properties: (a) the facts to be composed are annotated in a large corpus, and (b) the decomposition into these facts is not evident from the question itself. The latter makes retrieval challenging as the system must introduce new concepts or relations in order to discover potential decompositions. Further, the reasoning model must then learn to identify valid compositions of these retrieved facts using common-sense reasoning. To help address these challenges, we provide annotation for supporting facts as well as their composition. Guided by these annotations, we present a two-step approach to mitigate the retrieval challenges. We use other multiple-choice datasets as additional training data to strengthen the reasoning model. Our proposed approach improves over current state-of-the-art language models by 11% (absolute). The reasoning and retrieval problems, however, remain unsolved as this model still lags by 20% behind human performance.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 29 articles.

1. bjEnet: a fast and accurate software bug localization method in natural language semantic space; Software Quality Journal; 2024-07-22

2. Option-Differentiated Clue Augmentation for Commonsense Question Answering; 2024 International Joint Conference on Neural Networks (IJCNN); 2024-06-30

3. ArQuAD: An Expert-Annotated Arabic Machine Reading Comprehension Dataset; Cognitive Computation; 2024-03-11

4. Datasets for Large Language Models: A Comprehensive Survey; 2024-03-04

5. Explainable Product Classification for Customs; ACM Transactions on Intelligent Systems and Technology; 2023-12