From ‘F’ to ‘A’ on the N.Y. Regents Science Exams: An Overview of the Aristo Project

Published:2020-12-28 Issue:4 Volume:41 Page:39-53
ISSN:2371-9621
Container-title:AI Magazine
Short-container-title:AIMag

Author:

Clark Peter, Etzioni Oren, Khot Tushar, Khashabi Daniel, Mishra Bhavana, Richardson Kyle, Sabharwal Ashish, Schoenick Carissa, Tafjord Oyvind, Tandon Niket, Bhakthavatsalam Sumithra, Groeneveld Dirk, Guerquin Michal, Schmitz Michael

Abstract

AI has achieved remarkable mastery over games such as Chess, Go, and Poker, and even Jeopardy!, but the rich variety of standardized exams has remained a landmark challenge. Even as recently as 2016, the best AI system could achieve merely 59.3 percent on an 8th grade science exam. This article reports success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90 percent on the exam’s nondiagram, multiple choice (NDMC) questions. In addition, our Aristo system, building upon the success of recent language models, exceeded 83 percent on the corresponding Grade 12 Science Exam NDMC questions. The results, on unseen test questions, are robust across different test years and different variations of this kind of test. They demonstrate that modern natural language processing methods can result in mastery on this task. While not a full solution to general question-answering (the questions are limited to 8th grade multiple-choice science), it represents a significant milestone for the field.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

Artificial Intelligence

Cited by 16 articles.

1. G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning;Proceedings of the 2024 International Conference on Multimedia Retrieval;2024-05-30

2. Knowledge-aware adaptive graph network for commonsense question answering;Journal of Intelligent Information Systems;2024-03-19

3. Heterogeneous-Graph Reasoning With Context Paraphrase for Commonsense Question Answering;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024

4. Subgraph Retrieval Enhanced by Graph-Text Alignment for Commonsense Question Answering;Lecture Notes in Computer Science;2024

5. How to Use Language Expert to Assist Inference for Visual Commonsense Reasoning;2023 IEEE International Conference on Data Mining Workshops (ICDMW);2023-12-04