Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability-Reference-Cited by-同舟云学术

Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability

Published:2022-06-28 Issue:10 Volume:36 Page:11209-11219
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
language:
Short-container-title:AAAI

Author:

Richardson Kyle,Sabharwal Ashish

Abstract

Investigating the reasoning abilities of transformer models, and discovering new challenging tasks for them, has been a topic of much interest. Recent studies have found these models to be surprisingly strong at performing deductive reasoning over formal logical theories expressed in natural language. A shortcoming of these studies, however, is that they do not take into account that logical theories, when sampled uniformly at random, do not necessarily lead to hard instances. We propose a new methodology for creating challenging algorithmic reasoning datasets that focus on natural language satisfiability (NLSat) problems. The key idea is to draw insights from empirical sampling of hard propositional SAT problems and from complexity-theoretic studies of language. This methodology allows us to distinguish easy from hard instances, and to systematically increase the complexity of existing reasoning benchmarks such as RuleTaker. We find that current transformers, given sufficient training data, are surprisingly robust at solving the resulting NLSat problems of substantially increased difficulty. They also exhibit some degree of scale-invariance—the ability to generalize to problems of larger size and scope. Our results, however, reveal important limitations too: careful sampling of training data is crucial for building models that generalize to larger problems, and transformer models’ limited scale-invariance suggests they are far from learning robust deductive reasoning algorithms.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 1 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. A Genetic Algorithm For Boolean Semiring Matrix Factorization With Applications To Graph Mining;2022 IEEE International Conference on Big Data (Big Data);2022-12-17