Affiliation:
1. Stanford University, United States. mhahn2@stanford.edu
2. Stanford University, United States. jurafsky@stanford.edu
3. University of California, Irvine, United States. rfutrell@uci.edu
Abstract
We introduce a theoretical framework for understanding and predicting the complexity of sequence classification tasks, using a novel extension of the theory of Boolean function sensitivity. The sensitivity of a function, given a distribution over input sequences, quantifies the number of disjoint subsets of the input sequence that can each be individually changed to change the output. We argue that standard sequence classification methods are biased towards learning low-sensitivity functions, so that tasks requiring high sensitivity are more difficult. To that end, we show analytically that simple lexical classifiers can only express functions of bounded sensitivity, and we show empirically that low-sensitivity functions are easier to learn for LSTMs. We then estimate sensitivity on 15 NLP tasks, finding that sensitivity is higher on challenging tasks collected in GLUE than on simple text classification tasks, and that sensitivity predicts the performance both of simple lexical classifiers and of vanilla BiLSTMs without pretrained contextualized embeddings. Within a task, sensitivity predicts which inputs are hard for such simple models. Our results suggest that the success of massively pretrained contextual representations stems in part from the fact that they provide representations from which information can be extracted by low-sensitivity decoders.
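As an illustration of the underlying notion, the following minimal sketch computes the classical sensitivity and block sensitivity of a Boolean function by brute force. It covers only the textbook Boolean definitions, not the paper's extension to distributions over natural-language sequences. The helper names (`flip`, `sensitivity`, `block_sensitivity`, `average_sensitivity`) and the two toy functions are hypothetical, and the uniform input distribution used for the average is an assumption made for simplicity.

```python
from itertools import combinations, product


def flip(x, block):
    """Return a copy of the bit tuple x with every position in block flipped."""
    return tuple(1 - b if i in block else b for i, b in enumerate(x))


def sensitivity(f, x):
    """Count the single positions whose flip changes f(x)."""
    return sum(f(flip(x, {i})) != f(x) for i in range(len(x)))


def block_sensitivity(f, x):
    """Largest number of pairwise disjoint blocks that each change f(x) when flipped.

    Brute force over all subsets, so only suitable for very short inputs.
    """
    n = len(x)
    # Every non-empty subset of positions whose joint flip changes the output.
    sensitive = [
        frozenset(c)
        for r in range(1, n + 1)
        for c in combinations(range(n), r)
        if f(flip(x, c)) != f(x)
    ]

    def best(used, start):
        # Exhaustively extend the current family with blocks disjoint from it.
        top = 0
        for j in range(start, len(sensitive)):
            if not (sensitive[j] & used):
                top = max(top, 1 + best(used | sensitive[j], j + 1))
        return top

    return best(frozenset(), 0)


def average_sensitivity(f, n):
    """Mean sensitivity over all 2**n inputs (uniform distribution, an assumption)."""
    inputs = list(product((0, 1), repeat=n))
    return sum(sensitivity(f, x) for x in inputs) / len(inputs)


# Toy functions (hypothetical examples, not taken from the paper).
def parity(x):
    return sum(x) % 2


def any_bit(x):
    return int(any(x))


if __name__ == "__main__":
    print(sensitivity(parity, (0, 1, 0, 1)))
    print(block_sensitivity(any_bit, (1, 0, 0, 0)))
    print(average_sensitivity(any_bit, 3))
```

Parity is maximally sensitive: on a 4-bit input every single flip changes the output, so both quantities equal 4. The OR-like `any_bit` function at `(1, 0, 0, 0)` changes only when the single set bit is cleared, so its sensitivity and block sensitivity there are both 1.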
Subject
Artificial Intelligence, Computer Science Applications, Linguistics and Language, Human-Computer Interaction, Communication