Author:
Kenneth Benoit, Drew Conway, Benjamin E. Lauderdale, Michael Laver, Slava Mikhaylov
Abstract
Empirical social science often relies on data that are not observed in the field, but are transformed into quantitative variables by expert researchers who analyze and interpret qualitative raw sources. While generally considered the most valid way to produce data, this expert-driven process is inherently difficult to replicate or to assess on grounds of reliability. Using crowd-sourcing to distribute text for reading and interpretation by massive numbers of nonexperts, we generate results comparable to those using experts to read and interpret the same texts, but do so far more quickly and flexibly. Crucially, the data we collect can be reproduced and extended transparently, making crowd-sourced datasets intrinsically reproducible. This focuses researchers’ attention on the fundamental scientific objective of specifying reliable and replicable methods for collecting the data needed, rather than on the content of any particular dataset. We also show that our approach works straightforwardly with different types of political text, written in different languages. While findings reported here concern text analysis, they have far-reaching implications for expert-generated data in the social sciences.
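The abstract's central methodological claim is that aggregating many redundant nonexpert codings of the same text can reproduce expert-coded data. The sketch below is a minimal illustration of that idea only, not the authors' actual scaling model: the sentence counts, number of codings per sentence, error magnitudes, and the -2..+2 policy scale are all hypothetical assumptions, and the data are synthetic.

```python
# Illustrative sketch (assumptions, not the paper's method): each sentence gets
# several noisy crowd codings on the same scale an expert uses; averaging the
# redundant codings yields an aggregate that tracks the expert codes closely.
import random
import statistics

random.seed(42)

n_sentences = 200          # hypothetical number of manifesto sentences
codings_per_sentence = 5   # hypothetical number of crowd workers per sentence

# Synthetic "true" position of each sentence on an assumed -2..+2 policy scale.
true_positions = [random.uniform(-2, 2) for _ in range(n_sentences)]

# One expert coding per sentence, with modest error.
expert_codes = [t + random.gauss(0, 0.3) for t in true_positions]

# Several crowd codings per sentence, with larger error, averaged per sentence.
crowd_means = [
    statistics.mean(t + random.gauss(0, 1.0) for _ in range(codings_per_sentence))
    for t in true_positions
]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# With enough redundant codings, the crowd aggregate correlates highly with the expert codes.
print("expert vs. crowd-mean correlation:", round(pearson(expert_codes, crowd_means), 3))
```

Increasing codings_per_sentence in this toy setup pushes the correlation toward 1, which is the intuition behind substituting many nonexpert readers for a single expert coder.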
Publisher
Cambridge University Press (CUP)
Subject
Political Science and International Relations; Sociology and Political Science
Cited by
181 articles.