Affiliation:
1. RTI International, Washington, DC, USA
2. University of Mannheim, Germany
3. University of Maryland, College Park, MD, USA
Abstract
Social media are becoming more popular as a source of data for social science researchers. These data are plentiful and offer the potential to answer new research questions at smaller geographies and for rarer subpopulations. When deciding whether to use data from social media, it is useful to learn as much as possible about the data and their source. Social media data have properties quite different from those with which many social scientists are used to working, so the assumptions often used to plan and manage a project may no longer hold. For example, social media data can be too large to process on a single machine, they come in file formats with which many researchers are unfamiliar, and they require a level of data transformation and processing that has rarely been required with more traditional data sources (e.g., survey data). Unfortunately, this type of information is often not obvious ahead of time, as much of this knowledge is gained through word of mouth and experience. In this article, we attempt to document several challenges and opportunities encountered when working with Reddit, the self-proclaimed “front page of the Internet” and a popular social media site. Specifically, we provide descriptive information about the Reddit site and its users, tips for using organic data from Reddit for social science research, some ideas for conducting a survey on Reddit, and lessons learned in merging survey responses with Reddit posts. While this article is specific to Reddit, researchers may also view it as a list of the types of information one may seek to acquire before conducting a project that uses any type of social media data.
Subject
Law, Library and Information Sciences, Computer Science Applications, General Social Sciences
Cited by
80 articles.