Abstract
Web-based survey data collection has become increasingly popular, and limitations on in-person data collection during the COVID-19 pandemic have fueled this growth. However, the anonymity of the online environment increases the risk of fraudulent responses from bots or from people who complete surveys only to receive incentives, a major threat to data integrity. As part of a study of COVID-19 and the return to in-person school, we implemented a web-based survey of parents in Maryland between December 2021 and July 2022. Recruitment relied, in part, on social media advertisements. Despite implementing many existing best practices, we found that the survey was targeted by sophisticated fraudsters. In response, we iteratively improved survey security. In this paper, we describe efforts to identify and prevent fraudulent online survey responses. Informed by this experience, we provide specific, actionable recommendations for identifying and preventing online survey fraud in future research. Some strategies can be deployed within the data collection platform, such as careful crafting of survey links, Internet Protocol (IP) address logging to identify duplicate responses, and comparison of client-side and server-side time stamps to identify responses that may have been completed by respondents outside of the survey’s target geography. Other strategies can be implemented during the survey design phase. These include a 2-stage design in which respondents must be found eligible on a preliminary screener before receiving a personalized link, within-survey and cross-survey validation questions, “speed bump” questions to thwart careless or computerized responders, and optional open-ended survey questions to help identify fraudsters. We describe best practices for ongoing monitoring and post-completion survey data review and verification, including algorithms to expedite some aspects of data review and quality assurance. Such strategies are increasingly critical to safeguarding survey-based public health research.
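As an illustration of two of the platform-level checks named above (duplicate IP addresses and client-side versus server-side time stamp comparison), the following is a minimal sketch and not the authors' implementation. The field names (`id`, `ip`, `server_utc`, `client_local`), the function names, and the 15-minute rounding are all assumptions made for this example. It also assumes the target geography is Maryland, which observes US Eastern time (UTC-5 in winter, UTC-4 in summer).

```python
from collections import Counter
from datetime import datetime

# Assumption: valid respondents are in US Eastern time (UTC-5 or UTC-4).
EXPECTED_OFFSETS_MIN = {-300, -240}


def utc_offset_minutes(client_local: str, server_utc: str) -> int:
    """Estimate the respondent's UTC offset, rounded to the nearest 15 minutes."""
    delta = datetime.fromisoformat(client_local) - datetime.fromisoformat(server_utc)
    return round(delta.total_seconds() / 60 / 15) * 15


def flag_responses(responses):
    """Return {response id: [reasons]} for responses that warrant manual review."""
    ip_counts = Counter(r["ip"] for r in responses)
    flags = {}
    for r in responses:
        reasons = []
        if ip_counts[r["ip"]] > 1:
            reasons.append("duplicate_ip")
        offset = utc_offset_minutes(r["client_local"], r["server_utc"])
        if offset not in EXPECTED_OFFSETS_MIN:
            reasons.append("timezone_mismatch")
        if reasons:
            flags[r["id"]] = reasons
    return flags


# Hypothetical records; the IPs come from reserved documentation ranges.
responses = [
    {"id": "r1", "ip": "203.0.113.5",
     "server_utc": "2022-01-10T15:00:00", "client_local": "2022-01-10T10:00:12"},
    {"id": "r2", "ip": "203.0.113.5",
     "server_utc": "2022-01-10T15:05:00", "client_local": "2022-01-10T10:05:03"},
    {"id": "r3", "ip": "198.51.100.7",
     "server_utc": "2022-01-10T16:00:00", "client_local": "2022-01-11T00:00:05"},
    {"id": "r4", "ip": "192.0.2.9",
     "server_utc": "2022-01-10T17:00:00", "client_local": "2022-01-10T12:00:01"},
]
flags = flag_responses(responses)
print(flags)
```

Flags of this kind are best treated as prompts for manual review rather than automatic exclusions: household members may legitimately share an IP address, and a misconfigured device clock or a respondent who is traveling can produce an unexpected offset.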
Funder
National Institutes of Health
Publisher
Public Library of Science (PLoS)