Affiliation:
1. University of Maryland, College Park, MD, USA
2. Georgia Institute of Technology, Atlanta, GA, USA
3. University of Oklahoma, Norman, OK, USA
Abstract
We argue that the mismatch between data and analytical methods, along with common practices for dealing with “messy” data, can lead to inaccurate conclusions. Specifically, using previously published data on racial bias and culture of honor, we show that manifest effects, and therefore theoretical conclusions, are highly dependent on how researchers decide to handle extreme scores and nonlinearities when data are analyzed with traditional approaches. Within least squares (LS) approaches, statistical effects appeared or disappeared on the basis of the inclusion or exclusion of as little as 1.5% (3 of 198) of the data, and highly predictive variables were masked by nonlinearities. We then demonstrate a new statistical modeling technique called the general monotone model (GeMM) and show that it has a number of desirable properties that may make it more appropriate for modeling messy data: Compared with a variety of well-established statistical algorithms, it is more robust to extreme scores, less affected by outlier analyses, and more robust to violations of linearity on both the response and predictor variables, and it frequently possesses greater statistical power. We argue that using procedures that make fewer assumptions about the data, such as GeMM, can lessen the need for researchers to use data-editing strategies (e.g., applying transformations or conducting outlier analyses) to satisfy often unrealistic statistical assumptions, leading to more consistent and accurate conclusions than traditional approaches to data analysis.
Subject
Sociology and Political Science
Cited by
6 articles.