Abstract
In the Big Data era, sampling remains a central theme. This paper investigates the characteristics of inverse sampling on two datasets, one real and one simulated, to determine at what point a big dataset becomes too small for inverse sampling to be used and to examine the impact of the subsample sampling rate. Through the simulation study and the real-data application, we find that the method, when used with an appropriate subsample size, performs well for both the mean and proportion parameters even on datasets smaller than typical big data. Several settings of selection-bias severity are considered in both the simulation study and the real-data application.
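The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, assuming that inverse sampling means drawing a subsample from a selection-biased dataset with probabilities inversely proportional to each unit's inclusion propensity. The population, the logistic selection mechanism, the subsample size, and the names `inverse_sample`, `pi`, and `n_sub` are all hypothetical choices made for illustration and are not taken from the paper. The propensities are treated as known here; in practice they would have to be estimated.

```python
import numpy as np


def inverse_sample(values, inclusion_prob, n_sub, rng):
    """Draw n_sub units with probability proportional to 1 / inclusion_prob."""
    weights = 1.0 / inclusion_prob
    probs = weights / weights.sum()
    # Without-replacement draw; only approximately proportional to `probs`,
    # which is acceptable here because the sampling rate is small.
    idx = rng.choice(len(values), size=n_sub, replace=False, p=probs)
    return values[idx]


rng = np.random.default_rng(0)

# Hypothetical finite population of outcomes.
N = 200_000
y = rng.normal(loc=50.0, scale=10.0, size=N)

# Assumed selection mechanism: units with larger y are more likely to
# end up in the big dataset, which creates selection bias.
pi = 1.0 / (1.0 + np.exp(-(y - 50.0) / 10.0))
selected = rng.random(N) < pi
big_y, big_pi = y[selected], pi[selected]

# Inverse sampling: the subsample undoes the selection mechanism on average.
sub = inverse_sample(big_y, big_pi, n_sub=2_000, rng=rng)

pop_mean = y.mean()          # target parameter
naive_mean = big_y.mean()    # biased: ignores the selection mechanism
inverse_mean = sub.mean()    # plain mean of the inverse-sampled subsample

print(f"population mean:        {pop_mean:.2f}")
print(f"naive big-data mean:    {naive_mean:.2f}")
print(f"inverse-sampled mean:   {inverse_mean:.2f}")
```

In this sketch the subsample is a small fraction of the biased dataset. Raising `n_sub` relative to `len(big_y)` increases the sampling rate, and shrinking `N` shrinks the dataset itself, which are the two quantities whose effect the study examines.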