Affiliation:
1. Statistics Canada, SRID, 100 Tunney’s Pasture, Ottawa, Ontario K1A0T6, Canada
Abstract
This article considers the estimation of an association parameter between two variables in a finite population, when the variables are recorded separately in two population registers that are imperfectly linked. The main problem is the occurrence of linkage errors, which include bad links and missing links. A methodology is proposed for the case where clerical reviews can reliably determine the match status of a record pair, for example using names, demographic information and address information. It features clerical reviews on a probability sample of pairs, and regression estimators that are assisted by a statistical model of the comparison outcomes in a pair. Like other regression estimators, the proposed estimator is design-consistent whether or not the model is valid. It is also more efficient when the model holds.
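The design-consistency property can be illustrated with a generic model-assisted difference estimator. The sketch below is not the estimator derived in the article; it is a minimal illustration under stated assumptions. Each record pair k carries a value z_k (for instance a cross-product of the two variables), a model-predicted match probability p_k, and a known inclusion probability pi_k for the clerical-review sample. The names `difference_estimate` and `poisson_sample`, and the use of Poisson sampling, are hypothetical choices made for this example only.

```python
import random


def difference_estimate(values, model_probs, incl_probs, reviewed):
    """Estimate the total of values[k] over the truly matched pairs.

    values      -- z_k for every pair in the population of pairs
    model_probs -- model-predicted match probability p_k for every pair
    incl_probs  -- inclusion probability pi_k of every pair in the review sample
    reviewed    -- dict {k: m_k}, the match status (0 or 1) found by clerical
                   review, available for the sampled pairs only
    """
    # Model-based prediction of the total, computed over all pairs.
    model_part = sum(p * z for p, z in zip(model_probs, values))
    # Design-weighted sum of the model residuals on the reviewed sample.
    # This term corrects the model prediction, so the estimator stays
    # design-unbiased even when the model probabilities are wrong.
    correction = sum(
        (m - model_probs[k]) * values[k] / incl_probs[k]
        for k, m in reviewed.items()
    )
    return model_part + correction


def poisson_sample(incl_probs, rng):
    """Select each pair independently with its own inclusion probability."""
    return [k for k, pi in enumerate(incl_probs) if rng.random() < pi]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy population of pairs (assumed data, for illustration).
    true_match = [1, 0, 1, 0, 1, 0]
    values = [2.0, 3.0, 5.0, 7.0, 11.0, 13.0]
    model_probs = [0.8, 0.2, 0.7, 0.1, 0.9, 0.3]
    incl_probs = [0.5] * len(values)
    sample = poisson_sample(incl_probs, rng)
    reviewed = {k: true_match[k] for k in sample}
    print(difference_estimate(values, model_probs, incl_probs, reviewed))
```

When every pair is reviewed (all inclusion probabilities equal to 1), the correction term cancels the model prediction error exactly and the estimate equals the true total, whatever model is used. With a partial sample, a better model yields smaller residuals and therefore a smaller variance, which is the sense in which such estimators gain efficiency when the model holds.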