Reconciliation of inconsistent data sources using hidden Markov models

Authors:

Pankowska Paulina (1), Pavlopoulos Dimitris (1), Bakker Bart (1, 2), Oberski Daniel L. (3)

Affiliation:

1. Vrije Universiteit Amsterdam, The Netherlands

2. Statistics Netherlands, The Netherlands

3. Utrecht University, University Medical Center Utrecht, The Netherlands

Abstract

This paper discusses how National Statistical Institutes (NSIs) can use hidden Markov models (HMMs) to produce consistent official statistics for categorical, longitudinal variables from inconsistent sources. Two main challenges are addressed. First, reconciling inconsistent sources with multi-indicator HMMs requires linking the sources at the micro level, and such linkage might introduce bias due to linkage error. Second, estimating and applying HMMs on a regular basis is a complicated and expensive procedure, so it is preferable to re-use the error parameter estimates as a correction factor for a number of years. However, this might lead to biased structural estimates if measurement error changes over time or if the data collection process changes. Our results on these issues are highly encouraging and imply that the suggested method is appropriate for NSIs. Specifically, linkage error leads to (substantial) bias only in very extreme scenarios. Moreover, measurement error parameters are largely stable over time if no major changes in the data collection process occur. However, when a substantial change in the data collection process occurs, such as a switch from dependent interviewing (DI) to independent interviewing (INDI), re-using measurement error estimates is not advisable.
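To make the idea of a multi-indicator HMM concrete, the following is a minimal sketch and not the authors' implementation. It assumes two linked sources (hypothetically called A and B) that each report the same two-category variable with error, and it assumes the sources are conditionally independent given the latent true category. All parameter values, and the names `forward_pass`, `reconcile_last_wave`, `error_a` and `error_b`, are illustrative assumptions. The error matrices are held fixed, which mirrors the idea of re-using previously estimated error parameters as a correction factor.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def forward_pass(
    initial: Sequence[float],
    transition: Matrix,
    error_a: Matrix,
    error_b: Matrix,
    obs_a: Sequence[int],
    obs_b: Sequence[int],
) -> List[float]:
    """Forward algorithm for an HMM with two indicators per time point.

    Returns alpha, where alpha[s] is the joint probability of all
    observations and the latent state at the last time point being s.
    error_x[s][y] is P(source x reports y | true state s).
    """
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("both sources need the same, non-zero number of waves")
    n = len(initial)
    # First wave: initial state distribution times both indicators.
    alpha = [initial[s] * error_a[s][obs_a[0]] * error_b[s][obs_b[0]]
             for s in range(n)]
    # Later waves: propagate through the latent transitions, then
    # multiply by the probability of both observed reports.
    for y_a, y_b in zip(obs_a[1:], obs_b[1:]):
        alpha = [
            sum(alpha[r] * transition[r][s] for r in range(n))
            * error_a[s][y_a] * error_b[s][y_b]
            for s in range(n)
        ]
    return alpha


def reconcile_last_wave(
    initial: Sequence[float],
    transition: Matrix,
    error_a: Matrix,
    error_b: Matrix,
    obs_a: Sequence[int],
    obs_b: Sequence[int],
) -> List[float]:
    """Posterior of the true category at the last wave given both sources."""
    alpha = forward_pass(initial, transition, error_a, error_b, obs_a, obs_b)
    total = sum(alpha)
    return [a / total for a in alpha]


# Illustrative (assumed) parameters, not estimates from the paper.
initial = [0.7, 0.3]
transition = [[0.95, 0.05], [0.10, 0.90]]
error_a = [[0.90, 0.10], [0.15, 0.85]]  # hypothetical source A
error_b = [[0.98, 0.02], [0.05, 0.95]]  # hypothetical source B

# The two sources disagree at the second wave.
posterior = reconcile_last_wave(
    initial, transition, error_a, error_b, [0, 1, 1], [0, 0, 1]
)
print(posterior)
```

In this sketch the reconciled value for one person is a probability distribution over the true categories, not a single corrected record. Aggregating such posteriors over linked records is one way, under the stated assumptions, to obtain a consistent estimate from two inconsistent sources.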

Publisher

IOS Press

Subject

Statistics, Probability and Uncertainty; Economics and Econometrics; Management Information Systems

Cited by 3 articles.