Evaluation of Record Linkage Methods for Iterative Insertions

Published:2009 Issue:05 Volume:48 Page:429-437
ISSN:0026-1270
Container-title:Methods of Information in Medicine
language:en
Short-container-title:Methods Inf Med

Author:

Borg A., Pommerening K., Sariyar M.

Abstract

Objectives: Mathematical methods have been widely developed and applied in record linkage, an area of interdisciplinary research. However, comparative evaluations of record linkage methods are still underrepresented. In this paper, improvements of the Fellegi-Sunter model are compared with other elaborate classification methods in order to direct further research toward the most promising methodologies.

Methods: The task of linking records can be viewed as a special form of object identification. In addition to the Fellegi-Sunter model, we consider several non-stochastic methods and procedures for the record linkage task and perform an empirical evaluation on artificial and real data in the context of iterative insertions. This evaluation provides deeper insight into the empirical similarities and differences between different modelling frameworks for the record linkage problem. In addition, the effects of using string comparators on the performance of different matching algorithms are evaluated.

Results: Our central results show that stochastic record linkage based on the principle of the EM algorithm yields the best classification results when the calibration data differ structurally from the validation data. Bagging and boosting, together with support vector machines, are the best classification methods when the calibration and validation data have no major structural differences.

Conclusions: The most promising methodologies for record linkage in environments similar to the one considered in this paper seem to be stochastic ones.
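For readers unfamiliar with the stochastic approach named in the abstract, the following is a minimal sketch of Fellegi-Sunter weights estimated with an EM loop. It is not the authors' implementation: it assumes binary agree/disagree comparison vectors and conditional independence between fields, and the names `fs_em`, `match_weight`, and the toy `pairs` data are hypothetical, chosen only for illustration.

```python
import math

EPS = 1e-6  # keeps probabilities away from 0 and 1 so the logarithms stay finite


def _clip(x):
    return min(max(x, EPS), 1.0 - EPS)


def fs_em(patterns, n_iter=100, p=0.1, m_init=0.9, u_init=0.1):
    """Estimate the match share p and the per-field m/u probabilities.

    patterns: list of tuples of 0/1 (1 = the field agrees for that record pair).
    m[j] = P(field j agrees | pair is a match)
    u[j] = P(field j agrees | pair is a non-match)
    """
    n, k = len(patterns), len(patterns[0])
    m, u = [m_init] * k, [u_init] * k
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a match.
        post = []
        for gamma in patterns:
            pm, pu = p, 1.0 - p
            for j, agree in enumerate(gamma):
                pm *= m[j] if agree else 1.0 - m[j]
                pu *= u[j] if agree else 1.0 - u[j]
            post.append(pm / (pm + pu))
        # M-step: re-estimate the parameters from the posteriors.
        total = sum(post)
        p = _clip(total / n)
        for j in range(k):
            agree_match = sum(g * gamma[j] for g, gamma in zip(post, patterns))
            agree_non = sum((1.0 - g) * gamma[j] for g, gamma in zip(post, patterns))
            m[j] = _clip(agree_match / max(total, EPS))
            u[j] = _clip(agree_non / max(n - total, EPS))
    return p, m, u


def match_weight(gamma, m, u):
    """Fellegi-Sunter log2 likelihood-ratio weight of one comparison vector."""
    return sum(
        math.log2(m[j] / u[j]) if agree else math.log2((1.0 - m[j]) / (1.0 - u[j]))
        for j, agree in enumerate(gamma)
    )


# Toy comparison vectors (invented for illustration, not data from the paper).
pairs = (
    [(1, 1, 1)] * 20 + [(1, 1, 0)] * 5 + [(0, 0, 0)] * 100
    + [(1, 0, 0)] * 30 + [(0, 1, 0)] * 30 + [(0, 0, 1)] * 30
)
p_hat, m_hat, u_hat = fs_em(pairs)
w_agree = match_weight((1, 1, 1), m_hat, u_hat)
w_disagree = match_weight((0, 0, 0), m_hat, u_hat)
```

In such a setup, pairs whose weight exceeds an upper threshold would be classified as links and pairs below a lower threshold as non-links; how those thresholds are chosen is outside the scope of this sketch.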

Publisher

Georg Thieme Verlag KG

Subject

Health Information Management, Advanced and Specialized Nursing, Health Informatics

Link

http://www.thieme-connect.de/products/ejournals/pdf/10.3414/ME9238.pdf

Cited by 15 articles.

1. On the Concepts of Identity and Similarity in the Context of Biomedical Record Linkage;Studies in Health Technology and Informatics;2021-05-27

2. Linking health facility data from young adults aged 18-24 years to longitudinal demographic data: Experience from The Kilifi Health and Demographic Surveillance System;Wellcome Open Research;2020-02-27

3. Quo vadis Datenlinkage in Deutschland? Eine erste Bestandsaufnahme;Das Gesundheitswesen;2018-02-20

4. Linking health facility data from young adults aged 18-24 years to longitudinal demographic data: Experience from The Kilifi Health and Demographic Surveillance System;Wellcome Open Research;2017-07-17

5. Evaluation of a Binary Semi-supervised Classification Technique for Probabilistic Record Linkage;Methods of Information in Medicine;2016