Author:
Junior Augusto Afonso Guerra, Pereira Ramon Gonçalves, Gurgel Eli Iola, Cherchiglia Mariangela, Dias Leonardo Vinicius, Ávila Juliano, Santos Núbia, Reis Afonso, Acurcio Francisco Assis, Junior Wagner Meira
Abstract
Introduction
In Brazil, the National Health System (SUS) provides healthcare to the public. The system has multiple administrative databases; the major databases record hospital (SIH) and outpatient (SIA) procedures. Epidemiological information is collected for the whole population in subsystems such as mortality (SIM), live births (SINASC) and notifiable diseases (SINAN). Each subsystem has its own information system, which is able to provide information about consultations, clinical information and medicines dispensed. However, these systems are not linked, thereby preventing individual-centred analysis.
Objective
To describe the methods and results of parameter setting that are needed to execute the probabilistic deduplication of large administrative and epidemiological databases in Brazil and to create a National Health Database centred on the individual.
Methods
This paper shows the results of a record linkage model to integrate data from SIH, SIA, SIM, and SINAN, whose formats and attributes differ between systems and over time. These data consist of 1.3 billion records from 2000-2015. Probabilistic and deterministic record linkages were used to deduplicate these data. The Kappa statistic and clerical review were used to ensure the quality of the linkage. A graph algorithm with depth-first search was used to generate the identifiers.
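The abstract does not detail how the graph step works. The following is a minimal sketch, assuming that each accepted match is treated as an undirected edge between two records and that every connected component of the resulting graph receives one individual identifier. The function name `assign_identifiers` and the sample record keys are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def assign_identifiers(record_ids, linked_pairs):
    """Give every record in the same connected component the same identifier.

    record_ids: iterable of record keys (hypothetical, e.g. strings).
    linked_pairs: iterable of (a, b) pairs judged to be the same person.
    """
    # Build an undirected adjacency list from the accepted matches.
    adjacency = defaultdict(set)
    for a, b in linked_pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)

    identifiers = {}
    next_id = 0
    for start in record_ids:
        if start in identifiers:
            continue  # already reached from an earlier record
        # Iterative depth-first search; an explicit stack avoids
        # Python's recursion limit on long chains of linked records.
        identifiers[start] = next_id
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in identifiers:
                    identifiers[neighbour] = next_id
                    stack.append(neighbour)
        next_id += 1
    return identifiers


records = ["r1", "r2", "r3", "r4", "r5", "r6"]
matches = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
ids = assign_identifiers(records, matches)
```

In this toy input, `r1`, `r2` and `r3` share one identifier because `r1` and `r3` are connected through `r2`, even though they were never compared directly; `r6` has no matches and keeps an identifier of its own.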
Results
The deterministic deduplication process resulted in a database with 403,113,527 possible unique individuals. After probabilistic deduplication of that database, 159,703,805 unique individuals were identified. This result had an estimated false-positive error rate of 3.3% and an estimated false-negative error rate of 12.3%.
Conclusions
The National Health Database centred on the individual was generated and will allow researchers to use real-world evidence to conduct clinical, epidemiological, economic and other studies. This database represents a significant cohort, spanning 15 years of historical data and preserving patient privacy. The success of the process described will allow repeating and appending the data for future years and enable important studies to promote SUS efficiency and provide better treatments for patients.
Keywords
Data linkage, record linkage, Brazilian health database, SUS deduplication
Subject
Information Systems and Management, Health Informatics, Information Systems, Demography
Cited by
32 articles.