Comparison of Population Characteristics in Real-World Clinical Oncology Databases in the US: Flatiron Health, SEER, and NPCR

Published: 2020-03-18

Author:

Ma Xinran, Long Lura, Moon Sharon, Adamson Blythe J.S., Baxi Shrujal S.

Abstract

Background and Objective: The Surveillance, Epidemiology, and End Results (SEER) program and the National Program of Cancer Registries (NPCR) are authoritative sources for population cancer surveillance and research in the US. An increasing number of recent oncology studies are based on the electronic health record (EHR)-derived de-identified databases created and maintained by Flatiron Health. This report describes the differences in the originating sources and data development processes, and compares baseline demographic characteristics in the cancer-specific databases from Flatiron Health, SEER, and NPCR, to facilitate interpretation of research findings based on these sources.

Methods: Patients with documented care from January 1, 2011 through May 31, 2019 in a series of EHR-derived Flatiron Health de-identified databases covering multiple tumor types were included. SEER incidence data (obtained from the SEER 18 database) and NPCR incidence data (obtained from the US Cancer Statistics public use database) for malignant cases diagnosed from January 1, 2011 to December 31, 2016 were included. Comparisons of demographic variables were performed across all disease-specific databases, for all patients and for the subset diagnosed with advanced-stage disease.

Results: As of May 2019, a total of 201,570 patients with 19 different cancer types were included in Flatiron Health datasets. In an overall comparison to national cancer registries, patients in the Flatiron Health databases had similar sex and geographic distributions, but appeared to be diagnosed with later stages of disease, and their age distribution differed from that of the other datasets. For variables such as stage and race, Flatiron Health databases had a greater degree of incompleteness. These trends varied by cancer type.

Conclusions: These three databases present general similarities in demographic and geographic distribution, but there are overarching differences across the populations they cover. Differences in data sourcing (medical oncology EHRs vs cancer registries) and disparities in sampling approaches and rules of data acquisition may explain some of these divergences. Furthermore, unlike the steady information flow entered into registries, the availability of medical oncology EHR-derived information reflects the extent of involvement of medical oncology clinics at different points in the specialty management of individual diseases, resulting in inter-disease variability. These differences should be considered when interpreting study results obtained with these databases.

Publisher

Cold Spring Harbor Laboratory

References (35 articles)

1. US Food and Drug Administration (b). Framework for FDA’s real-world evidence program. December 2018. Accessed at https://www.fda.gov/media/120060/download on December 23, 2019

2. Data rich, information poor: Can we use electronic health records to create a learning healthcare system for pharmaceuticals?; Clin Pharmacol Ther; 2019

3. National Cancer Institute. Surveillance, Epidemiology, and End Results Program. Overview of the SEER Program. Accessed at https://seer.cancer.gov/about/overview.html on February 18, 2020

4. Centers for Disease Control and Prevention. National Program of Cancer Registries. Accessed at https://www.cdc.gov/cancer/npcr/index.htm on February 18, 2020

5. Health Information Technology (HITECH Act). Accessed at https://www.healthit.gov/sites/default/files/hitech_act_excerpt_from_arra_with_index.pdf on February 18, 2020

Cited by 160 articles.

1. Variation in telemedicine usage in gynecologic cancer: Are we widening or narrowing disparities?; Gynecologic Oncology; 2024-05

2. A federated learning system for precision oncology in Europe: DigiONE; Nature Medicine; 2024-01-09

3. Sociodemographic associations with uptake of novel therapies for acute myeloid leukemia; Blood Cancer Journal; 2023-12-21

4. Matching by OS Prognostic Score to Construct External Controls in Lung Cancer Clinical Trials; Clinical Pharmacology & Therapeutics; 2023-11-30

5. Association between treatment and improvements in overall survival of patients with advanced/metastatic non–small cell lung cancer since 2011: A study in the United States, Canada, and Germany using retrospective real-world databases; Cancer; 2023-11-07