Affiliation:
1. Memorial Sloan Kettering Cancer Center, New York, NY
2. Weill Cornell Medical College, New York, NY
Abstract
PURPOSE Electronic medical records (EMRs) are a vast resource of potentially mineable data that can be used to complement and extend clinical trials. Extracting and analyzing EMR data are impeded by technical complexities associated with large, multiformat databases. We sought to develop and validate a framework that would overcome the difficulties associated with EMR data and create a simple, portable, and expandable system to better use this resource.
MATERIALS AND METHODS An Internet-accessible program was developed in Python that applied user-defined criteria to identify and extract patient data from Memorial Sloan Kettering databases. A Worker Application composed of individual modules was developed to identify each patient’s functional status, smoking status, and treatment classification. The validity of this approach was tested by identifying, extracting, and analyzing data from a patient cohort that paralleled a practice-changing, prospective, randomized phase III clinical trial performed at a different institution. We called this a synthetic clinical trial.
RESULTS Our synthetic clinical trial identified and extracted data on a cohort of 281 patients with lung cancer who matched inclusion criteria and received their first treatment between October 2003 and July 2010. The data extraction modules were precise and accurate, with F-measures greater than 0.98. Results were similar in directionality and magnitude to the chosen comparator clinical trial.
CONCLUSION Our framework offers an accurate and user-friendly interface for identifying and extracting EMR data that can be used to create synthetic clinical trials. Additional studies are needed to validate this approach in other patient cohorts, replicate our findings, and leverage this methodology to improve patient care and accelerate drug development.
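For readers unfamiliar with the F-measure cited in the results, the sketch below shows how such a score is commonly computed from the counts of a module's correct and incorrect extractions, as the harmonic mean of precision and recall. The function name `f_measure` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not state which F-measure variant or validation procedure the authors used.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from confusion-matrix counts.

    Hypothetical helper for illustration; not the authors' code.
    """
    predicted = true_positives + false_positives   # everything the module flagged
    actual = true_positives + false_negatives      # everything it should have flagged

    # Guard against division by zero when a class is never predicted or never present.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts for a single extraction module (illustrative only).
precision, recall, f1 = f_measure(true_positives=99,
                                  false_positives=1,
                                  false_negatives=1)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a score above 0.98 implies that both precision and recall are high, not just one of them.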
Publisher
American Society of Clinical Oncology (ASCO)
Cited by
4 articles.