Implementation of a Cohort Retrieval System for Clinical Data Repositories Using the Observational Medical Outcomes Partnership Common Data Model: Proof-of-Concept System Validation

Authors:

Liu Sijia, Wang Yanshan, Wen Andrew, Wang Liwei, Hong Na, Shen Feichen, Bedrick Steven, Hersh William, Liu Hongfang

Abstract

Background: Widespread adoption of electronic health records has enabled the secondary use of electronic health record data for clinical research and health care delivery. Natural language processing techniques have shown promise in extracting the information embedded in unstructured clinical data, and information retrieval techniques provide flexible and scalable solutions that can augment natural language processing systems for retrieving and ranking relevant records.

Objective: In this paper, we present the implementation of a cohort retrieval system that can execute textual cohort selection queries on both structured data and unstructured text. The system is called Cohort Retrieval Enhanced by Analysis of Text from Electronic Health Records (CREATE).

Methods: CREATE is a proof-of-concept system that combines structured queries with information retrieval techniques applied to natural language processing results in order to improve cohort retrieval performance. It uses the Observational Medical Outcomes Partnership Common Data Model to enhance model portability. The natural language processing component was used to extract common data model concepts from textual queries. We designed a hierarchical index to support common data model concept search using information retrieval techniques and frameworks.

Results: Our case study on 5 cohort identification queries, evaluated using the precision at 5 information retrieval metric at both the patient level and the document level, demonstrates that CREATE achieves a mean precision at 5 of 0.90. This outperforms systems that use only structured data or only unstructured text, which achieve mean precision at 5 values of 0.54 and 0.74, respectively.

Conclusions: The implementation and evaluation on Mayo Clinic Biobank data demonstrated that CREATE outperforms cohort retrieval systems that use only structured data or only unstructured text on complex textual cohort queries.
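To make the evaluation metric concrete, the following is a minimal sketch of how precision at 5 and its mean over several queries can be computed. It is not the authors' evaluation code: the function names and the relevance judgments are hypothetical and chosen only for illustration, and the sketch assumes that a query returning fewer than k results is scored as if the missing positions were non-relevant.

```python
from typing import Sequence


def precision_at_k(ranked_relevance: Sequence[bool], k: int = 5) -> float:
    """Fraction of the top-k ranked results that were judged relevant.

    ranked_relevance[i] is True if the result at rank i + 1 is relevant.
    Assumption: if fewer than k results are returned, the missing
    positions count as non-relevant (the denominator stays k).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_relevance[:k]
    return sum(bool(r) for r in top_k) / k


def mean_precision_at_k(per_query: Sequence[Sequence[bool]], k: int = 5) -> float:
    """Average precision at k over a set of queries."""
    if not per_query:
        raise ValueError("at least one query is required")
    scores = [precision_at_k(judgments, k) for judgments in per_query]
    return sum(scores) / len(scores)


# Hypothetical relevance judgments for two queries (not data from the paper).
example_judgments = [
    [True, True, True, True, False],  # 4 of the top 5 are relevant
    [True, True, True, True, True],   # 5 of the top 5 are relevant
]

if __name__ == "__main__":
    print(mean_precision_at_k(example_judgments, k=5))
```

The same function applies at either granularity described in the abstract: the ranked items can be patients or documents, as long as each item carries a relevance judgment.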

Publisher

JMIR Publications Inc.

Subject

Health Information Management, Health Informatics


Cited by 18 articles.

