SemEHR: A general-purpose semantic search system to surface semantic data from clinical notes for tailored care, trial recruitment, and clinical research

Author:

Wu Honghan (1,2), Toti Giulia (3), Morley Katherine I (3,4), Ibrahim Zina M (1,5), Folarin Amos (1,5), Jackson Richard (1), Kartoglu Ismail (6), Agrawal Asha (7), Stringer Clive (7), Gale Darren (7), Gorrell Genevieve (8), Roberts Angus (8), Broadbent Matthew (9), Stewart Robert (9,10), Dobson Richard JB (1,5)

Affiliation:

1. Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK

2. School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, China

3. National Addiction Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK

4. Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Australia

5. Farr Institute of Health Informatics Research, University College London, London, UK

6. InterDigital Europe, London, UK

7. King’s College Hospital NHS Foundation Trust, London, UK

8. Department of Computer Science, University of Sheffield, Sheffield, UK

9. South London and Maudsley NHS Foundation Trust, London, UK

10. Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK

Abstract

Objective: Unlocking the data contained within both structured and unstructured components of electronic health records (EHRs) has the potential to provide a step change in data available for secondary research use, generation of actionable medical insights, hospital management, and trial recruitment. To achieve this, we implemented SemEHR, an open source semantic search and analytics tool for EHRs.

Methods: SemEHR implements a generic information extraction (IE) and retrieval infrastructure by identifying contextualized mentions of a wide range of biomedical concepts within EHRs. Natural language processing annotations are further assembled at the patient level and extended with EHR-specific knowledge to generate a timeline for each patient. The semantic data are serviced via ontology-based search and analytics interfaces.

Results: SemEHR has been deployed at a number of UK hospitals, including the Clinical Record Interactive Search, an anonymized replica of the EHR of the UK South London and Maudsley National Health Service Foundation Trust, one of Europe’s largest providers of mental health services. In 2 Clinical Record Interactive Search–based studies, SemEHR achieved 93% (hepatitis C) and 99% (HIV) F-measure results in identifying true positive patients. At King’s College Hospital in London, as part of the CogStack program (github.com/cogstack), SemEHR is being used to recruit patients into the UK Department of Health 100 000 Genomes Project (genomicsengland.co.uk). The validation study suggests that the tool can validate previously recruited cases and is very fast at searching phenotypes; time for recruitment criteria checking was reduced from days to minutes. Validated on open intensive care EHR data, Medical Information Mart for Intensive Care III, the vital signs extracted by SemEHR can achieve around 97% accuracy.

Conclusion: Results from the multiple case studies demonstrate SemEHR’s efficiency: weeks or months of work can be done within hours or minutes in some cases. SemEHR provides a more comprehensive view of patients, bringing in more and unexpected insight compared to study-oriented bespoke IE systems. SemEHR is open source, available at https://github.com/CogStack/SemEHR.
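
The pipeline outlined in the Methods (contextualized mentions, patient-level assembly into timelines, ontology-based search) can be illustrated with a small Python sketch. This is not SemEHR's code or API: the `Mention` schema, the toy `ONTOLOGY`, and the functions `expand`, `build_timelines`, and `search_patients` are all hypothetical names introduced here, and the sample data are invented purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Mention:
    """One contextualized concept mention from a clinical note (hypothetical schema)."""
    patient_id: str
    note_date: date
    concept: str       # concept identifier, e.g. an ontology code
    negated: bool      # True if the text rules the concept out
    experiencer: str   # "patient", or e.g. "family" for family history


# Toy ontology (assumption): parent concept -> direct child concepts.
ONTOLOGY = {
    "viral_hepatitis": ["hepatitis_b", "hepatitis_c"],
    "hepatitis_c": ["chronic_hepatitis_c"],
}


def expand(concept):
    """Return the concept plus all of its descendants in the toy ontology."""
    found = {concept}
    stack = [concept]
    while stack:
        for child in ONTOLOGY.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def build_timelines(mentions):
    """Group mentions by patient and sort each group by note date."""
    timelines = defaultdict(list)
    for m in mentions:
        timelines[m.patient_id].append(m)
    for events in timelines.values():
        events.sort(key=lambda m: m.note_date)
    return dict(timelines)


def search_patients(timelines, concept):
    """Patients with an affirmed, patient-experienced mention of the concept or a descendant."""
    wanted = expand(concept)
    return sorted(
        pid
        for pid, events in timelines.items()
        if any(
            m.concept in wanted and not m.negated and m.experiencer == "patient"
            for m in events
        )
    )


# Invented sample data, for illustration only.
mentions = [
    Mention("p1", date(2015, 3, 1), "chronic_hepatitis_c", False, "patient"),
    Mention("p2", date(2016, 7, 9), "hepatitis_c", True, "patient"),
    Mention("p3", date(2014, 1, 5), "hepatitis_b", False, "family"),
    Mention("p1", date(2014, 11, 20), "hepatitis_c", False, "patient"),
]
timelines = build_timelines(mentions)
print(search_patients(timelines, "viral_hepatitis"))  # ['p1']
```

In this sketch, searching for the parent concept also matches its descendant concepts, while the negated mention (p2) and the family-history mention (p3) are excluded. This is the kind of filtering that contextualized mentions make possible.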

Funder

Medical Research Council

Arthritis Research UK

British Heart Foundation

Cancer Research UK

Chief Scientist Office

Economic and Social Research Council

Engineering and Physical Sciences Research Council

National Institute for Social Care and Health Research

Publisher

Oxford University Press (OUP)

Subject

Health Informatics

Cited by 87 articles.