Construction of a Breast Cancer Diagnosis Knowledge Graph Based on Chinese Electronic Medical Records: Development and Usability Study (Preprint)

Authors:

Li Xiaolong, Sun Shuifa, Tang Tinglong, Lu Ji, Zhang Lijuan, Yin Jie, Geng Qian, Wu Yirong

Abstract

BACKGROUND

Breast cancer is one of the most common malignant tumors in women and a serious threat to women's health worldwide. There is an urgent need for an effective data management and processing system that helps collect, manage, and use the variables involved in breast cancer diagnosis and treatment. As an important part of artificial intelligence, a knowledge graph is well suited to this problem.

OBJECTIVE

This study applies natural language processing (NLP) techniques to Chinese mammography reports to identify and extract features related to breast cancer, and uses those features to construct a knowledge graph for breast cancer diagnosis.

METHODS

This paper focuses on the frame structure of the knowledge graph and on feature extraction, which are the main challenges in constructing a Chinese breast cancer diagnosis knowledge graph. Based on mammography examination guidelines and specifications, as well as the clinical experience and recommendations of hospital experts, we define the entities, entity attributes, and entity relationships that make up the concept layer of the knowledge graph. We then extract mammographic features from mammography examination reports using deep learning models and use them to build the knowledge graph for breast cancer diagnosis.
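A concept layer of this kind can be pictured as a small schema of entity types, their attributes, and the relations allowed between them. The sketch below is only an illustration: the entity names (`Patient`, `MammographyReport`, `Mass`), the attribute names, and the relation names are hypothetical placeholders, not the schema defined in the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EntityType:
    """An entity type in the concept layer, with its attribute names."""
    name: str
    attributes: tuple = ()


@dataclass(frozen=True)
class RelationType:
    """A directed relation allowed between two entity types."""
    name: str
    source: str
    target: str


@dataclass
class ConceptLayer:
    """Holds entity types and the relations permitted between them."""
    entities: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_entity(self, name, attributes=()):
        self.entities[name] = EntityType(name, tuple(attributes))

    def add_relation(self, name, source, target):
        # Reject relations that refer to undefined entity types.
        for endpoint in (source, target):
            if endpoint not in self.entities:
                raise KeyError(f"unknown entity type: {endpoint}")
        self.relations.append(RelationType(name, source, target))


def build_example_schema():
    """Build a toy schema; all names here are hypothetical."""
    layer = ConceptLayer()
    layer.add_entity("Patient", ["age", "family_history"])
    layer.add_entity("MammographyReport", ["exam_date"])
    layer.add_entity("Mass", ["shape", "margin", "density"])
    layer.add_relation("HAS_REPORT", "Patient", "MammographyReport")
    layer.add_relation("DESCRIBES", "MammographyReport", "Mass")
    return layer


schema = build_example_schema()
for rel in schema.relations:
    print(f"({rel.source})-[{rel.name}]->({rel.target})")
```

Validating relation endpoints against the declared entity types is one simple way to keep extracted data consistent with the concept layer before it is loaded into a graph database.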

RESULTS

While annotating mammography examination reports for the NLP tasks, we identified 15 important types of mammographic features. To make the knowledge graph more general, we added 7 additional types of mammographic features. Mammographic features were extracted from a total of 1171 mammography examination reports. Overall, the model achieved a recognition accuracy of 98.97%, a precision of 97.16%, a recall of 98.06%, and an F1 score of 97.61%. Following the structure of the concept layer, we imported the demographic risk factors and the mammographic features extracted from the text reports into the Neo4j graph database to complete the construction of the knowledge graph.
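The reported F1 value can be checked against the other figures. The sketch below assumes that the 97.16% figure is the precision and the 98.06% figure is the recall, and that F1 is the usual harmonic mean of the two; the function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumption: 97.16 is precision and 98.06 is recall, both in percent.
f1 = f1_score(97.16, 98.06)
print(round(f1, 2))  # 97.61
```

Under that assumption the result matches the reported F1 of 97.61.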

CONCLUSIONS

We constructed a breast cancer diagnosis knowledge graph based on Chinese electronic medical records. Evaluation of the concept layer design, the data layer construction, and the application layer functions demonstrates the rationality, effectiveness, and practicability of the knowledge graph. This study provides a reference for the rapid design and construction of knowledge graphs for the diagnosis and treatment of other diseases.

Publisher

JMIR Publications Inc.
