Author:
Ge Xiaoling, Wang Yi, Xie Li, Shang Yujuan, Zhai Yihui, Huang Zhiheng, Huang Jianfeng, Ye Chengjie, Ma Ao, Li Wanting, Zhang Xiaobo, Xu Hong
Abstract
Background: Artificial intelligence (AI)-assisted diagnosis is considered a future direction for improving the efficiency and accuracy of pediatric disease diagnosis. However, existing AI-based research is far from sufficient because of limited data, inadequate coverage of disease types, or high construction costs, and it has not been applied on a large scale. We aimed to develop an accurate deep learning model trained on millions of real-world records to verify the feasibility of the technology and to build the whole process of outpatient auxiliary diagnosis.
Methods and findings: We applied Chinese natural language processing (NLP) and an end-to-end deep neural network classifier to outpatient electronic medical records (EMRs) from a single child care center in Shanghai, China, to process unstructured text and construct an auxiliary diagnostic model. All patients were aged 0 to 18 years. A training cohort with millions of records and an independent validation cohort with tens of thousands of records were included separately, and the diagnosis concordance rate (DCR) of the model was calculated for each disease group. Records with inconsistent diagnoses between human and AI were evaluated by a group of clinical experts, and the relative correct rate (RCR) was calculated to evaluate the diagnostic performance of the model. A total of 5,271,347 medical records, covering sixteen disease categories according to disease coding, were included in model training, reaching a DCR of 95.49% (95.48–95.51). For validation, 91,880 records were obtained from the validation dataset, which reached a DCR of 93.51% (93.35–93.67) and an FDCR of 72.04% (71.75–72.33). It was confirmed that the accuracy of the model was still higher than that of humans, with most RCR values greater than 1 in the validation dataset.
Conclusions: The deep learning system could support the diagnosis of pediatric diseases. It has high diagnostic performance and comprehensive disease coverage, is technically feasible, and can be promoted to multiple sites in the future.
Funding: The authors received no specific funding for this work.
Publisher
Cold Spring Harbor Laboratory