Author:
Yokoi H., Fujita S., Takabayashi K., Suzuki T.
Abstract
Summary
Objectives:
We extracted index terms related to diseases recorded in hospital discharge summaries and examined the capability of the vector space model to select a suitable diagnosis with these terms.
Methods:
Using morphological analysis, we extracted index terms and constructed an original dictionary for discharge summary analysis. We chose 125 different DPC (the Japanese DRG system) codes for the diseases, each of which had more than 20 cases, and divided the cases into two groups. One group, consisting of 5927 cases from fiscal year 2004, was used to generate the document vector space for each DPC. The other group, 3187 cases from fiscal year 2005, was used to verify the automatic DPC selection. The top 200 extracted index terms for each disease were used to calculate the weight of each disease.
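The approach described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the weighting scheme or similarity measure, so the smoothed TF-IDF weighting and cosine similarity used here are assumptions, as are the function names `build_class_vectors`, `cosine`, and `predict`. The sketch also assumes each discharge summary has already been reduced to a list of index terms by a morphological analyzer.

```python
import math
from collections import Counter, defaultdict


def build_class_vectors(train_docs, top_n=200):
    """Build one weighted term vector per code from (code, terms) pairs.

    Assumption: weights are term frequency times a smoothed inverse
    class frequency; the abstract does not specify the actual scheme.
    """
    term_counts = defaultdict(Counter)
    for code, terms in train_docs:
        term_counts[code].update(terms)

    n_classes = len(term_counts)
    # Number of codes in which each term appears at least once.
    class_freq = Counter()
    for counts in term_counts.values():
        class_freq.update(counts.keys())

    vectors = {}
    for code, counts in term_counts.items():
        weights = {
            term: tf * (math.log((1 + n_classes) / (1 + class_freq[term])) + 1.0)
            for term, tf in counts.items()
        }
        # Keep only the top_n highest-weighted terms (ties broken by term).
        top = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        vectors[code] = dict(top)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(vectors, terms):
    """Return the code whose vector is most similar to the summary's terms."""
    query = Counter(terms)
    return max(sorted(vectors), key=lambda code: cosine(query, vectors[code]))
```

In this sketch the 2004 cases would be passed to `build_class_vectors` and each 2005 case to `predict`. The default `top_n=200` mirrors the 200 index terms per disease mentioned above.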
Results:
For the 3187 cases across 125 DPCs, the DPC code selected by the calculated similarity was compared with the code originally assigned to each patient. Eighty percent of the cases matched at the diagnosis level of the DPC (first six digits), and 56% of the cases matched all 14 digits of the DPC.
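The two match rates reported above can be computed as in the following sketch. The function name `match_rates` and the sample strings in the usage note are hypothetical; the only detail taken from the abstract is that the first six digits of a 14-digit DPC code identify the diagnosis.

```python
def match_rates(predicted, actual, prefix_len=6):
    """Return (prefix match rate, full match rate) for paired code lists.

    prefix_len=6 follows the abstract: the first six digits of a DPC
    code correspond to the diagnosis.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    pairs = list(zip(predicted, actual))
    prefix_hits = sum(p[:prefix_len] == a[:prefix_len] for p, a in pairs)
    full_hits = sum(p == a for p, a in pairs)
    return prefix_hits / len(pairs), full_hits / len(pairs)
```

For example, with made-up 14-character placeholders (not real DPC codes), `match_rates(["111111aaaaaaaa", "111111bbbbbbbb"], ["111111aaaaaaaa", "111111aaaaaaaa"])` counts both cases as diagnosis-level matches but only the first as a complete match.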
Conclusions:
We demonstrated that suitable terms can be extracted for each disease and that characteristics such as the diagnosis can be obtained from the calculated vectors. This technique can be used to assess the quality of discharge summaries and to integrate discharge summaries across different facilities. With this text mining technique, we can characterize the contents of electronic discharge summaries and deduce diagnoses from the data.
Subject
Health Information Management, Advanced and Specialized Nursing, Health Informatics
Cited by
19 articles.