Abstract
Medical narratives document a vast amount of clinical data. This data has a valuable secondary purpose, as it may be used to optimize health service delivery and improve the quality of medical care. However, medical narratives are typically recorded in an unstructured manner, which complicates the extraction of the structured information required for such optimization. In this paper, we address this problem by applying two models, a rule-based model and a model based on conditional random fields (CRFs), to a data set of Chinese medical narratives and comparing their performance. On 4626 manually annotated Chinese medical narratives collected from Shanxi Dayi Hospital in China, the rule-based model achieved 95.87% precision, 69.82% recall, and an F-score of 80.80%, while the CRF-based model achieved 95.99% precision, 65.11% recall, and an F-score of 77.59%. These experimental results demonstrate the efficacy of both models for structured information extraction from Chinese medical narratives.
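The reported F-scores are consistent with the balanced F-score, that is, the harmonic mean of precision and recall. The abstract does not state which F-measure was used, so the following is only a minimal sketch under that assumption; the function name `f_score` is illustrative and not taken from the paper.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall.

    Assumption: the abstract's "F-score" is the balanced (F1) variant.
    Inputs and output are in the same unit (here, percent).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values as reported in the abstract (percent).
rule_based_f = f_score(95.87, 69.82)
crf_f = f_score(95.99, 65.11)

print(f"rule-based F-score: {rule_based_f:.2f}")  # 80.80
print(f"CRF-based F-score:  {crf_f:.2f}")         # 77.59
```

Both values round to the figures given above, which suggests that the gap between the two models comes almost entirely from the difference in recall.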
Publisher
Cold Spring Harbor Laboratory