Abstract
Objective: The manual extraction of case details from patient records for cancer surveillance efforts is a resource-intensive task. Natural Language Processing (NLP) techniques have been proposed for automating the identification of key details in clinical notes. Our goal was to develop NLP application programming interfaces (APIs) for integration into cancer registry data abstraction tools in a computer-assisted abstraction setting.

Methods: We used cancer registry manual abstraction processes to guide the design of DeepPhe-CR, a web-based NLP service API. The coding of key variables was done through NLP methods validated using established workflows. A container-based implementation including the NLP was developed. Existing registry data abstraction software was modified to include results from DeepPhe-CR. An initial usability study with data registrars provided early validation of the feasibility of the DeepPhe-CR tools.

Results: API calls support submission of single documents and summarization of cases across multiple documents. The container-based implementation uses a REST router to handle requests and support a graph database for storing results. NLP modules extract topography, histology, behavior, laterality, and grade with F1 scores of 0.79-1.00 across common and rare cancer types (breast, prostate, lung, colorectal, ovary, and pediatric brain) on data from two cancer registries. Usability study participants were able to use the tool effectively and expressed interest in adopting it.

Discussion: Our DeepPhe-CR system provides a flexible architecture for building cancer-specific NLP tools directly into registrar workflows in a computer-assisted abstraction setting. Improving user interactions in client tools may be needed to realize the potential of these approaches. DeepPhe-CR: https://deepphe.github.io/.
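
For readers unfamiliar with the F1 metric reported in the Results, the following is a minimal sketch of the standard definition (the harmonic mean of precision and recall). The function name `f1_score` and the counts in the usage note are illustrative assumptions. They are not taken from the DeepPhe-CR evaluation code or data.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall.

    Illustrative helper only (hypothetical name); returns 0.0 when
    precision or recall is undefined or when both are zero.
    """
    predicted = true_positives + false_positives  # items the extractor labeled
    actual = true_positives + false_negatives     # items in the gold standard
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With made-up counts, `f1_score(8, 2, 2)` gives a precision and recall of 0.8 each, so the F1 is approximately 0.8. A perfect extractor with no false positives or false negatives scores 1.0, the upper end of the reported range.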
Publisher
Cold Spring Harbor Laboratory