Scalable Community Extraction of Text Networks for Automated Grouping in Medical Databases

Published: 2022 Pages: 1-20
ISSN: 1680-743X
Container-title: Journal of Data Science
Language: en

Author:

Komolafe Tomilayo, Fong Allan, Sengupta Srijan

Abstract

Networks are ubiquitous in today’s world. Community structure is a well-known feature of many empirical networks, and many statistical methods have been developed for community detection. In this paper, we consider the problem of community extraction in text networks, which is highly relevant to medical error and patient safety databases. We adapt a well-known community extraction method to develop a scalable algorithm for extracting groups of similar documents in large text databases. Applying our method to a real-world patient safety report system demonstrates that the groups generated by community extraction are much more accurate than manual tagging by frontline workers.

Publisher

School of Statistics, Renmin University of China

Subject

Industrial and Manufacturing Engineering

References: 45 articles.

1. An information-theoretic perspective of tf–idf measures; Information Processing & Management, 2003

2. Pseudo-likelihood methods for community detection in large sparse networks; Ann. Statist., 2013

3. A nonparametric view of network models and Newman–Girvan and other modularities; Proceedings of the National Academy of Sciences, 2009

Cited by 2 articles.

1. Automated Error Labeling in Radiation Oncology via Statistical Natural Language Processing; Diagnostics, 2023-03-23

2. Editorial: Advances in Network Data Science; Journal of Data Science, 2023