Using WeChat clinician-patient group communication data to identify symptom burdens in patients with uterine fibroids under focused ultrasound ablation surgery treatment (Preprint)

Authors:

Zhang Jiayuan, Xu Wei, Lei Cheng, Pu Yang, Zhang Yubo, Zhang Jingyu, Yu Hongfan, Su Xueyao, Huang Yanyan, Gong Ruoyan, Zhang Lijun, Shi Qiuling

Abstract

BACKGROUND

Research project-based health data collection relies on instruments such as questionnaires and interviews. Social media platforms, by contrast, allow patients to freely discuss their health status and obtain peer support. Previous literature has pointed out that both public-facing websites and private Facebook groups can serve as data sources for patient-reported outcomes.

OBJECTIVE

This study aimed to use natural language processing (NLP) techniques based on machine learning to identify concerns regarding postoperative quality of life and symptom burdens in patients with uterine fibroids after focused ultrasound ablation surgery.

METHODS

Screenshots taken from clinician-patient WeChat groups were converted into free text using image text recognition technology and served as the research material for this study. We used regular expressions in Python to search for symptom burdens in over 900,000 words of WeChat group chats associated with 408 patients diagnosed with uterine fibroids at Chongqing Haifu Hospital between 2010 and 2020. We first built a corpus of symptoms by manually coding 30% of the WeChat texts, and then used regular expressions to extract symptom information from the remaining texts based on this corpus. We compared the results with a manual review (gold standard) of the same records. A mixed methods approach was used to assess the relationship between patients' baseline data and conceptual symptoms; both quantitative and qualitative results were examined.
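The lexicon-then-regex workflow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the symptom lexicon, the English example messages, the record identifiers, and the function names (`compile_lexicon`, `extract_symptoms`, `precision_recall`) are all hypothetical, and the actual WeChat texts were presumably in Chinese.

```python
import re

# Hypothetical symptom lexicon (concept -> surface forms), standing in for the
# corpus built by manually coding 30% of the texts.
SYMPTOM_LEXICON = {
    "dysmenorrhea": ["dysmenorrhea", "menstrual cramps", "period pain"],
    "abdominal_pain": ["abdominal pain", "stomach ache", "belly hurts"],
    "vaginal_discharge": ["vaginal discharge", "discharge"],
}


def compile_lexicon(lexicon):
    """Compile one case-insensitive pattern per symptom concept."""
    compiled = {}
    for concept, terms in lexicon.items():
        # Longest terms first so multi-word phrases win over their substrings.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        compiled[concept] = re.compile(alternation, re.IGNORECASE)
    return compiled


def extract_symptoms(text, patterns):
    """Return the set of symptom concepts mentioned in one text."""
    return {concept for concept, pattern in patterns.items() if pattern.search(text)}


def precision_recall(predicted, gold):
    """Compare extracted concepts with manual review (gold standard) per record."""
    tp = fp = fn = 0
    for record_id, gold_set in gold.items():
        pred_set = predicted.get(record_id, set())
        tp += len(pred_set & gold_set)
        fp += len(pred_set - gold_set)
        fn += len(gold_set - pred_set)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical patient messages and manual labels for the same records.
messages = {
    "p001": "I have had period pain since the surgery.",
    "p002": "Some discharge today, and my belly hurts.",
    "p003": "My tummy is sore.",  # wording missing from the lexicon
}
gold = {
    "p001": {"dysmenorrhea"},
    "p002": {"vaginal_discharge", "abdominal_pain"},
    "p003": {"abdominal_pain"},
}

patterns = compile_lexicon(SYMPTOM_LEXICON)
predicted = {rid: extract_symptoms(text, patterns) for rid, text in messages.items()}
precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example the third message uses a phrasing absent from the lexicon, so it is missed and recall falls below precision. A plain keyword match like this also ignores negation ("no pain"), which a real pipeline would likely need to handle.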

RESULTS

After data cleaning, a total of 190,000 words of free text from patients with uterine fibroids were obtained. A total of 408 patients were included in the study. The age of the patients was 39.94±6.81 years, and their BMI was 23.47±29.37 kg/m^2. The median reporting times of the seven major symptoms were 21, 26, 57, 2, 18, 30, and 49 days. Patients with dysmenorrhea were younger and slimmer (mean (SD), P<.05), had lower fertility and parity (P<.05), and tended to stay longer in the hospital (P<.05). Logistic regression models identified menstrual duration (odds ratio (OR) (95% CI)), age at menarche (OR (95% CI)), reported symptoms before surgery (OR (95% CI)), and the number and size of fibroids as significant risk factors for postoperative symptoms.
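For readers less familiar with the reporting format, an odds ratio and its 95% CI are conventionally obtained from a fitted logistic regression coefficient as shown below. This assumes a Wald-type interval, which the abstract does not confirm; the symbols $\hat\beta_j$ (estimated coefficient for predictor $j$) and $\mathrm{SE}(\hat\beta_j)$ (its standard error) are introduced here for illustration only.

```latex
\mathrm{OR}_j = e^{\hat\beta_j},
\qquad
95\%\ \mathrm{CI} = \left[\, e^{\hat\beta_j - 1.96\,\mathrm{SE}(\hat\beta_j)},\;
                           e^{\hat\beta_j + 1.96\,\mathrm{SE}(\hat\beta_j)} \,\right]
```

An interval that excludes 1 corresponds to a predictor that is significant at the 5% level under this approximation.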

CONCLUSIONS

Unstructured free text from social media platforms, extracted with NLP technology, can be analyzed to capture conceptual information about patients' health-related quality of life (HRQoL), identify high-risk groups, and track the reporting time of certain symptoms. This supports personalized treatment for patients at different stages of recovery, with the aim of improving their quality of life. Python-based text mining of free-text data can accurately extract symptom burden and save considerable time compared with manual review, maximizing the utility of the existing information in population-based electronic health records for comparative effectiveness research.

Publisher

JMIR Publications Inc.
