Towards scaling Twitter for digital epidemiology of birth defects

Published:2019-10-01 Issue:1 Volume:2 Page:
ISSN:2398-6352
Container-title:npj Digital Medicine
language:en
Short-container-title:npj Digit. Med.

Author:

Klein Ari Z., Sarker Abeed, Weissenbacher Davy, Gonzalez-Hernandez Graciela

Abstract

Social media has recently been used to identify and study a small cohort of Twitter users whose pregnancies with birth defect outcomes (the leading cause of infant mortality) could be observed via their publicly available tweets. In this study, we exploit social media on a larger scale by developing natural language processing (NLP) methods to automatically detect, among thousands of users, a cohort of mothers reporting that their child has a birth defect. We used 22,999 annotated tweets to train and evaluate supervised machine learning algorithms (feature-engineered and deep learning-based classifiers) that automatically distinguish tweets referring to the user’s pregnancy outcome from tweets that merely mention birth defects. Because 90% of the tweets merely mention birth defects, we experimented with under-sampling and over-sampling approaches to address this class imbalance. An SVM classifier achieved the best performance for the two positive classes: an F1-score of 0.65 for the “defect” class and 0.51 for the “possible defect” class. We deployed the classifier on 20,457 unlabeled tweets that mention birth defects, which helped identify 542 additional users for potential inclusion in our cohort. Contributions of this study include (1) NLP methods for automatically detecting tweets by users reporting their birth defect outcomes, (2) findings that an SVM classifier can outperform a deep neural network-based classifier for highly imbalanced social media data, (3) evidence that automatic classification can be used to identify additional users for potential inclusion in our cohort, and (4) a publicly available corpus for training and evaluating supervised machine learning algorithms.
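The abstract mentions two ideas that a small example may make concrete: under-sampling the majority class to reduce class imbalance, and reporting an F1-score separately for each class. The following is a minimal sketch using only the Python standard library, not the authors' implementation. The function names, the `ratio` parameter, the label string `"mention"` for the majority class, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def undersample(examples, majority_label, ratio=1.0, seed=0):
    """Randomly drop majority-class examples.

    `examples` is a list of (text, label) pairs. At most
    ratio * (number of non-majority examples) majority examples are kept.
    The names and the `ratio` parameter are illustrative assumptions.
    """
    rng = random.Random(seed)
    majority = [ex for ex in examples if ex[1] == majority_label]
    others = [ex for ex in examples if ex[1] != majority_label]
    keep = min(len(majority), int(ratio * len(others)))
    result = others + rng.sample(majority, keep)
    rng.shuffle(result)
    return result


def per_class_f1(gold, predicted):
    """Return {label: F1} computed independently for every label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in set(gold) | set(predicted):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy data with roughly the 90% majority share described in the abstract.
toy = (
    [(f"tweet {i}", "mention") for i in range(90)]
    + [(f"defect tweet {i}", "defect") for i in range(6)]
    + [(f"possible tweet {i}", "possible_defect") for i in range(4)]
)
balanced = undersample(toy, majority_label="mention", ratio=1.0)
print(Counter(label for _, label in balanced))
```

Reporting F1 per class rather than overall accuracy matters here because a classifier that always predicts the majority class would reach about 90% accuracy on such data while scoring 0 on both positive classes.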

Funder

U.S. Department of Health & Human Services | NIH | U.S. National Library of Medicine

Publisher

Springer Science and Business Media LLC

Subject

Health Information Management,Health Informatics,Computer Science Applications,Medicine (miscellaneous)

Link

http://www.nature.com/articles/s41746-019-0170-5.pdf

References (52 articles)

1. Mathews, T. J., MacDorman, M. F. & Thoma, M. E. Infant mortality statistics from the 2013 period linked birth/infant death data set. Natl Vital. Stat. Rep. 64, 2000–2013 (2015).

2. Blehar, M. C. et al. Enrolling pregnant women: issues in clinical research. Women's Health Issues 23, e39–e45 (2013).

3. Hartman, R. I. & Kimball, A. B. Performing research in pregnancy: challenges and perspectives. Clin. Dermatol. 34, 410–415 (2016).

4. Ward, R. M. Difficulties in the study of adverse fetal and neonatal effects of drug therapy during pregnancy. Semin. Perinatol. 25, 191–195 (2001).

5. Kennedy, D. L., Uhl, K. & Kweder, S. L. Pregnancy exposure registries. Drug Saf. 27, 215–228 (2004).

Cited by 15 articles.

1. Social media and COVID‐19 vaccination hesitancy during pregnancy: a mixed methods analysis;BJOG: An International Journal of Obstetrics & Gynaecology;2023-04-20

2. Approaches to Assessing the Safety of Medicines during the COVID-19 Pandemic Using the Example of Azithromycin;Safety and Risk of Pharmacotherapy;2022-10-05

3. Using Twitter Data for Cohort Studies of Drug Safety in Pregnancy: Proof-of-concept With β-Blockers;JMIR Formative Research;2022-06-30

4. Using Twitter Data for Cohort Studies of Drug Safety in Pregnancy: A Proof-of-Concept with Beta-Blockers;2022-03-03

5. Using Twitter Data for Cohort Studies of Drug Safety in Pregnancy: Proof-of-concept With β-Blockers (Preprint);2022-01-24