Mining FDA drug labels using an unsupervised learning technique - topic modeling

Published:2011-10-18 Issue:S10 Volume:12 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Bisgin Halil,Liu Zhichao,Fang Hong,Xu Xiaowei,Tong Weida

Abstract

Background: Food and Drug Administration (FDA)-approved drug labels contain a broad array of information, ranging from adverse drug reactions (ADRs) to drug efficacy, risk-benefit considerations, and more. However, this information is written as free text that often contains ambiguous semantic descriptions, which makes it difficult to retrieve useful information from the labeling text consistently and accurately for comparative analysis across drugs. Consequently, this task has largely relied on manual reading of the full text by experts, which is time-consuming and labor-intensive.

Method: In this study, a novel unsupervised text mining method called topic modeling was applied to drug labeling with the goal of discovering "topics" that group together drugs with similar safety concerns and/or therapeutic uses. A total of 794 FDA-approved drug labels were used. First, three labeling sections of each drug label (Boxed Warning, Warnings and Precautions, and Adverse Reactions) were processed with the Medical Dictionary for Regulatory Activities (MedDRA) to convert the free text of each label to standard ADR terms. Next, topic modeling with latent Dirichlet allocation (LDA) was applied to generate 100 topics, each associated with a set of drugs grouped together based on probability analysis. Lastly, the effectiveness of the topic modeling was evaluated against known information about the therapeutic uses and safety data of the drugs.

Results: The results demonstrate that drugs grouped by topic are associated with the same safety concerns and/or therapeutic uses with statistical significance (P < 0.05). The identified topics have distinct contexts that can be directly linked to specific adverse events (e.g., liver injury or kidney injury) or therapeutic applications (e.g., anti-infectives for systemic use). We were also able to identify, via the topics, potential adverse events that might arise from specific medications.

Conclusions: The successful application of topic modeling to FDA drug labeling demonstrates its potential utility as a hypothesis-generation tool for inferring hidden relationships between concepts in biomedical documents, such as, in this study, drug safety and therapeutic use.
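
The grouping step described in the abstract (LDA over standardized ADR terms, then assigning each drug to a topic by probability) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state which LDA software, inference method, or hyperparameters were used. The example below assumes collapsed Gibbs sampling, uses made-up drug names and ADR terms, and fits 2 topics on 4 toy "labels" rather than 100 topics on 794 labels. The names `lda_gibbs`, `alpha`, and `beta` are illustrative choices.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (an assumed inference method).

    docs: list of term lists, one per drug label.
    Returns (theta, phi, vocab): per-document topic probabilities,
    per-topic term probabilities, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({term for doc in docs for term in doc})
    term_id = {term: i for i, term in enumerate(vocab)}
    n_terms = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_term = [[0] * n_terms for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every term occurrence.
    assign = []
    for d, doc in enumerate(docs):
        z_d = []
        for term in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_term[k][term_id[term]] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, term in enumerate(doc):
                w = term_id[term]
                k = assign[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_term[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this occurrence.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_term[j][w] + beta)
                    / (topic_total[j] + n_terms * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_term[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimates of the document-topic and topic-term distributions.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_terms * beta) for c in topic_term[k]]
        for k in range(n_topics)
    ]
    return theta, phi, vocab


# Hypothetical labels already reduced to ADR terms (illustrative data only).
labels = {
    "drug_a": ["hepatotoxicity", "jaundice", "hepatic failure", "nausea"],
    "drug_b": ["hepatotoxicity", "hepatic failure", "jaundice"],
    "drug_c": ["renal failure", "nephrotoxicity", "proteinuria"],
    "drug_d": ["nephrotoxicity", "renal failure", "nausea", "proteinuria"],
}
names = list(labels)
theta, phi, vocab = lda_gibbs([labels[n] for n in names], n_topics=2)

# Group each drug under its most probable topic.
groups = defaultdict(list)
for name, row in zip(names, theta):
    groups[row.index(max(row))].append(name)

for topic, members in sorted(groups.items()):
    print(f"topic {topic}: {members}")
```

Assigning each drug to its single most probable topic is one simple reading of "grouped together based on probability analysis"; a probability threshold that lets a drug belong to several topics would be an equally plausible choice.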

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-12-S10-S11.pdf

References: 29 articles.

1. Baeza-Yates R, Ribeiro-Neto B: Modern Information Retrieval. New York: ACM Press; 1999.

2. Swanson DR: Fish oil, Raynaud's syndrome, and undiscovered public knowledge. Perspectives in Biology and Medicine 1986, 30(1):7–18.

3. Salton G, McGill MJ: Introduction to Modern Information Retrieval. McGraw-Hill; 1983.

4. Gordon MD, Lindsay RK: Toward discovery support systems: a replication, re-examination, and extension of Swanson's work on literature-based discovery of a connection between Raynaud's and fish oil. J Am Soc Inf Sci 1996, 47(2):116–128. 10.1002/(SICI)1097-4571(199602)47:2<116::AID-ASI3>3.0.CO;2-1

5. Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R: Indexing by latent semantic analysis. J Am Soc Inf Sci 1990, 41(6):391–407. 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9

Cited by 82 articles.

1. Artificial intelligence in peri‐operative prediction model research: are we there yet?;Anaesthesia;2024-05-15

2. Exploring public voice on social media: Twitter Users' views on the circular economy;Sustainable Development;2024-05-08

3. Cancer hallmark analysis using semantic classification with enhanced topic modelling on biomedical literature;Multimedia Tools and Applications;2024-02-17

4. Feasibility of artificial intelligence its current status, clinical applications, and future direction in cardiovascular disease;Current Problems in Cardiology;2024-02

5. Topic Integrated Opinion-Based Drug Recommendation With Transformers;IEEE Transactions on Emerging Topics in Computational Intelligence;2023-12