Transferability Based on Drug Structure Similarity in the Automatic Classification of Noncompliant Drug Use on Social Media: Natural Language Processing Approach

Published: 2023-05-03 Issue: Volume: 25 Page: e44870
ISSN: 1438-8871
Container-title: Journal of Medical Internet Research
language: en
Short-container-title: J Med Internet Res

Author:

Nishiyama Tomohiro, Yada Shuntaro, Wakamiya Shoko, Hori Satoko, Aramaki Eiji

Abstract

Background: Medication noncompliance is a critical issue because of the increased number of drugs sold on the web. Web-based drug distribution is difficult to control, causing problems such as drug noncompliance and abuse. The existing medication compliance surveys lack completeness because it is impossible to cover patients who do not go to the hospital or provide accurate information to their doctors, so a social media–based approach is being explored to collect information about drug use. Social media data, which includes information on drug usage by users, can be used to detect drug abuse and medication compliance in patients.

Objective: This study aimed to assess how the structural similarity of drugs affects the efficiency of machine learning models for text classification of drug noncompliance.

Methods: This study analyzed 22,022 tweets about 20 different drugs. The tweets were labeled as either noncompliant use or mention, noncompliant sales, general use, or general mention. The study compares 2 methods for training machine learning models for text classification: single-sub-corpus transfer learning, in which a model is trained on tweets about a single drug and then tested on tweets about other drugs, and multi-sub-corpus incremental learning, in which models are trained on tweets about drugs in order of their structural similarity. The performance of a machine learning model trained on a single subcorpus (a data set of tweets about a specific category of drugs) was compared to the performance of a model trained on multiple subcorpora (data sets of tweets about multiple categories of drugs).

Results: The results showed that the performance of the model trained on a single subcorpus varied depending on the specific drug used for training. The Tanimoto similarity (a measure of the structural similarity between compounds) was weakly correlated with the classification results. The model trained by transfer learning a corpus of drugs with close structural similarity performed better than the model trained by randomly adding a subcorpus when the number of subcorpora was small.

Conclusions: The results suggest that structural similarity improves the classification performance of messages about unknown drugs if the drugs in the training corpus are few. On the other hand, this indicates that there is little need to consider the influence of the Tanimoto structural similarity if a sufficient variety of drugs are ensured.
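
The abstract names the Tanimoto similarity and describes adding subcorpora "in order of their structural similarity" without giving details, so the following is only a minimal sketch of that idea, not the authors' implementation. It assumes each drug is represented by a set of "on" fingerprint bits; the drug names, the bit sets, and the helper `order_by_similarity` are hypothetical and exist only for illustration. The abstract does not state which fingerprint type or tooling the study used.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for two sets of 'on' fingerprint bits."""
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / union


def order_by_similarity(
    target: str, fingerprints: Dict[str, FrozenSet[int]]
) -> List[str]:
    """Return all drugs except `target`, most similar to `target` first.

    Hypothetical helper: one possible way to decide the order in which
    per-drug subcorpora are added to a training set.
    """
    others = [name for name in fingerprints if name != target]
    return sorted(
        others,
        key=lambda name: tanimoto(fingerprints[target], fingerprints[name]),
        reverse=True,
    )


# Toy fingerprints: made-up bit sets, not real chemical data.
TOY_FINGERPRINTS: Dict[str, FrozenSet[int]] = {
    "drug_a": frozenset({1, 2, 3, 4}),
    "drug_b": frozenset({1, 2, 3, 5}),
    "drug_c": frozenset({1, 6, 7}),
    "drug_d": frozenset({8, 9}),
}

if __name__ == "__main__":
    # Order in which subcorpora would be added when drug_a is the unseen drug.
    print(order_by_similarity("drug_a", TOY_FINGERPRINTS))
```

With these toy sets, drug_a and drug_b share 3 of 5 distinct bits, giving a similarity of 0.6, so drug_b's subcorpus would be added first when drug_a is the target. In practice the bit sets would come from a cheminformatics toolkit rather than being written by hand.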

Publisher

JMIR Publications Inc.

Subject

Health Informatics

Cited by 2 articles.

1. Clinical Text Analysis with Natural Language Processing: A BERT-based Approach;2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE);2024-05-09

2. Research and Application of Digital Media Object Classification Method Based on Large Interval Distribution Learning;2023 3rd International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON);2023-12-29