Abstract
Background
Electronic nicotine delivery system (ENDS) brands, such as JUUL, used social media as a key component of their marketing strategy, which led to massive sales growth from 2015 to 2018. During this time, ENDS use rapidly increased among youths and young adults, with flavored products being particularly popular among these groups.
Objective
The aim of our study is to develop a named entity recognition (NER) model to identify potential emerging vaping brands and flavors from Instagram post text. NER is a natural language processing task for identifying specific types of words (entities) in text based on the characteristics of the entity and surrounding words.
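To make the task concrete, the sketch below shows one common way NER output is represented: each token receives a BIO tag (B = beginning of an entity, I = inside, O = outside), and entities are recovered by grouping consecutive tags. The tag names `BRAND` and `FLAVOR`, the helper `bio_to_entities`, and the example post are illustrative assumptions, not the labeling scheme reported in the study.

```python
from typing import List, Tuple


def bio_to_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_type, entity_text) pairs.

    Hypothetical helper for illustration; the tag set is an assumption.
    """
    entities: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Store the entity collected so far, if any.
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_type = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O" tags (and stray I- tags) close any open entity.
            flush()
            current_type = None
            current_tokens = []
    flush()
    return entities


# Made-up example post, tokenized and tagged by hand.
example_tokens = ["new", "juul", "pods", "in", "cool", "mint"]
example_tags = ["O", "B-BRAND", "O", "O", "B-FLAVOR", "I-FLAVOR"]
example_entities = bio_to_entities(example_tokens, example_tags)
```

A trained model would predict the tag sequence from the token and its surrounding words; the grouping step above only converts those predictions into brand and flavor mentions.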
Methods
NER models were trained on a labeled data set of 2272 Instagram posts coded for ENDS brands and flavors. We compared three types of NER models (conditional random fields, a residual convolutional neural network, and a fine-tuned distilled bidirectional encoder representations from transformers [FTDB] network) on their ability to identify brands and flavors in Instagram posts, with precision, recall, and F1 score as the key model outcomes. As benchmarks, we used Nielsen scanner sales data and Wikipedia to create brand and flavor dictionaries and checked whether brands and flavors from these established lists were mentioned in the Instagram posts in our sample. To guard against overfitting, we performed 5-fold cross-validation and reported the mean and SD of the model validation metrics across the folds.
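The evaluation described above can be sketched roughly as follows: a dictionary look-up baseline, set-based precision, recall, and F1, and a mean and SD summary across folds. This is a minimal illustration under stated assumptions. The function names, the word-boundary matching rule, the set-based (per-post, entity-level) scoring, and the use of the sample SD are all assumptions, and the fold scores are made-up numbers rather than results from the study.

```python
import re
import statistics
from typing import Iterable, List, Set, Tuple


def dictionary_lookup(text: str, vocabulary: Iterable[str]) -> Set[str]:
    """Return the dictionary terms that appear in the text as whole words.

    Assumed matching rule: case-insensitive, word-boundary match.
    """
    lowered = text.lower()
    found = set()
    for term in vocabulary:
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        if re.search(pattern, lowered):
            found.add(term.lower())
    return found


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 for one post (assumed scoring unit)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def summarize_folds(scores: List[float]) -> Tuple[float, float]:
    """Mean and sample SD of a metric across cross-validation folds."""
    return statistics.mean(scores), statistics.stdev(scores)


# Made-up post and a tiny hypothetical brand list.
post = "Loving my new JUUL with mango pods"
brand_list = ["juul", "vuse"]
predicted_brands = dictionary_lookup(post, brand_list)
scores = precision_recall_f1(predicted_brands, gold={"juul"})

# Made-up precision values for 5 folds, for illustration only.
fold_precisions = [0.8, 0.9, 0.7, 0.8, 0.8]
mean_precision, sd_precision = summarize_folds(fold_precisions)
```

A look-up of this kind can only find names already on the list, which is one reason a trained NER model may recover brands that a sales-based dictionary misses.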
Results
For brands, the residual convolutional neural network exhibited the highest mean precision (0.797, SD 0.084), and the FTDB exhibited the highest mean recall (0.869, SD 0.103). For flavors, the FTDB exhibited both the highest mean precision (0.860, SD 0.055) and the highest mean recall (0.801, SD 0.091). All NER models outperformed the benchmark brand and flavor dictionary look-ups on mean precision, recall, and F1. Of the two benchmark brand lists, the larger Wikipedia list outperformed the Nielsen list in both precision and recall.
Conclusions
Our findings suggest that NER models correctly identified ENDS brands and flavors in Instagram posts at rates competitive with, or better than, others in the published literature. Brands identified during manual annotation showed little overlap with those in Nielsen scanner data, suggesting that NER models may capture emerging brands with limited sales and distribution. NER models address the challenges of manual brand identification and can be used to support future infodemiology and infoveillance studies. Brands identified on social media should be cross-validated with Nielsen and other data sources to differentiate emerging brands that have become established from those with limited sales and distribution.