Build neural network models to identify and correct news headlines exaggerating obesity-related scientific findings
Author:
An Ruopeng1, Batcheller Quinlan1, Wang Junjie2, Yang Yuyi1
Affiliation:
1. Brown School, Washington University in St. Louis, One Brookings Drive, St. Louis, Missouri, United States
2. Department of Kinesiology and Health Promotion, Dalian University of Technology, No. 2 Linggong Road, Dalian, China
Abstract
Purpose
Media exaggeration of health research can distort readers' understanding of scientific evidence, erode public trust in science and medicine, and lead to disease mismanagement. This study built artificial intelligence (AI) models to automatically identify and correct news headlines exaggerating obesity-related research findings.
Design/methodology/approach
We searched popular digital media outlets and collected 523 headlines exaggerating obesity-related research findings. The exaggerations arose from inferring causality from observational studies, inferring human outcomes from animal research, inferring distant/end outcomes (e.g., obesity) from immediate/intermediate outcomes (e.g., calorie intake), or generalizing findings from a subgroup or convenience sample to the population. Each headline was paired with the title and abstract of the peer-reviewed journal publication covered by the news article. We drafted an exaggeration-free counterpart for each original headline and fine-tuned a BERT model to differentiate between the two. We further fine-tuned three generative language models (BART, PEGASUS, and T5) to automatically generate exaggeration-free headlines from a journal publication's title and abstract. Model performance was evaluated with ROUGE metrics by comparing model-generated headlines against journal publication titles.
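The sketch below shows the two modelling steps in miniature using the Hugging Face transformers library; the checkpoints (bert-base-uncased, facebook/bart-base), the placeholder headline and input strings, and the decoding parameters are illustrative assumptions rather than the study's actual configuration, and the fine-tuning loop itself (e.g., via transformers.Trainer) is omitted.

from transformers import (
    AutoModelForSeq2SeqLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

# Step 1: BERT classifier -- exaggerating vs. exaggeration-free headline.
clf_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
clf_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
headline = "Coffee melts belly fat, scientists find"  # invented example
inputs = clf_tokenizer(headline, return_tensors="pt", truncation=True)
label = clf_model(**inputs).logits.argmax(dim=-1).item()  # 0/1 after fine-tuning

# Step 2: seq2seq generator (BART shown) -- produce an exaggeration-free
# headline from the publication's title and abstract (input format assumed).
gen_tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
gen_model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
source = "Journal article title. Abstract text..."
batch = gen_tokenizer(source, return_tensors="pt", truncation=True, max_length=1024)
ids = gen_model.generate(**batch, num_beams=4, max_new_tokens=32)
print(gen_tokenizer.decode(ids[0], skip_special_tokens=True))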
Findings
The fine-tuned BERT model achieved 92.5% accuracy in differentiating between exaggeration-free and original headlines. Baseline ROUGE scores averaged 0.311 for ROUGE-1, 0.113 for ROUGE-2, 0.253 for ROUGE-L, and 0.253 for ROUGE-Lsum. PEGASUS, T5, and BART all outperformed the baseline. The best-performing BART model attained 0.447 for ROUGE-1, 0.221 for ROUGE-2, 0.402 for ROUGE-L, and 0.402 for ROUGE-Lsum.
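Scores of this kind can be computed with Google's rouge-score package, one common implementation of these metrics; the sketch below assumes that tooling, and the reference and generated headlines are invented placeholders rather than items from the study's data.

from rouge_score import rouge_scorer  # pip install rouge-score

# Score a generated headline against the journal publication title
# on the four ROUGE variants reported above.
scorer = rouge_scorer.RougeScorer(
    ["rouge1", "rouge2", "rougeL", "rougeLsum"], use_stemmer=True
)
reference = "Sugar-sweetened beverage intake and weight gain in adults"
generated = "Sugar-sweetened beverages linked to weight gain in adults"
for name, score in scorer.score(reference, generated).items():
    print(f"{name}: F1 = {score.fmeasure:.3f}")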
Originality/value
This study demonstrated the feasibility of leveraging AI to automatically identify and correct news headlines exaggerating obesity-related research findings.
Publisher
Walter de Gruyter GmbH