Poisoning scientific knowledge using large language models

Author:

Yang Junwei,Xu Hanwen,Mirzoyan Srbuhi,Chen Tong,Liu Zixuan,Ju Wei,Liu Luchen,Zhang Ming,Wang Sheng

Abstract

AbstractBiomedical knowledge graphs constructed from scientific literature have been widely used to validate biological discoveries and generate new hypotheses. Recently, large language models (LLMs) have demonstrated a strong ability to generate human-like text data. While most of these text data have been useful, LLM might also be used to generate malicious content. Here, we investigate whether it is possible that a malicious actor can use LLM to generate a malicious paper that poisons scientific knowledge graphs and further affects downstream biological applications. As a proof-of-concept, we develop Scorpius, a conditional text generation model that generates a malicious paper abstract conditioned on a promoting drug and a target disease. The goal is to fool the knowledge graph constructed from a mixture of this malicious abstract and millions of real papers so that knowledge graph consumers will misidentify this promoting drug as relevant to the target disease. We evaluated Scorpius on a knowledge graph constructed from 3,818,528 papers and found that Scorpius can increase the relevance of 71.3% drug disease pairs from the top 1000 to the top 10 by only adding one malicious abstract. Moreover, the generation of Scorpius achieves better perplexity than ChatGPT, suggesting that such malicious abstracts cannot be efficiently detected by humans. Collectively, Scorpius demonstrates the possibility of poisoning scientific knowledge graphs and manipulating downstream applications using LLMs, indicating the importance of accountable and trustworthy scientific knowledge discovery in the era of LLM.

Publisher

Cold Spring Harbor Laboratory

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3