Automated Extraction and Visualization of Metabolic Networks from Biomedical Literature Using a Large Language Model

Author:

Phongwattana Thiptanawat, Chan Jonathan H

Abstract

The rapid growth of biomedical literature makes it difficult for researchers to extract and analyze relevant information efficiently. In this study, we explore the use of GPT, a large language model, to automate the extraction and visualization of metabolic networks from a corpus of PubMed abstracts. Our objective is to give biomedical researchers a tool for exploring and understanding the intricate metabolic interactions discussed in the scientific literature. We begin by splitting the corpus into smaller segments, because the GPT-3.5-Turbo model has a limit of 4,000 tokens per request. Through iterative prompt optimization, we extract a comprehensive list of metabolites, enzymes, and proteins from the abstracts. To validate the accuracy and completeness of the extracted entities, our biomedical domain experts compared them against the source abstracts and confirmed that they fully matched. Using the extracted entities, we generate a directed graph that represents the metabolic network and covers three types of metabolic events: metabolic consumption, metabolic reaction, and metabolic production. The graph visualization, built with Python and NetworkX, gives a clear representation of metabolic pathways and highlights the relationships between metabolites, enzymes, and proteins. Our approach integrates language models with network analysis, combining automated information extraction with graph visualization. The research contributions are twofold. First, we show that GPT-3.5-Turbo can automatically extract metabolic entities, streamlining the cataloging of important components in metabolic research. Second, we present the generation and visualization of a directed graph that provides an overview of metabolic interactions. This graph can support further analysis, comparison with existing pathways, and the updating or refinement of metabolic networks. Our findings underscore the potential of large language models and network analysis techniques for extracting and visualizing metabolic information from scientific literature. This approach helps researchers gain insight into complex biological systems and advances our understanding of metabolic pathways and their components.
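The two mechanical steps described in the abstract, splitting the corpus to respect the token limit and turning extracted events into a directed graph, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. The function names, the whitespace-based token estimate, the `(source, event_type, target)` event format, and the example entities are all assumptions introduced here; the abstract does not specify how events are encoded as edges.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed event format (hypothetical): (source entity, event type, target entity).
Event = Tuple[str, str, str]
Edge = Tuple[str, str, Dict[str, str]]

# The three event types named in the abstract.
EVENT_TYPES = {"consumption", "reaction", "production"}


def chunk_abstracts(abstracts: Iterable[str], max_tokens: int = 3000) -> List[List[str]]:
    """Greedily group abstracts so each group stays within max_tokens.

    Assumption: whitespace-separated words stand in for model tokens.
    A real pipeline would count tokens with the model's own tokenizer
    and leave headroom for the prompt and the response.
    A single abstract larger than max_tokens still gets its own group.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for text in abstracts:
        n = len(text.split())
        if current and used + n > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(text)
        used += n
    if current:
        chunks.append(current)
    return chunks


def events_to_edges(events: Iterable[Event]) -> List[Edge]:
    """Convert events into (source, target, attributes) edge triples."""
    edges: List[Edge] = []
    for source, event_type, target in events:
        if event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event_type!r}")
        edges.append((source, target, {"event": event_type}))
    return edges


def build_graph(edges: Iterable[Edge]):
    """Build a NetworkX directed graph from edge triples."""
    import networkx as nx  # imported here so the helpers above need only the stdlib

    graph = nx.DiGraph()
    graph.add_edges_from(edges)
    return graph


# Illustrative events only; these are not results from the paper.
EXAMPLE_EVENTS: List[Event] = [
    ("glucose", "consumption", "hexokinase"),
    ("hexokinase", "production", "glucose-6-phosphate"),
    ("glucose", "reaction", "glucose-6-phosphate"),
]
```

With NetworkX installed, `build_graph(events_to_edges(EXAMPLE_EVENTS))` returns a `DiGraph` whose edges carry an `event` attribute, and `networkx.draw(graph, with_labels=True)` can render it when matplotlib is available. Storing the event type as an edge attribute is one plausible design; it keeps a single node per entity while still allowing edges to be filtered or colored by event type.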

Publisher

Cold Spring Harbor Laboratory

Cited by 1 article.

1. Analyzing the Future of ChatGPT in Medical Research; Artificial Intelligence Applications Using ChatGPT in Education; 2023-09-15
