Data extraction for evidence synthesis using a large language model: A proof‐of‐concept study

Authors:

Gerald Gartlehner 1,2, Leila Kahwati 1, Rainer Hilscher 1, Ian Thomas 1, Shannon Kugley 1, Karen Crotty 1, Meera Viswanathan 1, Barbara Nussbaumer‐Streit 2, Graham Booth 1, Nathaniel Erskine 3, Amanda Konet 1, Robert Chew 1

Affiliations:

1. Social, Statistical, and Environmental Sciences, RTI International, Research Triangle Park, North Carolina, USA

2. Department for Evidence‐based Medicine and Evaluation, Danube University Krems, Krems, Austria

3. Preventive Medicine Residency Program, Department of Family Medicine, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA

Abstract

Data extraction is a crucial yet labor‐intensive and error‐prone part of evidence synthesis. To date, efforts to harness machine learning to make the data extraction process more efficient have fallen short of achieving sufficient accuracy and usability. With the release of large language models (LLMs), new possibilities have emerged to increase the efficiency and accuracy of data extraction for evidence synthesis. The objective of this proof‐of‐concept study was to assess the performance of an LLM (Claude 2) in extracting data elements from published studies, compared with human data extraction as employed in systematic reviews. Our analysis used a convenience sample of 10 English‐language, open‐access publications of randomized controlled trials included in a single systematic review. We selected 16 distinct types of data of varying degrees of difficulty (160 data elements across 10 studies). We used the browser version of Claude 2 to upload the portable document format (PDF) file of each publication and then prompted the model for each data element. Across 160 data elements, Claude 2 achieved an overall accuracy of 96.3% with high test–retest reliability (replication 1: 96.9% accuracy; replication 2: 95.0% accuracy). Overall, Claude 2 made 6 errors on 160 data elements; the most common error type (n = 4) was a missed data element. Importantly, Claude 2's ease of use was high: it required no technical expertise or labeled training data to operate effectively (i.e., zero‐shot learning). Based on the findings of our proof‐of‐concept study, leveraging LLMs has the potential to substantially enhance the efficiency and accuracy of data extraction for evidence syntheses.
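The accuracy metric reported in the abstract (correct data elements divided by total elements, judged against human extraction) can be sketched as below. This is a minimal illustration, not the study's actual code or data: the element values are made up, and only the counts (6 errors out of 160) mirror the reported result.

```python
# Minimal sketch of the element-level accuracy metric described in the
# abstract: each LLM-extracted data element is compared with the
# human-extracted reference value, and accuracy is the fraction that match.

def accuracy(llm_values, reference_values):
    """Fraction of data elements where the LLM output matches the reference."""
    assert len(llm_values) == len(reference_values)
    correct = sum(l == r for l, r in zip(llm_values, reference_values))
    return correct / len(llm_values)

# Illustrative (hypothetical) values reproducing the reported counts:
# 154 correct and 6 erroneous elements out of 160, i.e. 96.25% ≈ 96.3%.
llm_extracted = ["ok"] * 154 + ["missed"] * 6
human_reference = ["ok"] * 160
overall_accuracy = accuracy(llm_extracted, human_reference)
```

In the study itself this comparison was done by human reviewers against the systematic review's extractions; the exact-match check here stands in for that judgment.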

Publisher

Wiley
