Ensemble of deep learning language models to support the creation of living systematic reviews for the COVID-19 literature

Author:

Knafou JulienORCID,Haas Quentin,Borissov Nikolay,Counotte MichelORCID,Low NicolaORCID,Imeri Hira,Ipekci Aziz Mert,Buitrago-Garcia Diana,Heron Leonie,Amini Poorya,Teodoro DouglasORCID

Abstract

AbstractBackgroundThe COVID-19 pandemic has led to an unprecedented amount of scientific publications, growing at a pace never seen before. Multiple living systematic reviews have been developed to assist professionals with up-to-date and trustworthy health information, but it is increasingly challenging for systematic reviewers to keep up with the evidence in electronic databases. We aimed to investigate deep learning-based machine learning algorithms to classify COVID-19 related publications to help scale-up the epidemiological curation process.MethodsIn this retrospective study, five different pre-trained deep learning-based language models were fine-tuned on a dataset of 6,365 publications manually classified into two classes, three subclasses and 22 sub-subclasses relevant for epidemiological triage purposes. In ak-fold cross-validation setting, each standalone model was assessed on a classification task and compared against an ensemble, which takes the standalone model predictions as input and uses different strategies to infer the optimal article class. A ranking task was also considered, in which the model outputs a ranked list of sub-subclasses associated with the article.ResultsThe ensemble model significantly outperformed the standalone classifiers, achieving a F1-score of 89.2 at the class level of the classification task. The difference between the standalone and ensemble models increases at the sub-subclass level, where the ensemble reaches a micro F1-score of 70% against 67% for the best performing standalone model. For the ranking task, the ensemble obtained the highest recall@3, with a performance of 89%. Using an unanimity voting rule, the ensemble can provide predictions with higher confidence on a subset of the data, achieving detection of original papers with a F1-score up to 97% on a subset of 80% of the collection instead of 93% on the whole dataset.ConclusionThis study shows the potential of using deep learning language models to perform triage of COVID-19 references efficiently and support epidemiological curation and review. The ensemble consistently and significantly outperforms any standalone model. Fine-tuning the voting strategy thresholds is an interesting alternative to annotate a subset with higher predictive confidence.

Publisher

Cold Spring Harbor Laboratory

Reference47 articles.

1. LitCovid: an open database of COVID-19 literature

2. Ipekci AM , Buitrago-Garcia D , Meili KW , Krauer F , Prajapati N , Thapa S , et al. Outbreaks of publications about emerging infectious diseases: the case of SARS-CoV-2 and Zika virus. BMC Med Res Methodol. 2021;50–50.

3. Lu Wang L , Lo K , Chandrasekhar Y , Reas R , Yang J , Eide D , et al. CORD-19: The Covid-19 Open Research Dataset. 2020 [cited 2022 Jun 29]; Available from: https://search.bvsalud.org/global-literature-on-novel-coronavirus-2019-ncov/resource/en/ppcovidwho-2130

4. Counotte M , Imeri H , Leonie H , Ipekci M , Low N. Living Evidence on COVID-19 [Internet]. 2020 [cited 2022 Jun 29]. Available from: https://ispmbern.github.io/covid-19/living-review/

5. The COVID-NMA initiative [Internet]. [cited 2022 Jun 29]. Available from: https://covid-nma.com/

Cited by 1 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3