Search-engine-based surveillance using artificial intelligence for early detection of coronavirus disease outbreak

Author:

Wang Ligui, Liu Yuqi, Chen Hui, Qiu Shaofu, Liu Yonghong, Yang Mingjuan, Du Xinying, Li Zhenjun, Hao Rongzhang, Tian Huaiyu, Song Hongbin

Abstract

Existing search-engine-based surveillance methods for the early warning and prediction of infectious diseases cannot filter search engine keywords automatically or update them in real time, which leaves them ineffective for the early warning of emerging infectious diseases. The aim of this study is to develop an artificial intelligence (AI) method for search-engine-based surveillance to improve the early warning ability for emerging infectious diseases. A total of 32 keywords (444 million search queries) that may be related to the coronavirus disease (COVID-19) outbreak were collected from Baidu’s search engine database for the period from December 18, 2019 to February 11, 2020. A graph convolution network (GCN) model was used to select search engine keywords automatically, and multiple linear regression was then performed to explore the relationship between the daily query frequencies of the keywords and the daily number of new cases. The trend predicted by the GCN model was highly consistent with the true curve, with a mean absolute error of 81.65. Three keywords, “epidemic”, “mask”, and “coronavirus”, were selected. The query frequencies of the selected keywords were highly correlated with the daily number of confirmed cases (r = 0.96, 0.94, and 0.89; P < 0.01). An abnormal initial peak in queries (3.05 times the normal volume) appeared on December 31, 2019, which could have served as an early warning signal for an outbreak. Of particular concern, 17.5% of the query volume originated from Hubei Province, 51.15% of which was from Wuhan City. Using the selected keywords, the coefficients of determination (R²) of the constructed model were 0.88, 0.88, 0.84, 0.77, 0.77, 0.75, 0.73, and 0.73 for time lags of 0 to 7 days, respectively.
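The time-lag analysis can be illustrated with a minimal sketch. It is not the authors' code: the function names (`fit_ols`, `lagged_r2`) and the synthetic series are assumptions made for illustration, and ordinary least squares is solved through the normal equations in plain Python. For a lag of k days, keyword queries on day t are paired with new cases on day t + k, and R² is computed for each lag.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def r_squared(X, y, beta):
    """Coefficient of determination of the fitted model on (X, y)."""
    pred = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def lagged_r2(queries, cases, lag):
    """R² when keyword queries on day t are used to explain cases on day t + lag."""
    X = queries[: len(queries) - lag]
    y = cases[lag:]
    return r_squared(X, y, fit_ols(X, y))


# Synthetic data (NOT from the study): three keyword series, with cases
# constructed to follow the queries exactly two days later.
days = 30
queries = [[float(t), float((t * 7) % 5), float((t * 3) % 4)] for t in range(days)]
cases = [0.0, 0.0] + [2.0 * q[0] + 3.0 * q[1] + 1.0 * q[2] + 5.0 for q in queries[:-2]]

# One R² per lag from 0 to 7 days, mirroring the lag range in the abstract.
r2_by_lag = {lag: lagged_r2(queries, cases, lag) for lag in range(8)}
```

On this synthetic input the fit is best at the lag that was built into the data (2 days). With real query and case counts, the lag with the highest R² would indicate how far ahead the queries carry information about reported cases.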
The constructed model was then applied to the Beijing Xinfadi outbreak as an independent test dataset; it successfully predicted the daily numbers of cases for the following days and detected an early signal of that outbreak (R² = 0.79). This paper establishes, for the first time, AI-based search-engine surveillance for the early detection of the COVID-19 epidemic. The model achieves automatic filtering and real-time updating of search engine keywords and can effectively detect the early signals of emerging infectious diseases.
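One simple way to flag an early signal, such as an abnormal peak in query volume relative to normal levels, is to compare each day's volume with a trailing baseline. The sketch below is an assumption for illustration, not the detection rule used in the study: the function names `spike_ratios` and `first_alert`, the 7-day window, the threshold of 3.0, and the sample volumes are all hypothetical.

```python
def spike_ratios(volumes, window=7):
    """Ratio of each day's volume to the mean of the preceding `window` days.

    Assumes the baseline is non-zero (true for any keyword with regular traffic).
    """
    ratios = []
    for i in range(window, len(volumes)):
        baseline = sum(volumes[i - window:i]) / window
        ratios.append((i, volumes[i] / baseline))
    return ratios


def first_alert(volumes, window=7, threshold=3.0):
    """Index of the first day whose ratio reaches the threshold, or None."""
    for day, ratio in spike_ratios(volumes, window):
        if ratio >= threshold:
            return day
    return None


# Hypothetical daily query counts (not data from the study).
volumes = [100, 102, 98, 101, 99, 100, 100, 103, 97, 305, 260]
alert_day = first_alert(volumes)
```

A trailing mean is only one possible baseline; a median or a seasonally adjusted baseline would be less sensitive to earlier spikes, at the cost of more parameters to choose.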

Funder

National Key R&D Program of China

Publisher

Springer Science and Business Media LLC

Subject

Information Systems and Management, Computer Networks and Communications, Hardware and Architecture, Information Systems
