Using Twitter To Generate Signals For The Enhancement Of Syndromic Surveillance Systems: Semi-Supervised Classification For Relevance Filtering in Syndromic Surveillance

Author:

Edo-Osagie Oduwa, Smith Gillian, Lake Iain, Edeghere Obaghe, De La Iglesia Beatriz

Abstract

We investigate the use of Twitter data to deliver signals for syndromic surveillance, in order to assess its ability to augment existing syndromic surveillance efforts and to give a better understanding of symptomatic people who do not seek health care advice directly. We focus on a specific syndrome: asthma/difficulty breathing. We outline data collection using the Twitter streaming API, as well as analysis and pre-processing of the collected data. Even with keyword-based data collection, many of the tweets collected are not relevant because they represent chatter or talk of awareness rather than someone suffering from a particular condition. In light of this, we set out to identify relevant tweets in order to collect a strong and reliable signal. To do so, we investigate text classification techniques, focusing in particular on semi-supervised classification, since it enables us to use more of the collected Twitter data without needing to label it all. In this paper, we propose a semi-supervised approach to symptomatic tweet classification and relevance filtering. We also propose the use of emojis and other special features capturing the tweet’s tone to improve classification performance. Our results show that negative emojis and those that denote laughter provide the best classification performance in conjunction with a simple bag-of-words approach. We obtain good performance on classifying symptomatic tweets with both supervised and semi-supervised algorithms, and find that the proposed semi-supervised algorithms preserve more of the relevant tweets and may be advantageous in the context of a weak signal. Finally, we find some correlation (r = 0.414, p = 0.0004) between the Twitter signal generated with the semi-supervised system and data from consultations for related health conditions.
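As a rough illustration of the kind of pipeline the abstract describes, the sketch below combines bag-of-words features with two emoji tone markers and applies generic self-training: a classifier trained on labelled tweets pseudo-labels the unlabelled tweets it is confident about, then retrains. This is not necessarily the authors' algorithm. The emoji sets, the example tweets, the confidence threshold, the naive Bayes classifier, and all function names are assumptions made purely for illustration.

```python
import math
from collections import Counter

# Hypothetical emoji groups, chosen only for illustration.
NEGATIVE_EMOJIS = {"😷", "😫", "😢"}
LAUGHTER_EMOJIS = {"😂", "🤣"}


def featurize(text):
    """Bag of words plus marker tokens for emoji tone.

    Assumes emojis are separated from words by whitespace.
    """
    feats = Counter()
    for tok in text.lower().split():
        if tok in NEGATIVE_EMOJIS:
            feats["__neg_emoji__"] += 1
        elif tok in LAUGHTER_EMOJIS:
            feats["__laugh_emoji__"] += 1
        else:
            feats[tok] += 1
    return feats


def train_nb(examples):
    """Fit a multinomial naive Bayes model on (features, label) pairs."""
    class_counts = Counter(label for _, label in examples)
    word_counts = {label: Counter() for label in class_counts}
    for feats, label in examples:
        word_counts[label].update(feats)
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return class_counts, word_counts, vocab


def predict_proba(model, feats):
    """Return {label: probability} using add-one (Laplace) smoothing."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for word, count in feats.items():
            if word in vocab:  # words never seen in training are ignored
                score += count * math.log((word_counts[label][word] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exp.values())
    return {label: v / norm for label, v in exp.items()}


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Generic self-training: pseudo-label confident tweets, then retrain."""
    train = [(featurize(text), label) for text, label in labeled]
    pool = [featurize(text) for text in unlabeled]
    model = train_nb(train)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for feats in pool:
            probs = predict_proba(model, feats)
            best = max(probs, key=probs.get)
            if probs[best] >= threshold:
                confident.append((feats, best))
            else:
                remaining.append(feats)
        if not confident:
            break  # nothing new to learn from
        train.extend(confident)
        pool = remaining
        model = train_nb(train)
    return model


# Made-up tweets, for illustration only.
labeled = [
    ("my asthma is awful tonight cannot breathe 😷", "relevant"),
    ("wheezing again need my inhaler 😫", "relevant"),
    ("this joke took my breath away 😂", "irrelevant"),
    ("asthma awareness week starts today", "irrelevant"),
]
unlabeled = [
    "cannot breathe need my inhaler now 😫",
    "that film took my breath away 😂",
]

model = self_train(labeled, unlabeled)
probs = predict_proba(model, featurize("cannot breathe inhaler 😷"))
prediction = max(probs, key=probs.get)
```

Counting the tweets predicted as relevant per day would then give a time series that could be compared, for example with a Pearson correlation, against consultation counts. A real system would need a proper tokenizer, a much larger labelled set, and a held-out evaluation.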

Publisher

Cold Spring Harbor Laboratory

