Abstract
Despite the popularity and efficiency of dictionary-based sentiment analysis (DSA) for public health research, limited empirical evidence has been produced about the validity of DSA and potential harms to the validity of DSA. A random sample of a second-hand Ebola tweet dataset was used to evaluate the validity of DSA compared to the manual coding approach and examine the influences of textual features on the validity of DSA. The results revealed substantial inconsistency between DSA and the manual coding approach. The presence of certain textual features such as negation can partially account for the inconsistency between DSA and manual coding. The findings imply that scholars should be careful and critical about findings in disease-related public health research that use DSA. Certain textual features should be more carefully addressed in DSA.
Subject
Health, Toxicology and Mutagenesis,Public Health, Environmental and Occupational Health
Cited by
5 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献