Affiliation:
1. University of Virginia
2. University of Wisconsin-Milwaukee
3. Towson University
Abstract
Fake medical Web sites have become increasingly prevalent. Consequently, much of the health-related information and advice available online is inaccurate and/or misleading. Scores of medical institution Web sites are for organizations that do not exist and more than 90% of online pharmacy Web sites are fraudulent. In addition to monetary losses exacted on unsuspecting users, these fake medical Web sites have severe public safety ramifications. According to a World Health Organization report, approximately half the drugs sold on the Web are counterfeit, resulting in thousands of deaths. In this study, we propose an adaptive learning algorithm called recursive trust labeling (RTL). RTL uses underlying content and graph-based classifiers, coupled with a recursive labeling mechanism, for enhanced detection of fake medical Web sites. The proposed method was evaluated on a test bed encompassing nearly 100 million links between 930,000 Web sites, including 1,000 known legitimate and fake medical sites. The experimental results revealed that RTL was able to significantly improve fake medical Web site detection performance over 19 comparison content and graph-based methods, various meta-learning techniques, and existing adaptive learning approaches, with an overall accuracy of over 94%. Moreover, RTL was able to attain high performance levels even when the training dataset composed of as little as 30 Web sites. With the increased popularity of eHealth and Health 2.0, the results have important implications for online trust, security, and public safety.
Funder
Division of Computer and Network Systems
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Science Applications,General Business, Management and Accounting,Information Systems
Cited by
44 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献