Trust evaluation of health websites by eliminating phishing websites and using similarity techniques

Published:2023-03-16 Issue:21 Volume:35 Page:
ISSN:1532-0626
Container-title:Concurrency and Computation: Practice and Experience
language:en
Short-container-title:Concurrency and Computation

Author:

Gupta Sarika¹, Bansal Himani¹

Affiliation:

1. Department of CSE & IT, Jaypee Institute of Information Technology, Noida, India

Abstract

Users rely on search engines to find health information on websites. Our research considers content‐rich health websites, because wrong information on these websites can threaten life. A search engine returns a list of URLs related to the search keyword, and users generally follow the top websites displayed. Newly constructed websites do not have ratings, hit counts, or reviews, so the search engine does not display them among its top results. As a result, a newly constructed website with the same content as the website displayed at the top of the search results loses the user's trust. Another problem is that the Google Search engine also displays phishing website URLs, which appear similar to genuine websites. To solve these problems and enhance users' trust in health websites that are not at the top of the search results, we propose an approach that extracts all URLs based on the keyword and identifies the legitimate URLs using a machine learning classifier. Address bar features, domain name features, and HTML and JavaScript features were identified for the dataset used to obtain legitimate URLs. Three classifiers (Decision Tree, Random Forest, and Support Vector Machine) were trained and evaluated. Decision Tree has the highest training accuracy (94.125), testing accuracy (92.75), and precision score (96.97). The cross‐validation score of all three models is almost 93. Therefore, Decision Tree is used to identify legitimate websites. After the list of legitimate URLs is obtained, all the content of each legitimate website is extracted. The semantic similarity between the content of the top‐ranked legitimate website and that of the other legitimate websites is computed using natural language processing techniques. The websites are then ranked by similarity and assigned a trust value from highly trustable to less trustable. We have compared and correlated our results with the Web of Trust, a reputation tool for trust analysis, and have achieved a positive correlation. Thus, our approach removes phishing websites and enhances trust in other websites that are not at the top of the search results.
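
The similarity-based ranking step described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the specific NLP technique, so plain term-frequency cosine similarity is used here as a stand-in for semantic similarity, and the function names, URLs, and page texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts.

    Assumption: a bag-of-words stand-in, not the paper's actual measure.
    """
    tf_a, tf_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)


def rank_by_similarity(reference_text, candidates):
    """Return (url, score) pairs sorted from most to least similar."""
    scored = [
        (url, cosine_similarity(reference_text, text))
        for url, text in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical content: the top-ranked legitimate site and two other
# sites that the classifier is assumed to have labeled legitimate.
reference = "Drink water and rest to recover from a mild fever"
candidates = {
    "https://example.org/a": "Rest and drink water to recover from mild fever",
    "https://example.org/b": "Cheap flights and hotel deals for summer travel",
}

ranking = rank_by_similarity(reference, candidates)
for url, score in ranking:
    print(f"{score:.3f}  {url}")
```

Sites nearer the top of the resulting list would receive higher trust values. A real pipeline would likely replace the bag-of-words measure with one that captures meaning, such as embedding-based similarity.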

Publisher

Wiley

Subject

Computational Theory and Mathematics, Computer Networks and Communications, Computer Science Applications, Theoretical Computer Science, Software

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.7695

References (46 articles)

1. “Website.” Wikipedia. 2021. Accessed August 2021. https://en.wikipedia.org/w/index.php?title=Website&oldid=1026709417

2. On Deep Learning for Trust-Aware Recommendations in Social Networks

3. A systematic review and research perspective on recommender systems

4. Facing the cold start problem in recommender systems

Cited by 1 article

1. A cyber defense system against phishing attacks with deep learning game theory and LSTM-CNN with African vulture optimization algorithm (AVOA); International Journal of Information Security; 2024-05-05