Affiliation:
1. Eurecat - Centre Tecnològic de Catalunya, Barcelona, Spain
2. DISI, University of Trento, Trento, Italy
Abstract
Surfing the links between Wikipedia articles is a valuable way to acquire new knowledge about a topic by exploring its connections to other pages. In this context, Personalized PageRank is a well-known option for making sense of the graph of links between pages and identifying the articles most relevant to a given one; its performance, however, is hindered by pages with high indegree that function as hubs and obtain high scores regardless of the starting point. In this work, we present CycleRank, a novel algorithm based on cyclic paths aimed at finding the most relevant nodes related to a topic. To compare the results of CycleRank with those of Personalized PageRank and other algorithms derived from it, we perform three experiments based on different ground truths. We find that CycleRank aligns better with readers’ behaviour, as it ranks higher the articles corresponding to links that receive more clicks; it tends to rank higher the related articles highlighted by editors in ‘See also’ sections; and it is more robust to global hubs of the network with high indegree. Finally, we show that computing CycleRank is two orders of magnitude faster than computing the other baselines.
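The idea of scoring nodes by the cyclic paths that pass through a reference article can be illustrated with a small sketch. The abstract does not give the scoring rule, so everything specific here is an assumption made for illustration: the function name `cycle_scores`, the cap `max_len` on cycle length, and the exponential decay `exp(-n)` used to weight a cycle of length `n`. It is not the authors' implementation, only a minimal example of why a page that is linked to by everyone but links back to nothing relevant receives no score under a cycle-based scheme.

```python
import math
from collections import defaultdict


def cycle_scores(graph, ref, max_len=3, weight=lambda n: math.exp(-n)):
    """Score nodes by the simple cycles through `ref` with at most `max_len` edges.

    `graph` maps each node to an iterable of its out-neighbours.
    `max_len` and `weight` are illustrative assumptions, not taken from the paper.
    """
    scores = defaultdict(float)

    def dfs(node, path, visited):
        for nxt in graph.get(node, ()):
            if nxt == ref:
                # The path closes back on the reference node: a cycle whose
                # number of edges equals the number of nodes in `path`.
                w = weight(len(path))
                for v in path:
                    scores[v] += w
            elif nxt not in visited and len(path) < max_len:
                visited.add(nxt)
                path.append(nxt)
                dfs(nxt, path, visited)
                path.pop()
                visited.remove(nxt)

    dfs(ref, [ref], {ref})
    return dict(scores)


# Toy link graph: every page links to "Hub", but "Hub" links back to nothing.
toy_graph = {
    "A": ["B", "C", "Hub"],
    "B": ["A", "C", "Hub"],
    "C": ["A", "Hub"],
    "Hub": [],
}

ranking = sorted(cycle_scores(toy_graph, "A").items(), key=lambda kv: -kv[1])
for page, score in ranking:
    print(f"{page}: {score:.4f}")
```

In this toy graph, "Hub" has the highest indegree yet lies on no cycle through "A", so it does not appear in the ranking at all, whereas a random-walk score would still assign it mass.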
Funder
Horizon 2020 Framework Programme
Subject
General Physics and Astronomy, General Engineering, General Mathematics
Cited by
3 articles.
1. Comparing Personalized Relevance Algorithms for Directed Graphs;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13
2. Wikipedia Reader Navigation;Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining;2022-02-11
3. A general method for estimating the prevalence of influenza-like-symptoms with Wikipedia data;PLOS ONE;2021-08-31