Abstract
Label propagation is frequently encountered in machine learning and data mining applications on graphs, either as a standalone problem or as part of node classification. Many label propagation algorithms use random walks (or network propagation), which provide limited ability to take into account negatively-labeled nodes (i.e., nodes that are known not to be associated with the label of interest). Specialized algorithms that incorporate negatively-labeled nodes generally focus on learning or readjusting the edge weights to drive walks away from negatively-labeled nodes and toward positively-labeled nodes. This approach has several disadvantages: it increases the number of parameters to be learned, and it does not necessarily drive the walk away from regions of the network that are rich in negatively-labeled nodes. We reformulate random walk with restarts and network propagation to enable "variable restarts", that is, an increased likelihood of restarting at a positively-labeled node when a negatively-labeled node is encountered. Based on this reformulation, we develop CusTaRd, an algorithm that effectively combines variable restart probabilities and edge re-weighting to avoid negatively-labeled nodes. To assess the performance of CusTaRd, we perform comprehensive experiments on network datasets commonly used in benchmarking label propagation and node classification algorithms. Our results show that CusTaRd consistently outperforms competing algorithms that learn edge weights or restart profiles, and that negatives close to positive examples are generally more informative than more distant negatives.
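
The sketch below illustrates the variable-restart idea described in the abstract: a power-iteration form of random walk with restarts in which negatively-labeled nodes receive a higher probability of restarting the walk at a positive seed, pulling probability mass away from negative regions without re-learning edge weights. This is a minimal illustrative example, not the authors' CusTaRd implementation; the function name, the specific restart-boost scheme, and parameters such as alpha and boost are assumptions made for illustration.

import numpy as np

def propagate_variable_restart(A, positives, negatives,
                               alpha=0.15, boost=0.5,
                               max_iter=100, tol=1e-8):
    """Network propagation with node-specific ("variable") restart probabilities.

    A          : (n, n) non-negative adjacency matrix (dense numpy array)
    positives  : indices of positively-labeled seed nodes
    negatives  : indices of negatively-labeled nodes
    alpha      : baseline restart probability
    boost      : extra restart probability applied at negative nodes
    Returns the stationary visit probabilities over all nodes.
    """
    n = A.shape[0]

    # Column-normalize the adjacency matrix into a transition matrix.
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    W = A / col_sums

    # Restart distribution: uniform over the positive seeds.
    r = np.zeros(n)
    r[positives] = 1.0 / len(positives)

    # Per-node restart probability: baseline everywhere, boosted at negatives
    # (this boosting rule is an assumption for illustration).
    restart = np.full(n, alpha)
    restart[negatives] = min(alpha + boost, 1.0)

    # Power iteration: mass at node j either follows an outgoing edge
    # (with probability 1 - restart[j]) or jumps back to a positive seed.
    p = r.copy()
    for _ in range(max_iter):
        stay = (1.0 - restart) * p            # mass that keeps walking
        p_new = W @ stay + r * (restart * p).sum()
        if np.abs(p_new - p).sum() < tol:
            p = p_new
            break
        p = p_new
    return p

In this sketch, raising the restart probability at negative nodes redirects walk mass back to the positive seeds without modifying edge weights; per the abstract, CusTaRd combines such variable restarts with edge re-weighting.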
Publisher
Springer Science and Business Media LLC