Abstract
Graph sampling plays an important role in data mining for large networks. In practice, larger networks are usually sampled at lower sampling rates. In this setting, traditional traversal-based sampling methods tend to over-sample the densely connected core nodes of the network. To address this issue, this paper proposes a sampling method for unknown networks at low sampling rates, called SLSR. SLSR first uses random node sampling to estimate the average degree of the unknown network and a degree threshold that separates the core from the periphery, and then runs a double-layer sampling strategy on the core and the periphery. SLSR is simple, which makes it time-efficient, and experiments verify that it accurately preserves many critical structures of unknown large scale-free networks at low sampling rates and with low variance.
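The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' SLSR implementation. The abstract does not say how the threshold is computed or how the two layers are sampled, so the following are assumptions made only for illustration: the function names `pilot_estimates` and `two_layer_sample`, the parameters `core_quantile` and `core_share`, the use of an empirical degree quantile as the threshold, and a uniform draw within each layer. The graph is assumed to be a dictionary mapping each node to a set of neighbours.

```python
import random


def pilot_estimates(adj, pilot_size, core_quantile=0.9, rng=random):
    """Estimate the average degree and a core/periphery degree threshold
    from a uniform random node sample (the pilot stage)."""
    nodes = list(adj)
    pilot = rng.sample(nodes, min(pilot_size, len(nodes)))
    degrees = sorted(len(adj[v]) for v in pilot)
    avg_degree = sum(degrees) / len(degrees)
    # Assumption: the threshold is an empirical quantile of the pilot degrees.
    idx = min(len(degrees) - 1, int(core_quantile * len(degrees)))
    return pilot, avg_degree, degrees[idx]


def two_layer_sample(adj, budget, pilot_size, core_share=0.5, seed=0):
    """Split the node budget between a core layer and a periphery layer."""
    rng = random.Random(seed)
    pilot, avg_degree, threshold = pilot_estimates(adj, pilot_size, rng=rng)

    # Candidates are the pilot nodes plus the neighbours they reveal,
    # which is all a crawler of an unknown network would have seen so far.
    candidates = set(pilot)
    for v in pilot:
        candidates.update(adj[v])

    # Classify candidates by the estimated degree threshold.
    core = sorted(v for v in candidates if len(adj[v]) >= threshold)
    periphery = sorted(v for v in candidates if len(adj[v]) < threshold)

    # Assumption: a fixed share of the budget goes to the core layer,
    # and nodes are drawn uniformly at random inside each layer.
    core_quota = min(len(core), round(core_share * budget))
    peri_quota = min(len(periphery), budget - core_quota)
    sample = rng.sample(core, core_quota) + rng.sample(periphery, peri_quota)
    return sample, avg_degree, threshold
```

Capping the core quota is one simple way to limit the preference for core nodes that the abstract attributes to traversal-based sampling; the paper itself may balance the two layers differently.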
Publisher
Springer Science and Business Media LLC