SPORT: A Subgraph Perspective on Graph Classification with Label Noise

Author:

Yin Nan¹, Shen Li², Chen Chong³, Hua Xian-Sheng³, Luo Xiao⁴

Affiliation:

1. Mohamed bin Zayed University of Artificial Intelligence, UAE

2. Sun Yat-sen University, Shenzhen & JD Explore Academy, China

3. Terminus Group, China

4. University of California, Los Angeles, USA

Abstract

Graph neural networks (GNNs) have recently achieved great success on graph classification tasks through supervised end-to-end training. Unfortunately, real-world graph data often carry extensive label noise because of the complicated process of manual annotation, which can significantly degrade the performance of GNNs. We therefore investigate the problem of graph classification with label noise, which is challenging because of the complexity of graph representation learning and the tendency of GNNs to memorize noisy samples. In this work, we present a novel approach called Subgraph Set Network with Sample Selection and Consistency Learning (SPORT) for this problem. To alleviate the overfitting of GNNs, SPORT characterizes each graph as a set of subgraphs generated by predefined strategies, which can be viewed as samples from the graph's underlying semantic distribution in graph space. We then develop an equivariant network that encodes the subgraph set while taking the symmetry group into account. To further reduce the influence of noisy examples, we leverage the predictions of subgraphs to measure the likelihood of a sample being clean or noisy, followed by effective label updating. In addition, we propose a joint loss that improves model generalizability by introducing consistency regularization. Comprehensive experiments on a wide range of graph classification datasets demonstrate the effectiveness of SPORT. In particular, SPORT outperforms the strongest baseline by up to 6.4%.
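The abstract's sample-selection idea, using the predictions of a graph's subgraphs to judge whether its observed label is clean, can be illustrated with a minimal sketch. The function names, the averaging rule, and the `threshold` parameter below are illustrative assumptions; the abstract does not specify SPORT's exact selection criterion.

```python
import numpy as np

def subgraph_label_support(subgraph_probs, given_label):
    """Average the class distributions predicted for each subgraph of one
    graph, then return the probability mass the average assigns to the
    observed label. Higher support suggests the label is more likely clean.
    (Proxy criterion for illustration only.)"""
    mean_probs = np.mean(subgraph_probs, axis=0)  # shape: (num_classes,)
    return float(mean_probs[given_label])

def select_clean(batch_subgraph_probs, labels, threshold=0.5):
    """Flag each graph as clean when its averaged subgraph prediction
    supports the observed label above a (hypothetical) threshold; graphs
    flagged noisy would then be candidates for label updating."""
    return [subgraph_label_support(probs, y) >= threshold
            for probs, y in zip(batch_subgraph_probs, labels)]

# Two 2-class graphs, each with two subgraph predictions:
# the first agrees with label 0, the second contradicts it.
batch = [[[0.9, 0.1], [0.8, 0.2]],
         [[0.2, 0.8], [0.1, 0.9]]]
print(select_clean(batch, labels=[0, 0]))  # [True, False]
```

In a full pipeline these scores would come from the equivariant subgraph-set encoder, and the clean/noisy split would feed the label-updating and consistency-regularization steps the abstract mentions.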

Publisher

Association for Computing Machinery (ACM)

