Invariant Node Representation Learning under Distribution Shifts with Multiple Latent Environments
Published: 2023-08-18
Issue: 1
Volume: 42
Pages: 1-30
ISSN: 1046-8188
Container-title: ACM Transactions on Information Systems
Language: en
Short-container-title: ACM Trans. Inf. Syst.
Author:
Li, Haoyang (1), ORCID
Zhang, Ziwei (1), ORCID
Wang, Xin (2), ORCID
Zhu, Wenwu (2), ORCID
Affiliation:
1. Department of Computer Science and Technology, Tsinghua University, China
2. Department of Computer Science and Technology, BNRist, Tsinghua University, China
Abstract
Node representation learning methods, such as graph neural networks, show promising results when testing and training graph data come from the same distribution. However, existing approaches fail to generalize under distribution shifts when the nodes reside in multiple latent environments. How to learn invariant node representations to handle distribution shifts with multiple latent environments remains unexplored. In this article, we propose a novel Invariant Node representation Learning (INL) approach capable of generating invariant node representations based on the invariant patterns under distribution shifts with multiple latent environments by leveraging the invariance principle. Specifically, we define invariant and variant patterns as ego-subgraphs of each node and identify the invariant ego-subgraphs by jointly accounting for node features and graph structures. To infer the latent environments of nodes, we propose a contrastive modularity-based graph clustering method based on the variant patterns. We further propose an invariant learning module to learn node representations that can generalize under distribution shifts. We theoretically show that our proposed method achieves guaranteed performance under distribution shifts. Extensive experiments on both synthetic and real-world node classification benchmarks demonstrate that our method greatly outperforms state-of-the-art baselines under distribution shifts.
Funder
National Key Research and Development Program of China
National Natural Science Foundation of China
Beijing National Research Center for Information Science and Technology
Beijing Key Lab of Networked Multimedia, China National Postdoctoral Program for Innovative Talents
China Postdoctoral Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Science Applications; General Business, Management and Accounting; Information Systems