Affiliations:
1. Northwestern Polytechnical University, China
2. San Diego State University, USA
3. Indiana University, USA
4. Institute of Automation, Chinese Academy of Sciences, China
Abstract
Learning causality from large-scale text corpora is an important task with numerous applications, for example in finance, biology, medicine, and scientific discovery. Prior studies have focused mainly on simple causality, which involves only a single cause-effect pair. However, causality is notoriously difficult to understand and analyze when multiple cause spans and their entangled interactions are involved. To detect such complex causality, we propose a self-paced contrastive learning model, N2NCause, that learns the entangled interactions between multiple spans. Specifically, N2NCause introduces data enhancement operations that convert implicit causal expressions into explicit ones with the most rational causal connectives to synthesize positive samples, and that invert the directed connection between a cause-effect pair to synthesize negative samples. To capture the semantic dependency and causal direction encoded in these positive and negative samples, self-paced contrastive learning is applied to model the entangled interactions among spans, including the interaction direction and the interaction field. We evaluated N2NCause on three cause-effect detection tasks. The experimental results show that, with minimal data annotation effort, N2NCause achieves competitive performance in detecting simple cause-effect relations and outperforms existing solutions in detecting complex causality.
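To make the training objective described above concrete, the following is a minimal sketch (not the authors' released code) of a self-paced contrastive loss over sentence embeddings. It assumes the positive samples (explicit-connective rewrites) and negative samples (direction-inverted cause-effect pairs) have already been synthesized and encoded upstream; the function name, the sigmoid-based pacing weight, and the `age` parameter are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a self-paced contrastive objective (InfoNCE-style),
# assuming precomputed embeddings for anchors, explicit-connective positives,
# and direction-inverted negatives. Not the authors' implementation.
import torch
import torch.nn.functional as F


def self_paced_contrastive_loss(anchor, positive, negatives,
                                temperature=0.1, age=1.0):
    """
    anchor:    (B, D) embeddings of the original (possibly implicit) causal sentences
    positive:  (B, D) embeddings of explicit-connective rewrites (positive samples)
    negatives: (B, K, D) embeddings of direction-inverted rewrites (negative samples)
    age:       self-pacing parameter; larger values admit harder samples
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    # Cosine similarities scaled by temperature.
    pos_sim = (anchor * positive).sum(-1, keepdim=True) / temperature      # (B, 1)
    neg_sim = torch.einsum("bd,bkd->bk", anchor, negatives) / temperature  # (B, K)

    # The positive sample sits at index 0 of the logits.
    logits = torch.cat([pos_sim, neg_sim], dim=1)                          # (B, 1+K)
    targets = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    per_sample = F.cross_entropy(logits, targets, reduction="none")

    # Self-paced weighting: samples whose loss exceeds the current pacing
    # threshold contribute less, so easy samples dominate early training
    # and harder ones are admitted as `age` grows.
    with torch.no_grad():
        weights = torch.sigmoid(age - per_sample)
    return (weights * per_sample).mean()


# Example usage with random embeddings standing in for encoder outputs.
B, K, D = 8, 4, 256
loss = self_paced_contrastive_loss(torch.randn(B, D),
                                   torch.randn(B, D),
                                   torch.randn(B, K, D))
```

In a training loop, `age` would typically be increased over epochs so that the model first fits easy, clearly marked causal pairs and only later attends to harder, entangled ones; the exact pacing schedule here is an assumption for illustration.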
Funder
Natural Science Foundation of China
Publisher
Association for Computing Machinery (ACM)