TLC-XML: Transformer with Label Correlation for Extreme Multi-label Text Classification

Published:2024-02-10 Issue:1 Volume:56
ISSN:1573-773X
Container-title:Neural Processing Letters
language:en
Short-container-title:Neural Process Lett

Author:

Zhao Fei,Ai Qing,Li Xiangna,Wang Wenhui,Gao Qingyun,Liu Yichun

Abstract

Extreme multi-label text classification (XMTC) annotates related labels for unknown text from large-scale label sets. Transformer-based methods have become the dominant approach for solving the XMTC task due to their effective text representation capabilities. However, the existing Transformer-based methods fail to effectively exploit the correlation between labels in the XMTC task. To address this shortcoming, we propose a novel model called TLC-XML, i.e., a Transformer with label correlation for extreme multi-label text classification. TLC-XML comprises three modules: Partition, Matcher and Ranker. In the Partition module, we exploit the semantic and co-occurrence information of labels to construct the label correlation graph, and further partition the strongly correlated labels into the same cluster. In the Matcher module, we propose cluster correlation learning, which uses the graph convolutional network (GCN) to extract the correlation between clusters. We then introduce these valuable correlations into the classifier to match related clusters. In the Ranker module, we propose label interaction learning, which aggregates the raw label prediction with the information of the neighboring labels. The experimental results on benchmark datasets show that TLC-XML significantly outperforms state-of-the-art XMTC methods.
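As a rough illustration of two ideas named in the abstract (a label correlation graph built from co-occurrence, and a GCN that propagates information over such a graph), a minimal NumPy sketch follows. It is not the authors' implementation: the function names `cooccurrence_graph` and `gcn_layer`, the conditional-probability threshold, and the symmetric normalization are all assumptions made for illustration, and the semantic (text-based) part of the label graph is omitted entirely.

```python
import numpy as np

def cooccurrence_graph(Y, threshold=0.5):
    """Build an undirected label graph from a binary label matrix.

    Y has shape (n_samples, n_labels). Labels i and j are linked when
    P(j | i) or P(i | j) reaches `threshold` (an assumed rule, not the paper's).
    """
    Y = np.asarray(Y, dtype=float)
    counts = Y.T @ Y                                 # counts[i, j]: samples with both labels
    freq = np.diag(counts).copy()                    # freq[i]: samples with label i
    cond = counts / np.maximum(freq, 1.0)[:, None]   # cond[i, j] approximates P(j | i)
    A = (cond >= threshold).astype(float)
    A = np.maximum(A, A.T)                           # symmetrize to get an undirected graph
    np.fill_diagonal(A, 0.0)                         # self-loops are added in gcn_layer
    return A

def gcn_layer(A, H, W):
    """One GCN propagation step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                   # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))    # degrees are >= 1 after self-loops
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

# Toy data (hypothetical): 4 documents, 3 labels.
Y = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1],
     [1, 0, 1]]
A = cooccurrence_graph(Y, threshold=0.6)
H = np.eye(3)          # placeholder node features
W = np.eye(3)          # placeholder weights; learned in a real model
out = gcn_layer(A, H, W)
```

In this toy case only labels 0 and 1 co-occur often enough to be linked, so the GCN step averages their features while label 2 keeps its own.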

Funder

Natural Science Foundation of Liaoning Province

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s11063-024-11460-z.pdf

Reference: 37 articles.

1. McAuley, J.J., Pandey, R., Leskovec, J.: Inferring networks of substitutable and complementary products. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2015)

2. Jung, G., Shin, J., Lee, S.: Impact of preprocessing and word embedding on extreme multi-label patent classification tasks. Applied Intelligence 53(4), 4047–4062 (2023)

3. Jain, H., Balasubramanian, V., Chunduri, B., Varma, M.: Slice: Scalable linear extreme classifiers trained on 100 million labels for related searches. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 528–536 (2019)

4. Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

5. Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. In: International Conference on Learning Representations, pp. 25–32 (2020)

Cited by 2 articles.

1. Dual-view graph convolutional network for multi-label text classification;Applied Intelligence;2024-07-15

2. Multi-sentence and multi-intent classification using RoBERTa and graph convolutional neural network;2024-02-21