Author:
Yang Juan, Zhou Guanghong, Wang Ronggui, Xue Lixia
Abstract
Existing pre-trained models have yielded promising results in reducing computational time. However, these models focus only on pruning simple sentences or less salient words, while neglecting relatively complex sentences, and it is frequently these sentences that cause the loss of model accuracy. This indicates that the adaptivity of existing models is one-sided. To address this issue, we propose a sample-adaptive training and inference model. Specifically, complex samples are extracted from the training datasets, and a dedicated data augmentation module is trained to extract their global and local semantic information. During inference, simple samples can exit the model early via the Sample Adaptive Exit Mechanism, normal samples pass through the whole backbone model before prediction, and complex samples are additionally processed by the Characteristic Enhancement Module after passing through the backbone. In this way, all samples are processed adaptively. Extensive experiments on classification datasets in Natural Language Processing demonstrate that our method improves accuracy and reduces inference time on multiple datasets. Moreover, our method is transferable and can be applied to multiple pre-trained models.
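The three-way routing described in the abstract could be sketched as follows. This is a minimal illustrative sketch only: the entropy-based exit criterion, the threshold values, and the module interfaces (`backbone_layers`, `classifiers`, `enhancement_module`) are assumptions for exposition, not the paper's actual implementation.

```python
# Sketch of sample-adaptive inference routing (assumed entropy-based criterion).
import torch
import torch.nn.functional as F


def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Entropy of the softmax distribution; low entropy = confident prediction."""
    probs = F.softmax(logits, dim=-1)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)


def adaptive_inference(sample, backbone_layers, classifiers, enhancement_module,
                       exit_threshold=0.2, complex_threshold=1.0):
    """
    Route one sample through the model:
      * simple  -> exits at an intermediate layer once the prediction is confident
      * normal  -> runs the full backbone, then classifies
      * complex -> runs the full backbone, then a feature-enhancement module
    Thresholds and the enhancement interface are hypothetical placeholders.
    """
    hidden = sample
    for layer, clf in zip(backbone_layers, classifiers):
        hidden = layer(hidden)
        logits = clf(hidden)
        if predictive_entropy(logits).item() < exit_threshold:
            return logits                      # simple sample: early exit

    # Full backbone was needed; decide whether the sample counts as "complex".
    if predictive_entropy(logits).item() > complex_threshold:
        hidden = enhancement_module(hidden)    # complex sample: enhance features
        logits = classifiers[-1](hidden)
    return logits                              # normal (or enhanced complex) sample
```

In such a design, the per-layer classifiers supply the exit signal for simple samples, while the same confidence measure after the last layer decides whether the extra enhancement step is applied.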
Funder
National Natural Science Foundation of China
National Key R&D Program of China
Publisher
Springer Science and Business Media LLC