Author:
Liu Yijin, Meng Fandong, Zhou Jie, Chen Yufeng, Xu Jinan
Abstract
Depth-adaptive neural networks can dynamically adjust depths according to the hardness of input words, and thus improve efficiency. The main challenge is how to measure such hardness and decide the required depths (i.e., layers) to conduct. Previous works generally build a halting unit to decide whether the computation should continue or stop at each layer. As there is no specific supervision of depth selection, the halting unit may be under-optimized and inaccurate, which results in suboptimal and unstable performance when modeling sentences. In this paper, we get rid of the halting unit and estimate the required depths in advance, which yields a faster depth-adaptive model. Specifically, two approaches are proposed to explicitly measure the hardness of input words and estimate corresponding adaptive depth, namely 1) mutual information (MI) based estimation and 2) reconstruction loss based estimation. We conduct experiments on the text classification task with 24 datasets in various sizes and domains. Results confirm that our approaches can speed up the vanilla Transformer (up to 7x) while preserving high accuracy. Moreover, efficiency and robustness are significantly improved when compared with other depth-adaptive approaches.
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
9 articles.
1. A flexible BERT model enabling width- and depth-dynamic inference;Computer Speech & Language;2024-08
2. From Static to Dynamic: A Deeper, Faster, and Adaptive Language Modeling Approach;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30
3. Long-term rolling prediction of transformer power load capacity based on the informer model;Journal of Physics: Conference Series;2024-06-01
4. Flexible BERT with Width- and Depth-dynamic Inference;2023 International Joint Conference on Neural Networks (IJCNN);2023-06-18
5. Boosting Bert Subnets with Neural Grafting;ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2023-06-04