Knowledge Distillation with Attention for Deep Transfer Learning of Convolutional Networks

Published:2022-06-30 Issue:3 Volume:16 Page:1-20
ISSN:1556-4681
Container-title:ACM Transactions on Knowledge Discovery from Data
language:en
Short-container-title:ACM Trans. Knowl. Discov. Data

Author:

Li Xingjian¹, Xiong Haoyi², Chen Zeyu², Huan Jun³, Liu Ji², Xu Cheng-Zhong⁴, Dou Dejing²

Affiliation:

1. Baidu, Inc., China and University of Macau, Macau, China

2. Baidu, Inc., Beijing, China

3. StylingAI Inc., Beijing, China

4. University of Macau, Macau, China

Abstract

Transfer learning through fine-tuning a pre-trained neural network with an extremely large dataset, such as ImageNet, can significantly improve and accelerate training while the accuracy is frequently bottlenecked by the limited dataset size of the new target task. To solve the problem, some regularization methods, constraining the outer layer weights of the target network using the starting point as references (SPAR), have been studied. In this article, we propose a novel regularized transfer learning framework DELTA, namely DEep Learning Transfer using Feature Map with Attention. Instead of constraining the weights of neural network, DELTA aims at preserving the outer layer outputs of the source network. Specifically, in addition to minimizing the empirical loss, DELTA aligns the outer layer outputs of two networks, through constraining a subset of feature maps that are precisely selected by attention that has been learned in a supervised learning manner. We evaluate DELTA with the state-of-the-art algorithms, including L² and L²-SP. The experiment results show that our method outperforms these baselines with higher accuracy for new tasks. Code has been made publicly available.
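
The three regularizers contrasted in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the function names, the flat-list data layout, and the choice to pass attention weights in as given inputs are all assumptions made for illustration. The abstract states only that the attention is learned in a supervised manner, so how the weights are obtained is left out here.

```python
def l2_penalty(weights):
    """Plain L2 regularization: pulls the weights toward zero."""
    return sum(w * w for w in weights)


def l2_sp_penalty(weights, start_weights):
    """L2-SP: pulls the weights toward the pre-trained starting point."""
    return sum((w - w0) ** 2 for w, w0 in zip(weights, start_weights))


def feature_map_penalty(target_maps, source_maps, attention):
    """Attention-weighted squared distance between feature maps.

    target_maps, source_maps: one flat list of activations per channel,
        produced by the target and source networks on the same input.
    attention: one non-negative weight per channel (assumed to be given;
        hypothetical input, not computed in this sketch).
    """
    total = 0.0
    for t_map, s_map, a in zip(target_maps, source_maps, attention):
        dist = sum((t - s) ** 2 for t, s in zip(t_map, s_map))
        total += a * dist
    return total


def regularized_loss(empirical_loss, penalty, strength):
    """Empirical loss plus a scaled regularization term."""
    return empirical_loss + strength * penalty


# Toy example: channel 0 is unchanged, channel 1 has drifted.
source_maps = [[1.0, 2.0], [0.0, 1.0]]
target_maps = [[1.0, 2.0], [2.0, 1.0]]
attention = [0.25, 0.75]

penalty = feature_map_penalty(target_maps, source_maps, attention)
print(penalty)  # 3.0
print(regularized_loss(0.5, penalty, 0.1))
```

The first two penalties act on the weights, while the third acts on the outputs: a channel with high attention is held close to the source network's behavior, and a channel with low attention is left freer to adapt to the target task.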

Funder

National Key Research and Development Program of China

Science and Technology Development Fund of Macau SAR

GuangDong Basic and Applied Basic Research Foundation

Key-Area Research and Development Program of Guangdong Province

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3473912

References: 58 articles.

1. CNN-Based Joint Clustering and Representation Learning with Feature Drift Compensation for Large-Scale Image Data

2. Rating Image Aesthetics Using Deep Learning

3. Fast Learning-Based Single Image Super-Resolution

4. Fully Convolutional Network for Multiscale Temporal Action Proposals

Cited by 12 articles.

1. A Double-layer Stacked Gate Recurrent Unit with Self-Attention Residual Model for Knowledge Tracing;Proceedings of the 2024 Guangdong-Hong Kong-Macao Greater Bay Area International Conference on Education Digitalization and Computer Science;2024-07-26

2. Multi-receptive Field Distillation Network for seismic velocity model building;Engineering Applications of Artificial Intelligence;2024-07

3. Kidney Tumor Classification on CT images using Self-supervised Learning;Computers in Biology and Medicine;2024-06

4. An Optimal Edge-weighted Graph Semantic Correlation Framework for Multi-view Feature Representation Learning;ACM Transactions on Multimedia Computing, Communications, and Applications;2024-04-25

5. Ensuring cross-device portability of electromagnetic side-channel analysis for digital forensics;Forensic Science International: Digital Investigation;2024-03