Author:
Bulat Adrian, Kossaifi Jean, Tzimiropoulos Georgios, Pantic Maja
Abstract
The prominence of deep learning, large amounts of annotated data, and increasingly powerful hardware have made it possible to reach remarkable performance on supervised classification tasks, in many cases saturating the training sets. However, the resulting models are specialized to a single, very specific task and domain. Adapting the learned classification to new domains is a hard problem for at least three reasons: (1) the new domains and tasks might be drastically different; (2) there might be a very limited amount of annotated data in the new domain; and (3) fully training a new model for each new task is prohibitive in terms of computation and memory, due to the sheer number of parameters of deep CNNs. In this paper, we present a method to learn new domains and tasks incrementally, building on prior knowledge from already learned tasks and without catastrophic forgetting. We do so by jointly parametrizing weights across layers using a low-rank Tucker structure. The core is task-agnostic, while a set of task-specific factors is learnt on each new domain. We show that leveraging tensor structure enables better performance than simply using matrix operations. Joint tensor modelling also naturally leverages correlations across different layers. Compared with previous methods, which have focused on adapting each layer separately, our approach results in more compact representations for each new task/domain. We apply the proposed method to the 10 datasets of the Visual Decathlon Challenge and show that it offers, on average, about a 7.5× reduction in the number of parameters and competitive performance in terms of both classification accuracy and Decathlon score.
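The parametrization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, ranks, and the names `tucker_to_tensor` and `new_task_factors` are hypothetical, the values are random rather than trained, and the only point is to show one shared core combined with per-task factor matrices, plus the resulting parameter counts.

```python
import numpy as np

def tucker_to_tensor(core, factors):
    """Rebuild a full tensor from a Tucker core and one factor matrix per mode."""
    out = core
    for mode, factor in enumerate(factors):
        # Mode-n product: contract the factor's rank axis with axis `mode`,
        # then move the resulting full-size axis back to position `mode`.
        out = np.moveaxis(np.tensordot(factor, out, axes=(1, mode)), 0, mode)
    return out

rng = np.random.default_rng(0)

# Assumed sizes: 4 stacked 3x3 conv layers with 16 output and 16 input channels.
full_shape = (4, 16, 16, 3, 3)   # (layers, out_channels, in_channels, kh, kw)
ranks = (4, 8, 8, 3, 3)          # assumed Tucker ranks, one per mode

# Task-agnostic core, shared by every task.
core = rng.standard_normal(ranks)

def new_task_factors():
    """One factor matrix per mode; this is the task-specific part."""
    return [rng.standard_normal((d, r)) for d, r in zip(full_shape, ranks)]

factors_task_a = new_task_factors()
factors_task_b = new_task_factors()

# Each task gets its own full weight tensor from the same core.
weights_a = tucker_to_tensor(core, factors_task_a)
weights_b = tucker_to_tensor(core, factors_task_b)

full_params = int(np.prod(full_shape))               # dense weights per task
core_params = core.size                              # stored once, shared
factor_params = sum(f.size for f in factors_task_a)  # stored per task

print("weight tensor shape:", weights_a.shape)
print("dense params per task:", full_params)
print("shared core params:", core_params)
print("task-specific factor params:", factor_params)
```

In this toy setting, each additional task stores only the small factor matrices rather than a full set of weights, which is the source of the compactness the abstract refers to. The 7.5× figure reported there comes from the paper's own experiments, not from this example.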
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
7 articles.
1. Semi-asynchronous Federated Learning Optimized for NON-IID Data Communication based on Tensor Decomposition;2023 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom);2023-12-21
2. On the Effectiveness of LayerNorm Tuning for Continual Learning in Vision Transformers;2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW);2023-10-02
3. Effective Black Box Adversarial Attack with Handcrafted Kernels;Advances in Computational Intelligence;2023
4. ICFD: An Incremental Learning Method Based on Data Feature Distribution;2022 IEEE Smartworld, Ubiquitous Intelligence & Computing, Scalable Computing & Communications, Digital Twin, Privacy Computing, Metaverse, Autonomous & Trusted Vehicles (SmartWorld/UIC/ScalCom/DigitalTwin/PriComp/Meta);2022-12
5. Multi-Domain Incremental Learning for Semantic Segmentation;2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV);2022-01