1. To ensure a fair comparison, we construct ViT-S and ViT-C for the mini-ImageNet and CIFAR100 datasets, as their parameter counts are similar to those of ResNet12 and ResNet18, respectively. For the CUB200 dataset, we use ViT-T and ViT-S, which are pretrained on the ImageNet dataset³. The hidden sizes of ViT-S, ViT-C, and ViT-T are 384, 276, and 192, respectively. All of them have 12 layers and 12 heads.
2. X. Tao et al. Few-shot class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.
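As a rough illustration of the backbone sizes given in note 1, the following sketch estimates the encoder parameter count for each configuration from the stated hidden size, depth, and head count. The MLP expansion ratio of 4 and the standard pre-norm block layout are assumptions not stated in the text, and the patch embedding, position embeddings, class token, and classifier head are left out. The names `ViTConfig` and `encoder_params` are hypothetical, so the figures are only indicative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTConfig:
    """Hypothetical container for the configurations listed in note 1."""
    name: str
    hidden_size: int
    num_layers: int = 12   # stated in the text
    num_heads: int = 12    # stated in the text
    mlp_ratio: int = 4     # ASSUMPTION: standard ViT expansion ratio

    @property
    def head_dim(self) -> int:
        # Multi-head attention needs the hidden size to split evenly across heads.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads


def encoder_params(cfg: ViTConfig) -> int:
    """Estimate weights and biases in the transformer encoder blocks only."""
    d = cfg.hidden_size
    h = cfg.mlp_ratio * d
    attn = 4 * (d * d + d)             # Q, K, V and output projections
    mlp = (d * h + h) + (h * d + d)    # two linear layers with biases
    norms = 2 * (2 * d)                # two LayerNorms (scale and shift)
    return cfg.num_layers * (attn + mlp + norms)


CONFIGS = [
    ViTConfig("ViT-S", 384),
    ViTConfig("ViT-C", 276),
    ViTConfig("ViT-T", 192),
]

for cfg in CONFIGS:
    millions = encoder_params(cfg) / 1e6
    print(f"{cfg.name}: head_dim={cfg.head_dim}, encoder params ~ {millions:.1f}M")
```

Under these assumptions each block contributes 12d² + 13d parameters for hidden size d, so the estimate scales roughly quadratically with the hidden size. All three hidden sizes divide evenly by the 12 heads, giving per-head dimensions of 32, 23, and 16.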