CL-BPUWM: continuous learning with Bayesian parameter updating and weight memory

Published:2024-02-29 Issue:3 Volume:10 Page:3891-3906
ISSN:2199-4536
Container-title:Complex & Intelligent Systems
Language:en
Short-container-title:Complex Intell. Syst.

Author:

He Yao, Yang Jing, Li Shaobo, Hu Jianjun, Ren Yaping, Ji Qing

Abstract

Catastrophic forgetting is a common problem in neural networks: after training on new tasks, a network loses information learned from previous tasks. Regularization methods that preferentially retain the parameters important to previous tasks help to mitigate catastrophic forgetting, but existing regularization methods cause the gradient to be near zero because the loss is at a local minimum. To solve this problem, we propose a new continuous learning method with Bayesian parameter updating and weight memory (CL-BPUWM). First, a parameter updating method based on the Bayes criterion is proposed to allow the neural network to gradually acquire new knowledge. The diagonal of the Fisher information matrix is then introduced to significantly reduce computation and increase parameter updating efficiency. Second, we propose calculating the importance weight by observing how changes in each network parameter affect the model's prediction output. During model parameter updating, the Fisher information matrix and the sensitivity of the network are used as the quadratic penalty terms of the loss function. Finally, we apply dropout regularization to reduce overfitting during training and to improve generalization. CL-BPUWM performs very well in continuous learning for classification tasks on the CIFAR-100, CIFAR-10, and MNIST datasets. On CIFAR-100, it is 0.8%, 1.03%, and 0.75% higher than the best-performing regularization method (EWC) in three task partitions. On CIFAR-10, it is 2.25% higher than the regularization method (EWC) and 0.7% higher than the scaled method (GR). On MNIST, it is 0.66% higher than the regularization method (EWC). When CL-BPUWM was combined with the brain-inspired replay model on the CIFAR-100 and CIFAR-10 datasets, the classification accuracy was 2.35% and 5.38% higher, respectively, than that of the baseline method, BI-R + SI.
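
The abstract names two ingredients of the quadratic penalty, the diagonal of the Fisher information matrix and an output-sensitivity importance weight, without giving formulas. The sketch below illustrates that idea on a toy logistic-regression model. It is not the authors' implementation: the function names, the use of the empirical Fisher, the squared-output sensitivity measure, the additive combination of the two importances, and the strength `lam` are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def diagonal_fisher(w, X, y):
    """Empirical diagonal Fisher: mean squared per-sample gradient of log p(y | x, w)."""
    p = sigmoid(X @ w)
    per_sample_grad = (y - p)[:, None] * X  # d log p(y|x) / dw for logistic regression
    return np.mean(per_sample_grad ** 2, axis=0)

def output_sensitivity(w, X):
    """Label-free importance: mean |d(output^2)/dw|, i.e. how strongly each
    parameter moves the prediction (assumed sensitivity measure)."""
    p = sigmoid(X @ w)
    grad = (2.0 * p * p * (1.0 - p))[:, None] * X  # d(p^2)/dw via the chain rule
    return np.mean(np.abs(grad), axis=0)

def quadratic_penalty(w, w_old, fisher, omega, lam=1.0):
    """Assumed form: (lam / 2) * sum_i (F_i + Omega_i) * (w_i - w_old_i)^2."""
    return 0.5 * lam * np.sum((fisher + omega) * (w - w_old) ** 2)

def penalty_grad(w, w_old, fisher, omega, lam=1.0):
    """Gradient of the penalty; it pulls important parameters back toward w_old."""
    return lam * (fisher + omega) * (w - w_old)

# Toy "previous task": synthetic inputs and labels drawn from the old model.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
w_old = np.array([1.0, -2.0, 0.5])
y = (rng.random(64) < sigmoid(X @ w_old)).astype(float)

fisher = diagonal_fisher(w_old, X, y)
omega = output_sensitivity(w_old, X)

# Candidate parameters after a step on a new task (hypothetical shift).
w_new = w_old + 0.1
print(quadratic_penalty(w_old, w_old, fisher, omega))  # penalty at the anchor
print(quadratic_penalty(w_new, w_old, fisher, omega))  # penalty after the shift
```

When training on a new task, a term of this kind would be added to the task loss, so that parameters with large importance are held close to their previous values while unimportant ones remain free to change. The penalty is zero at `w_old` and grows quadratically with the distance from it.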

Funder

Project supported by the National Natural Science Foundation of China

Science and Technology Program of Guizhou Province

Developing Objects and Projects of Scientific and Technological Talents in Guiyang City

Joint Open Fund Project of Key Laboratories of the Ministry of Education

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s40747-024-01350-1.pdf

References: 57 articles.

1. Song X, Wu N, Song S et al (2023) Switching-like event-triggered state estimation for reaction-diffusion neural networks against DoS attacks. Neural Process Lett 10:1–22. https://doi.org/10.1007/s11063-023-11189-1

2. Peng Z, Song X, Song S et al (2023) Hysteresis quantified control for switched reaction–diffusion systems and its application. Complex Intell Syst. https://doi.org/10.1007/s40747-023-01135-y

3. Song X, Wu N, Song S et al (2023) Bipartite synchronization for cooperative-competitive neural networks with reaction–diffusion terms via dual event-triggered mechanism. Neurocomputing 550:126498. https://doi.org/10.1016/j.neucom.2023.126498

4. Gong X, Xia X, Zhu W et al (2021) Deformable Gabor feature networks for biomedical image classification. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp 4004–4012. https://doi.org/10.1109/wacv48630.2021.00405

5. Shih H, Cheng H, Fu J (2019) Image classification using synchronized rotation local ternary pattern. IEEE Sens J 20(3):1656–1663. https://doi.org/10.1109/JSEN.2019.2947994