On Stage-Wise Backpropagation for Improving Cheng’s Method for Fully Connected Cascade Networks

Published: 2024-07-11 Issue: 4 Volume: 56 Page:
ISSN: 1573-773X
Container-title: Neural Processing Letters
Language: en
Short-container-title: Neural Process Lett

Author:

Mizutani Eiji, Kubota Naoyuki, Truong Tam Chi

Abstract

In this journal, Cheng has proposed a backpropagation (BP) procedure called BPFCC for deep fully connected cascaded (FCC) neural network learning in comparison with a neuron-by-neuron (NBN) algorithm of Wilamowski and Yu. Both BPFCC and NBN are designed to implement the Levenberg-Marquardt method, which requires an efficient evaluation of the Gauss-Newton (approximate Hessian) matrix $\nabla \textbf{r}^\textsf{T} \nabla \textbf{r}$, the cross product of the Jacobian matrix $\nabla \textbf{r}$ of the residual vector $\textbf{r}$ in nonlinear least squares sense. Here, the dominant cost is to form $\nabla \textbf{r}^\textsf{T} \nabla \textbf{r}$ by rank updates on each data pattern. Notably, NBN is better than BPFCC for the multiple $q\,(>1)$-output FCC-learning when $q$ rows (per pattern) of the Jacobian matrix $\nabla \textbf{r}$ are evaluated; however, the dominant cost (for rank updates) is common to both BPFCC and NBN. The purpose of this paper is to present a new more efficient stage-wise BP procedure (for $q$-output FCC-learning) that reduces the dominant cost with no rows of $\nabla \textbf{r}$ explicitly evaluated, just as standard BP evaluates the gradient vector $\nabla \textbf{r}^\textsf{T} \textbf{r}$ with no explicit evaluation of any rows of the Jacobian matrix $\nabla \textbf{r}$.
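To make the "rank updates on each data pattern" concrete, the following is a minimal sketch of the baseline approach the abstract describes as the shared dominant cost: for each pattern, the q Jacobian rows are evaluated explicitly and each one contributes a rank-one update to the Gauss-Newton matrix. It is not the stage-wise BP procedure of the paper, which avoids evaluating those rows. The function name `accumulate_gauss_newton` and the toy numbers are assumptions made only for illustration.

```python
def accumulate_gauss_newton(rows_per_pattern, residuals_per_pattern, n_params):
    """Accumulate J^T J and J^T r pattern by pattern (hypothetical helper).

    rows_per_pattern[p] holds the q explicit Jacobian rows of pattern p,
    residuals_per_pattern[p] holds the matching q residuals.
    """
    gn = [[0.0] * n_params for _ in range(n_params)]  # Gauss-Newton matrix J^T J
    grad = [0.0] * n_params                           # gradient vector J^T r
    for rows, residuals in zip(rows_per_pattern, residuals_per_pattern):
        for row, r in zip(rows, residuals):
            for i in range(n_params):
                grad[i] += row[i] * r
                # Rank-one update: add the outer product row^T row.
                for j in range(n_params):
                    gn[i][j] += row[i] * row[j]
    return gn, grad


# Toy data (assumed): 2 patterns, q = 2 outputs, n = 3 parameters.
rows_per_pattern = [
    [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]],
    [[2.0, 1.0, 0.0], [1.0, 1.0, 1.0]],
]
residuals_per_pattern = [[1.0, -1.0], [0.5, 2.0]]

gn, grad = accumulate_gauss_newton(rows_per_pattern, residuals_per_pattern, 3)
print(gn)    # [[6.0, 3.0, 3.0], [3.0, 3.0, 2.0], [3.0, 2.0, 6.0]]
print(grad)  # [4.0, 1.5, 3.0]
```

With n parameters and q outputs, each pattern costs q rank-one updates of an n-by-n matrix in this sketch, which is the per-pattern work the abstract identifies as dominant.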

Funder

Ministry of Science and Technology, Taiwan

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s11063-024-11655-4.pdf

References (47 articles)

1. Beale MH, Hagan MT, Demuth HB (2014) “calcjejj.m,” a script file in the Matlab Neural Network Toolbox, The MathWorks, Inc., Version 8.2

2. Bryson AE (1961) A gradient method for optimizing multi-stage allocation processes. In: Proceedings of Harvard University symposium on digital computers and their applications, pp 125–135

3. Cheng Y (2017) Backpropagation for fully connected cascade networks. Neural Process Lett 46:293–311

4. Conn AR, Gould NIM, Toint PL (2000) Trust-region methods. SIAM

5. Demmel JW (1997) Applied numerical linear algebra. SIAM