EXPERIMENTAL COMPARISON OF THE EFFECT OF ORDER IN RECURRENT NEURAL NETWORKS

Published: 1993-08 Issue: 04 Volume: 07 Page: 849-872
ISSN: 0218-0014
Container-title: International Journal of Pattern Recognition and Artificial Intelligence
Language: en
Short-container-title: Int. J. Patt. Recogn. Artif. Intell.

Author:

MILLER CLIFFORD B.¹, GILES C. LEE²

Affiliation:

1. NEC Research Institute, 4 Independence Way, Princeton, NJ 08540, USA

2. Also: Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA

Abstract

There has been much interest in increasing the computational power of neural networks. In addition, there has been much interest in “designing” neural networks better suited to particular problems. Increasing the “order” of the connectivity of a neural network permits both. Though order has played a significant role in feedforward neural networks, its role in dynamically driven recurrent networks is still being understood. This work explores the effect of order in learning grammars. We present an experimental comparison of first-order and second-order recurrent neural networks, as applied to the task of grammatical inference. We show that for the small grammars studied these two neural net architectures have comparable learning and generalization power, and that both are reasonably capable of extracting the correct finite state automata for the language in question. However, for a larger randomly generated ten-state grammar, second-order networks significantly outperformed first-order networks, both in convergence time and generalization capability. We show that these networks learn faster the more neurons they have (our experiments used up to 10 hidden neurons), but that the solutions found by smaller networks are usually of better quality (in terms of generalization performance after training). Second-order nets have the advantage that they converge more quickly to a solution and find it more reliably than first-order nets, but their solutions tend to be of poorer quality than first-order solutions when both architectures are trained to the same error tolerance. Despite this, second-order nets can more successfully extract finite state machines using heuristic clustering techniques applied to the internal state representations.
We speculate that this may be due to restrictions on the ability of the first-order architecture to make full use of its internal state representation power, and that this may have implications for the performance of the two architectures when scaled up to larger problems.
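
The abstract does not give the update equations, so the following is only a minimal sketch of what "order" usually means in this setting, assuming the common formulation in which a first-order network sums weighted state and input terms, while a second-order network weights products of state and input terms. Biases are omitted, and all names (`first_order_step`, `second_order_step`, `random_weights`, `accept_score`) as well as the choice of neuron 0 as the accept output are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def first_order_step(W, V, state, symbol):
    # Assumed form: s_i(t+1) = g( sum_j W[i][j]*s_j(t) + sum_k V[i][k]*x_k(t) )
    return [
        sigmoid(sum(W[i][j] * state[j] for j in range(len(state)))
                + sum(V[i][k] * symbol[k] for k in range(len(symbol))))
        for i in range(len(state))
    ]


def second_order_step(W, state, symbol):
    # Assumed form: s_i(t+1) = g( sum_{j,k} W[i][j][k]*s_j(t)*x_k(t) )
    return [
        sigmoid(sum(W[i][j][k] * state[j] * symbol[k]
                    for j in range(len(state))
                    for k in range(len(symbol))))
        for i in range(len(state))
    ]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def random_weights(shape, rng):
    # Nested lists of uniform random weights with the given shape.
    if len(shape) == 1:
        return [rng.uniform(-1.0, 1.0) for _ in range(shape[0])]
    return [random_weights(shape[1:], rng) for _ in range(shape[0])]


def count_weights(w):
    return sum(count_weights(x) for x in w) if isinstance(w, list) else 1


def accept_score(step, string, n_hidden, n_symbols):
    # Feed a string of digit symbols through the network; neuron 0 is read
    # as the accept/reject output (an illustrative convention).
    state = one_hot(0, n_hidden)
    for ch in string:
        state = step(state, one_hot(int(ch), n_symbols))
    return state[0]


rng = random.Random(0)
n_hidden, n_symbols = 3, 2
W1 = random_weights((n_hidden, n_hidden), rng)
V1 = random_weights((n_hidden, n_symbols), rng)
W2 = random_weights((n_hidden, n_hidden, n_symbols), rng)

# Untrained networks: the scores are arbitrary, only the mechanics matter.
print(accept_score(lambda s, x: first_order_step(W1, V1, s, x), "0110", n_hidden, n_symbols))
print(accept_score(lambda s, x: second_order_step(W2, s, x), "0110", n_hidden, n_symbols))
print(count_weights(W1) + count_weights(V1), count_weights(W2))  # 15 and 18 weights
```

With one-hot input symbols, the second-order update in this sketch effectively selects a separate state-transition matrix per symbol, which mirrors the transition table of a finite state automaton. That is one plausible reading of the abstract's remark about internal state representation, not a claim taken from the paper.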

Publisher

World Scientific Pub Co Pte Lt

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0218001493000431

Cited by 36 articles.

1. A tensor framework for learning in structured domains;Neurocomputing;2022-01

2. LSTM Based Model For Apple Inc Stock Price Forecasting;2021 2nd International Conference on Computer Science and Management Technology (ICCSMT);2021-11

3. Overcoming the Vanishing Gradient Problem during Learning Recurrent Neural Nets (RNN);Asian Journal of Applied Science and Engineering;2020-12-31

4. LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition;Interspeech 2016;2016-09-08

5. The Kernel Adaptive Autoregressive-Moving-Average Algorithm;IEEE Transactions on Neural Networks and Learning Systems;2016-02