Affiliation:
1. Vellore Institute of Technology, Chennai, India
Abstract
Long Short-Term Memory (LSTM) is a specific kind of recurrent neural network (RNN) structure that addresses the constraints of conventional RNNs in effectively capturing and learning long-term relationships in sequential input. In this chapter, the authors examine the LSTM cell and its modifications to investigate the LSTM cell's capability for learning. Furthermore, future study prospects for LSTM networks are outlined. LSTM networks have gotten extensive attention in scientific papers, technical websites, and deployment manuals because of their efficacy in a variety of practical situations. Gradient-based learning techniques used in RNNs are too slow because as the error is transmitted back, it disappears, resulting in a much more extended learning period. LSTMs handle the issue with a novel additive gradient design that incorporates direct access towards the forget gate's activations, allowing the network to promote desirable behavior from the error gradient by updating the gates often at each time step of learning.