1. D. Arpit, B. Kanuparthi, G. Kerg, N.R. Ke, I. Mitliagkas, Y. Bengio, H-detach: Modifying the LSTM gradient towards better optimization, 2019, arXiv. Doi:10.48550/ARXIV.1810.03023.
2. J.L. Ba, J.R. Kiros, G.E. Hinton, Layer normalization, 2016. arXiv preprint arXiv:1607.06450.
3. Cervical cytology classification using PCA and GWO enhanced deep features selection;Basak;SN Comput. Sci.,2021
4. (a) S. Bharati, P. Podder, M.R.H. Mondal, Hybrid deep learning for detecting lung diseases from x-ray images. Informatics in Medicine Unlocked, 20 (2020) 100391. ISSN 2352-9148. doi: 10.1016/j.imu.2020.100391.
5. (b) T.B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, 2020. ArXiv., https://doi.org/10.48550/arXiv.2005.14165.