Abstract
Handwritten text recognition, a sub-domain of pattern recognition, is widely studied by the computer vision community because of its room for improvement and its applicability to daily life. Owing to advances in computing power over the last few decades, neural network-based systems have contributed heavily to state-of-the-art handwritten text recognizers. In the same direction, we have taken two state-of-the-art neural network systems and integrated an attention mechanism into each of them. Attention has been widely used in neural machine translation and automatic speech recognition, and is now being applied to text recognition. In this study, after combining attention and a word beam search decoder with the existing Flor et al. architecture, we achieve a 4.15% character error rate and a 9.72% word error rate on the IAM dataset, and a 7.07% character error rate and a 16.14% word error rate on the GW dataset. For further analysis, we also used a system similar to the Shi et al. neural network system with a greedy decoder and observed a 23.27% improvement in character error rate over the base model.
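The character and word error rates quoted above are conventionally computed from the edit (Levenshtein) distance between the recognized text and the ground truth. The sketch below is only illustrative, assuming the common definition (total edits divided by total reference length, in percent); the abstract does not state the exact normalization or tokenization used, and the names `edit_distance` and `error_rate` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference item
                            curr[j - 1] + 1,      # insert a hypothesis item
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit="char"):
    """Error rate in percent over paired reference/hypothesis lines.

    unit="char" gives a character error rate; unit="word" gives a word
    error rate using whitespace tokenization (an assumption, not taken
    from the paper).
    """
    total_edits = 0
    total_len = 0
    for ref, hyp in zip(refs, hyps):
        r = list(ref) if unit == "char" else ref.split()
        h = list(hyp) if unit == "char" else hyp.split()
        total_edits += edit_distance(r, h)
        total_len += len(r)
    if total_len == 0:
        raise ValueError("reference text is empty")
    return 100.0 * total_edits / total_len


# Usage: one substituted character in an 11-character, 2-word line.
cer = error_rate(["hello world"], ["hallo world"], unit="char")
wer = error_rate(["hello world"], ["hallo world"], unit="word")
```

Because a single wrong character makes the whole word wrong, the word error rate is usually well above the character error rate, which matches the pattern in the figures reported for both datasets.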
Publisher
Warsaw University of Life Sciences - SGGW Press
Subject
Computer Graphics and Computer-Aided Design, Computer Vision and Pattern Recognition, Software