A Fine-Tuned Hybrid Stacked CNN to Improve Bengali Handwritten Digit Recognition

Published:2023-08-04 Issue:15 Volume:12 Page:3337
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Amin Ruhul¹, Reza Md. Shamim¹, Okuyama Yuichi², Tomioka Yoichi², Shin Jungpil²

Affiliation:

1. Department of Statistics, Pabna University of Science and Technology, Pabna 6600, Bangladesh

2. School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu 965-8580, Fukushima, Japan

Abstract

Recognizing Bengali handwritten digits poses several distinct challenges, including variation in writing styles, differences in digit shape and size, varying noise levels, and image distortion. Despite significant progress, recognition rates can still be improved. Building datasets and developing models lets researchers advance the state of the art, which can have important implications for various domains. In this paper, we introduce a new, publicly available dataset of 5440 handwritten Bengali digit images acquired from a Bangladeshi university. We evaluated both conventional machine learning (ML) and CNN models on the task. First, we examined the results of the ML models after combining three image feature descriptors, namely Local Binary Pattern (LBP), Complete Local Binary Pattern (CLBP), and Histogram of Oriented Gradients (HOG), and reducing them with principal component analysis (PCA) so that the retained components explained 95% of the variation in these descriptors. Then, using a fine-tuning approach, we designed three customized CNN models and a stack of them to recognize Bengali handwritten digits. On the handcrafted image features, the XGBoost classifier achieved the best accuracy at 85.29%, with an ROC AUC score of 98.67% and precision, recall, and F1 scores ranging from 85.08% to 85.18%, indicating that there was still room for improvement. On our own data, the proposed customized CNN models and their stack surpassed all other models, reaching a training accuracy of 99.66% and a testing accuracy of 97.57%. To further test the robustness of the proposed CNN model, we used another dataset of Bengali handwritten digits obtained from the Kaggle repository, on which the stacked CNN model obtained a training accuracy of 99.26% and a testing accuracy of 96.14%. Without rigorous image preprocessing, and with fewer epochs and less computation time, our proposed CNN model performed best and proved the most resilient across all of the datasets, which supports its position at the forefront of the field.
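The feature-reduction step mentioned in the abstract (PCA retaining 95% of the variation in the combined descriptors) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function name `pca_keep_variance`, the synthetic arrays standing in for LBP, CLBP, and HOG vectors, and their dimensions are all assumptions made for illustration. It only shows how the smallest number of components reaching a variance threshold might be chosen.

```python
import numpy as np


def pca_keep_variance(features, variance_to_keep=0.95):
    """Project rows of `features` onto the fewest principal components
    whose cumulative explained-variance ratio reaches `variance_to_keep`.
    (Hypothetical helper; assumes the data is not constant.)"""
    X = np.asarray(features, dtype=float)
    centered = X - X.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal axes, and the
    # squared singular values are proportional to each component's variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    ratio = singular_values ** 2 / np.sum(singular_values ** 2)
    cumulative = np.cumsum(ratio)
    # First index where the cumulative ratio reaches the threshold, plus one.
    k = int(np.searchsorted(cumulative, variance_to_keep)) + 1
    k = min(k, len(ratio))
    components = vt[:k]
    return centered @ components.T, components, ratio[:k]


# Synthetic stand-ins for per-image descriptor vectors (assumed sizes):
# 200 "images" whose descriptors are driven by 5 shared latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))


def fake_descriptor(n_dims):
    # Linear mix of the latent factors plus a little noise.
    mix = rng.normal(size=(5, n_dims))
    return latent @ mix + 0.01 * rng.normal(size=(200, n_dims))


lbp_like = fake_descriptor(59)
clbp_like = fake_descriptor(59)
hog_like = fake_descriptor(81)

# Concatenate the three descriptors per image, then reduce.
combined = np.hstack([lbp_like, clbp_like, hog_like])
reduced, components, ratio = pca_keep_variance(combined, 0.95)
print(combined.shape, "->", reduced.shape)
```

The reduced matrix would then be passed to a classifier such as XGBoost. In practice a library implementation is usually preferable; scikit-learn's `PCA`, for example, accepts a fractional `n_components` to select components by explained variance.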

Funder

Competitive Research Fund of the University of Aizu, Japan

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/12/15/3337/pdf

Cited by 1 article.

1. Overview of Pest Detection and Recognition Algorithms;Electronics;2024-07-30