A Fine-Tuned Hybrid Stacked CNN to Improve Bengali Handwritten Digit Recognition

Author:

Amin Ruhul (1), Reza Md. Shamim (1), Okuyama Yuichi (2), Tomioka Yoichi (2), Shin Jungpil (2)

Affiliation:

1. Department of Statistics, Pabna University of Science and Technology, Pabna 6600, Bangladesh

2. School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu 965-8580, Fukushima, Japan

Abstract

Recognition of Bengali handwritten digits poses several distinct challenges, including variation in writing styles, differing digit shapes and sizes, varying levels of noise, and distortion in the images. Despite significant progress, there is still room to improve the recognition rate. By building datasets and developing models, researchers can advance the state of the art, which can have important implications for various domains. In this paper, we introduce a new, publicly available dataset of 5440 handwritten Bengali digit images collected at a Bangladeshi university. Both conventional machine learning (ML) and CNN models were used to evaluate the task. To begin, we examined ML models trained on three integrated image feature descriptors, namely Local Binary Pattern (LBP), Complete Local Binary Pattern (CLBP), and Histogram of Oriented Gradients (HOG), reduced with principal component analysis (PCA) to the components that explained 95% of the variation in these descriptors. Then, via a fine-tuning approach, we designed three customized CNN models and their stack to recognize Bengali handwritten digits. On the handcrafted image features, the XGBoost classifier achieved the best accuracy at 85.29%, an ROC AUC score of 98.67%, and precision, recall, and F1 scores ranging from 85.08% to 85.18%, indicating that there was still room for improvement. On our own data, the proposed customized CNN models and their stack model surpassed all other models, reaching a 99.66% training accuracy and a 97.57% testing accuracy. In addition, to test the robustness of our proposed CNN model, we used another dataset of Bengali handwritten digits obtained from the Kaggle repository. On that dataset, our stack CNN model obtained a training accuracy of 99.26% and a testing accuracy of 96.14%. Without any rigorous image preprocessing, and with fewer epochs and less computation time, our proposed CNN model performed the best and proved the most resilient across all of the datasets.
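
The handcrafted-feature stage described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the basic 3×3 LBP variant (the abstract does not state which variant or parameters were used), and the function names `lbp_code`, `lbp_histogram`, and `components_for_variance` are hypothetical. The last function shows only the selection rule behind "components explaining 95% of the variation", given eigenvalues of the feature covariance matrix that are assumed to have been computed elsewhere.

```python
from collections import Counter

# Offsets of the 8 neighbours, clockwise starting at the top-left pixel.
# The starting point and direction are an assumption; conventions vary.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the basic 8-bit LBP code of the interior pixel at (row, col)."""
    centre = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a normalised 256-bin histogram of LBP codes over interior pixels."""
    rows, cols = len(image), len(image[0])
    counts = Counter(
        lbp_code(image, r, c)
        for r in range(1, rows - 1)
        for c in range(1, cols - 1)
    )
    total = (rows - 2) * (cols - 2)
    return [counts.get(code, 0) / total for code in range(256)]


def components_for_variance(eigenvalues, threshold=0.95):
    """Return the smallest number of leading principal components whose
    cumulative share of the total variance reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(ordered)
```

In a pipeline of the kind the abstract outlines, a histogram like this would be concatenated with the CLBP and HOG descriptors before the PCA step, and the reduced features would then be passed to a classifier such as XGBoost.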

Funder

Competitive Research Fund of the University of Aizu, Japan

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

