Abstract
In handwriting recognition research, a public image dataset is necessary to evaluate algorithm correctness and runtime performance. Unfortunately, existing Thai script image datasets lack a variety of standard handwriting types. This paper presents a new offline Thai handwriting image dataset named Burapha-TH. The dataset contains 68 character classes, 10 digit classes, and 320 syllable classes. To construct the dataset, 1072 native Thai speakers wrote on collection datasheets, which were then digitized with a 300 dpi scanner. De-skewing, box-detection, and segmentation algorithms were applied to the raw scans to extract the images. The experiment evaluated several deep convolutional models on the proposed dataset. The results show that the VGG-13 model (with batch normalization) achieved accuracy rates of 95.00%, 98.29%, and 96.16% on the character, digit, and syllable classes, respectively. Unlike all other known Thai handwriting datasets, Burapha-TH retains existing noise, the white background, and all artifacts generated by scanning. This comprehensive, raw, and more realistic dataset will be useful for a variety of research purposes in the future.
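As a concrete illustration of the kind of experiment the abstract describes, the sketch below trains a VGG-13 model with batch normalization on the 68 character classes. This is a minimal, hypothetical setup, not the authors' published code: the use of PyTorch/torchvision, the directory layout burapha_th/characters/train/<class_name>/, the input size, and the hyperparameters are all assumptions.

```python
# Minimal sketch (assumed PyTorch/torchvision setup, not the authors' code):
# train a VGG-13 (batch norm) classifier on the 68 Thai character classes.
import torch
from torch import nn
from torchvision import datasets, models, transforms

# Scanned grayscale images are replicated to 3 channels to match the VGG input layout.
preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical directory layout: burapha_th/characters/train/<class_name>/*.png
train_set = datasets.ImageFolder("burapha_th/characters/train", transform=preprocess)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.vgg13_bn(weights=None, num_classes=68).to(device)  # 68 character classes
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One training pass over the data; the digit (10-class) and syllable (320-class)
# subsets would use the same loop with a different num_classes and data folder.
model.train()
for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```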
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
3 articles.