Abstract
Background
Machine vision faces significant challenges when applied to text recognition on cardboard packaging, particularly because of multiple printing methods, irregular character shapes, and curved packaging surfaces.
Methods
This research introduces a novel deep learning application for recognizing binarized expiration date and batch code characters printed using multiple printing methods. The method, based on Region-based Convolutional Neural Networks (R-CNN), recognizes characters directly from the images without the need to extract handcrafted features. Specifically, the approach takes the whole image as input and extracts and learns salient character features directly from images of the packaging surface.
Results
The R-CNN model, with a precision of 91.1% and an F1 score of 80.9%, effectively recognizes manufacturing markings on pharmaceutical packages, even when the characters’ shapes are inconsistent. In a comparative experiment using the same dataset of images, the R-CNN model significantly outperformed Tesseract OCR, achieving much higher precision, recall, and F1 scores.
Conclusions
The results show that the deep learning method outperforms the well-established optical character recognition method in recognizing text characters printed with different printing methods. The deep learning method presented in this study recognizes text characters with high precision. It is also suitable for recognizing text printed on curved surfaces, provided proper preprocessing is applied. The problem investigated differs from previous research in the field in that it focuses on recognizing text printed with different printing methods. The research thus fills a gap in the text recognition literature. Furthermore, the study presents new ideas that will be used in our future research.
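The abstract reports precision and F1 score for the R-CNN model but gives no recall figure. Because F1 is the harmonic mean of precision and recall, an approximate recall can be worked out from the two reported numbers. The sketch below is illustrative only: the function names are hypothetical, the result is an implied value of roughly 73% rather than a figure reported by the authors, and it assumes the published percentages are rounded and that F1 was computed from the same precision and recall (not averaged per class).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from_f1(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1)."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("F1 is too large to be consistent with this precision")
    return f1 * precision / denominator


# Values reported in the abstract (rounded by the authors).
reported_precision = 0.911
reported_f1 = 0.809

# Implied recall: an estimate derived here, not a reported result.
implied_recall = recall_from_f1(reported_precision, reported_f1)
print(f"implied recall = {implied_recall:.3f}")
```

A recall noticeably lower than precision would mean the model rarely mislabels a character it detects but misses a larger share of the characters present, which is worth keeping in mind when reading the "high precision" claim.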