Affiliation:
1. Dronacharya College of Engineering, Gurgaon, Haryana, India
Abstract
As computers become more prevalent in daily life, so does the need for natural ways of interacting with them. The ultimate aim of human-computer interaction is to make that interaction natural, easy, and flexible. Printed and textual media such as prescriptions, invoices, and receipts occupy a large part of our day-to-day activities. Given their volume, managing them physically is inefficient, and paper records are always at risk of fading, damage, or misplacement, so a means of converting them to digital form is required. In this project, we have developed a robust, cross-platform web application that processes images using PyTesseract-based algorithms to extract their textual data efficiently, facilitating its storage and retrieval. The extracted text can be downloaded as a text file and can also be translated into a desired language. Because this is an active field of research, the paper also discusses various current implementations of the concept. Optical Character Recognition (OCR) finds applications in a variety of fields, including business process activities, number plate recognition, and KYC and banking processes.
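The extract-and-download step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `tesseract_ocr` and `image_to_text_file`, the output naming scheme, and the default language code `"eng"` are assumptions made for illustration. The sketch also assumes that the `pytesseract` and `Pillow` packages and the Tesseract binary are installed. The web interface and the translation step are omitted.

```python
from pathlib import Path
from typing import Callable


def tesseract_ocr(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on one image through pytesseract.

    The imports are local so the rest of the module can be used
    (for example with a stub OCR function) without Tesseract installed.
    """
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang)


def image_to_text_file(
    image_path: str,
    out_dir: str,
    ocr: Callable[[str], str] = tesseract_ocr,
) -> Path:
    """Extract text from an image and save it as a UTF-8 .txt file.

    Hypothetical helper: the output file takes the image's base name,
    e.g. receipt.png -> receipt.txt.
    """
    text = ocr(image_path).strip()
    out_path = Path(out_dir) / (Path(image_path).stem + ".txt")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(text + "\n", encoding="utf-8")
    return out_path
```

Passing the OCR step in as a callable keeps the file-handling logic independent of the engine, so a different recognizer or a preprocessing stage could be substituted without changing how the downloadable text file is produced.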