Author:
Zhang Haoruo, Cheng Zhongquan
Abstract
Optical Character Recognition (OCR) technology quickly converts text and numeric information in images into text data and is widely used in real-world applications. However, in complex scenes, such as those with varying illumination, viewing angle, or occlusion, existing OCR systems still cannot reach the required accuracy. This paper proposes an advanced pyramid network structure that uses multiple parallel pyramid networks operating at different scales to handle variation in character scale and misalignment. Each pyramid network has at least four convolutional layers. In addition, each pyramid network connection discards a proportion of its parameters to further increase computation speed. The advantages of this structure are that it is robust to characters of widely varying size and that it increases the number of valid parameters obtained through training. Experiments on open-source datasets show that the method achieves better recognition accuracy and speed.
Subject
General Physics and Astronomy
Cited by
4 articles.