Abstract
Inaccurate localization caused by scale variation during character detection is a widespread issue that leads to overconfident results in the document analysis community, particularly for historical and handwritten documents. In this work, we explored the performance of a state-of-the-art network with a simple pipeline that quickly and accurately detects handwritten Chinese characters in old documents. To localize characters of varying scale more precisely, without pre-processing or intermediate steps, we used a network with multi-scale feature maps. On each feature map, pre-selected boxes of different scales and aspect ratios were then applied. In the last step, the bounding boxes were pruned with non-maximum suppression to yield the final results. With a well-designed neural network architecture and a loss function that presents well-classified examples, our experiments on the Caoshu, Character, and Src-images datasets showed improved detection performance: a detection rate (DT) of 98.84%, false positives per character (FPPC) of 0.71, and an F-score of 97.64%. For comparison, SSD (single-shot detector) reached a DT of 61.12%, an FPPC of 6.12, and an F-score of 60.33%.
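The two box-handling steps named in the abstract, pre-selected boxes of several scales and aspect ratios on a feature map, followed by pruning with non-maximum suppression, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the SSD-style width and height formula, the normalized coordinates, and the IoU threshold of 0.5 are all assumptions chosen for illustration.

```python
import math


def default_boxes(fmap_size, scale, aspect_ratios):
    """Pre-selected boxes (cx, cy, w, h) in normalized [0, 1] coordinates.

    Assumption: a square feature map and the SSD-style convention
    w = scale * sqrt(ar), h = scale / sqrt(ar).
    """
    boxes = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Center of the current feature-map cell.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ar in aspect_ratios:
                boxes.append((cx, cy, scale * math.sqrt(ar), scale / math.sqrt(ar)))
    return boxes


def iou(a, b):
    """Intersection over union of two corner-format boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep a box only if it does not overlap a kept,
    higher-scoring box by more than iou_threshold (hypothetical default)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping detections and one separate detection.
detections = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
confidences = [0.9, 0.8, 0.7]
print(non_maximum_suppression(detections, confidences))  # prints [0, 2]
```

In this example the second detection overlaps the first with an IoU of about 0.82, so it is suppressed and only the first and third boxes remain. A smaller `scale` on a finer feature map and a larger one on a coarser map is the usual way such boxes cover characters of different sizes.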
Funder
National Research Foundation of Korea
Subject
Electrical and Electronic Engineering, Biochemistry, Instrumentation, Atomic and Molecular Physics, and Optics, Analytical Chemistry
Cited by
1 article.