Xamã : Optical character recognition for multi-domain model management

Author:

Torres WeslleyORCID,van den Brand Mark G. J.,Serebrenik Alexander

Abstract

AbstractThe development of systems following model-driven engineering can include models from different domains. For example, to develop a mechatronic component one might need to combine expertise about mechanics, electronics, and software. Although these models belong to different domains, the changes in one model can affect other models causing inconsistencies in the entire system. Only few tools, however, support management of models from different domains. Indeed, these models are created using different modeling notations and it is not plausible to use a multitude of parsers geared toward each and every modeling notation. Therefore, to ensure maintenance of multi-domain systems, we need a uniform approach that would be independent from the peculiarities of the notation. Notation independence implies that such a uniform approach can only be based on elements commonly present in models of different domains, i.e., text, boxes, and lines. In this study, we investigate the suitability of optical character recognition (OCR) as a basis for such a uniformed approach. We select graphical models from various domains that typically combine textual and graphical elements. We start by analyzing the performance of Google Cloud Vision and Microsoft Cognitive Services, two off-the-shelf OCR services. Google Cloud Vision performed better than Microsoft Cognitive Services being able to detect text of 70% of model elements. Errors made by Google Cloud Vision are due to absence of support for text common in engineering formulas, e.g., Greek letters, equations, and subscripts. We identified the multi-line text error as one of the main issues of using OCR to recognize textual elements in models from different domains. This error happens when OCR misinterprets one textual element as two separate elements. To address the multi-line text error, we build Xamã on top of Google Cloud Vision. Xamã includes two approaches to identify whether the elements are positioned on a single line or multiple lines, and merge those identified as positioned on multiples lines. With and without shape detection, Xamã correctly identified 956 and 905 elements, respectively, out of 1171. Additionally, we compared the accuracy of Xamã and state-of-the-art tool img2UML, and we observe that Xamã outperformed img2UML in both precision and recall, being able to recognize 433 out of 614 textual elements as opposed to 171 by img2UML.

Funder

EU ECSEL

Publisher

Springer Science and Business Media LLC

Subject

Software

Cited by 1 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. NEURAL-UML: Intelligent Recognition System of Structural Elements in UML Class Diagram;2023 ACM/IEEE International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C);2023-10-01

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3