Procedure Parsing: A Method for Parsing Handwritten Documents into Computer-Based Procedures

Author:

Whitmore Stacey

Abstract

The nuclear industry is heavily procedure-driven: almost every task has step-by-step instructions that are expected to be followed in detail. Historically, these procedures were printed on paper. Recently, the industry has transitioned toward electronic copies (i.e., PDFs on tablets). One major driver of this transition is the human error and loss of situation awareness introduced when using paper copies. However, electronic copies inherently have the same error traps as their paper cousins. There is therefore increased interest in using the information in the step-by-step guidance but presenting it dynamically, in a way that guides the user and adapts to any encountered conditions. Researchers at Idaho National Laboratory propose a flexible, automated method, based on document parsing and augmented by natural language processing (NLP) techniques, to address these shortcomings and capitalize on recent advancements in machine learning. The proposed method provides a cost-effective solution for computer-assisted parsing of handwritten control room procedures, originally authored in Word or PDF formats, into instructions that can be displayed as computer-based procedures (CBPs) in a modern graphical user interface. The researchers devised, implemented, and demonstrated the Operating Procedure Extender for Novel Systems (OPENS) method in 2020. The key to OPENS is mapping the original procedure text onto a context-free grammar that ties content to equipment, locations, other steps, actions, and so on. This formal grammar is then used to isolate and define keywords and action verbs, such as “measure” or “evaluate”, and tie them to specific equipment referenced within that step or located in other steps, substeps, actions, subactions, and tables throughout the procedure.
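The idea of tying action verbs and equipment to numbered steps and substeps can be illustrated with a toy parser. This is only a sketch under stated assumptions, not the OPENS grammar, which the abstract does not spell out: the step-numbering format, the equipment tag pattern (`P-101`), the `Step` class, and the function names are all hypothetical, and the approach is far simpler than a full context-free grammar. The only vocabulary taken from the text is the two example verbs, "measure" and "evaluate".

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# The abstract names only these two verbs; a real vocabulary would be larger.
ACTION_VERBS = {"measure", "evaluate"}

# Hypothetical equipment tag format such as "P-101" (an assumption, not from the source).
EQUIPMENT_TAG = re.compile(r"\b[A-Z]{1,3}-\d+\b")

# Hypothetical step format: a dotted step number followed by the instruction text.
STEP_LINE = re.compile(r"^(\d+(?:\.\d+)*)\s+(.*)$")


@dataclass
class Step:
    number: str
    verb: Optional[str]
    equipment: List[str]
    text: str
    substeps: List["Step"] = field(default_factory=list)


def parse_step(line: str) -> Step:
    """Split one numbered line into step number, leading action verb, and equipment tags."""
    match = STEP_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a numbered step: {line!r}")
    number, text = match.groups()
    words = text.split()
    first = words[0].lower() if words else ""
    verb = first if first in ACTION_VERBS else None
    return Step(number, verb, EQUIPMENT_TAG.findall(text), text)


def parse_procedure(lines: List[str]) -> List[Step]:
    """Build a tree of steps and substeps from the depth of each step number."""
    roots: List[Step] = []
    stack: List[Step] = []  # currently open steps, outermost first
    for line in lines:
        if not line.strip():
            continue
        step = parse_step(line)
        depth = step.number.count(".") + 1
        del stack[depth - 1:]  # close any steps at the same or a deeper level
        (stack[-1].substeps if stack else roots).append(step)
        stack.append(step)
    return roots
```

Given lines such as `"1 Measure level in tank T-12"` followed by `"1.1 Evaluate pump P-101 discharge pressure"`, `parse_procedure` returns a tree in which step 1.1 is a substep of step 1, each carrying its verb and equipment tags.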
OPENS generates an abstract syntax tree from the document, which it uses to store a copy of this information in XML and JSON, two open-standard file formats that are both machine-readable and human-readable. The XML preserves the relational aspects of the procedure, such as table references and branching information, so that the user can be directed to the next appropriate active step based on the values entered for that step and previous steps. The JSON is used for storing and exchanging the data objects that track responses to previous steps and state changes in simulated environments. In future iterations, these formats could also store more detailed information about input during plant operation or simulation. The techniques developed could be further improved by integrating recent advancements in machine learning. NLP methods could standardize documents, correct grammatical errors, and provide automated semantic validation. Self-supervised techniques applied to collections of natural language instructions are expected to strengthen the model with broader context. Together, these methods offer a practical way to automatically extract protocols from documents and user interactions, empowering researchers, procedure writers, and nuclear operators while moving the industry forward.
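The division of labor between the two formats can be sketched as follows: XML holds the procedure structure and its branching rules, while JSON holds the responses entered so far. This is a minimal illustration using only the Python standard library. The element names, the `threshold`/`goto` branching rule, and the example procedure are assumptions made for the sketch, since the abstract does not describe the actual OPENS schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical procedure tree; all field names and values are illustrative.
procedure = {
    "id": "demo-procedure",
    "steps": [
        {"number": "1", "text": "Measure level in tank T-12",
         "branch": {"threshold": 80, "goto": "3"}},
        {"number": "2", "text": "Evaluate pump P-101 discharge pressure"},
        {"number": "3", "text": "Notify the shift supervisor"},
    ],
}


def to_xml(proc: dict) -> str:
    """Serialize the procedure structure, including branching rules, to XML."""
    root = ET.Element("procedure", id=proc["id"])
    for step in proc["steps"]:
        element = ET.SubElement(root, "step", number=step["number"])
        ET.SubElement(element, "text").text = step["text"]
        branch = step.get("branch")
        if branch:
            ET.SubElement(element, "branch",
                          threshold=str(branch["threshold"]), goto=branch["goto"])
    return ET.tostring(root, encoding="unicode")


def next_step(proc: dict, number: str, value: float):
    """Return the next active step number given the value entered for `number`."""
    numbers = [step["number"] for step in proc["steps"]]
    index = numbers.index(number)
    branch = proc["steps"][index].get("branch")
    if branch and value > branch["threshold"]:
        return branch["goto"]  # assumed rule: jump when the value exceeds the threshold
    return numbers[index + 1] if index + 1 < len(numbers) else None


def responses_to_json(responses: dict) -> str:
    """Serialize the operator's recorded responses as a JSON data object."""
    return json.dumps(responses, sort_keys=True)
```

With this example data, entering a value above the threshold at step 1 directs the user to step 3, and any other value continues to step 2. The recorded responses can be exchanged separately as JSON without touching the XML structure.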

Publisher

AHFE International

Cited by 1 article.

1. Parsing of Research Documents into XML Using Formal Grammars; Applied Computational Intelligence and Soft Computing; 2024-01
