Affiliation:
1. Queensland University of Technology, Brisbane, Australia
Abstract
Through the application of process mining, business processes can be improved on the basis of process execution data captured in event logs. Naturally, the quality of this data determines the quality of the improvement recommendations. Improving data quality is non-trivial, and there is great potential to exploit unstructured text, e.g., from notes, reviews, and comments, for this purpose and to enrich event logs. To this end, this article introduces Text2EL+, a three-phase approach to enrich event logs using unstructured text. In its first phase, events and (case and event) attributes are derived from unstructured text linked to organisational processes. In its second phase, these events and attributes undergo a semantic and contextual validation before their incorporation in the event log. In its third and final phase, recognising the importance of human domain expertise, expert guidance is used to further improve data quality by removing redundant and irrelevant events. Expert input is used to train a Named Entity Recognition (NER) model with customised tags to detect event log elements. The approach applies natural language processing techniques, sentence embeddings, training pipelines and models, as well as contextual and expression validation. Various unstructured clinical notes associated with a healthcare case study were analysed, and completeness, concordance, and correctness of the derived event log elements were evaluated through experiments. The results show that the proposed method is feasible and applicable.
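The first two phases described in the abstract can be pictured with a small Python sketch. It is an illustration under stated assumptions, not the Text2EL+ implementation: the keyword table stands in for the trained NER model, and the time-window and duplicate checks stand in for the contextual validation. All names here (ACTIVITY_KEYWORDS, extract_events, validate_events) and the sample note are hypothetical.

```python
import re
from datetime import date, datetime

# Hypothetical keyword-to-activity table; a trained NER model with
# customised tags would take the place of this lookup.
ACTIVITY_KEYWORDS = {
    "intubated": "Intubation",
    "x-ray": "X-ray",
    "discharged": "Discharge",
}

TIME_PATTERN = re.compile(r"\b(\d{1,2}):(\d{2})\b")


def extract_events(case_id, note_date, note_text):
    """Phase 1 (sketch): derive candidate events from the sentences of a note."""
    events = []
    for sentence in re.split(r"(?<=[.!?])\s+", note_text):
        match = TIME_PATTERN.search(sentence)
        if match is None:
            continue  # no time expression, so no event is derived
        hour, minute = int(match.group(1)), int(match.group(2))
        if hour > 23 or minute > 59:
            continue  # not a valid clock time
        timestamp = datetime(note_date.year, note_date.month, note_date.day,
                             hour, minute)
        lowered = sentence.lower()
        for keyword, activity in ACTIVITY_KEYWORDS.items():
            if keyword in lowered:
                events.append({"case_id": case_id,
                               "activity": activity,
                               "timestamp": timestamp})
    return events


def validate_events(candidates, existing_log, case_start, case_end):
    """Phase 2 (sketch): keep events inside the case window that are not
    already recorded in the event log."""
    seen = {(e["case_id"], e["activity"], e["timestamp"]) for e in existing_log}
    kept = []
    for event in candidates:
        key = (event["case_id"], event["activity"], event["timestamp"])
        if case_start <= event["timestamp"] <= case_end and key not in seen:
            kept.append(event)
            seen.add(key)
    return kept


# Made-up note and log, for illustration only.
note = ("Patient intubated at 09:15. Chest x-ray ordered at 09:40. "
        "Family informed. Discharged at 23:30.")
log = [{"case_id": "c1", "activity": "X-ray",
        "timestamp": datetime(2021, 3, 1, 9, 40)}]

candidates = extract_events("c1", date(2021, 3, 1), note)
enriched = validate_events(candidates, log,
                           datetime(2021, 3, 1, 9, 0),
                           datetime(2021, 3, 1, 18, 0))
```

In this made-up example only the intubation event would be added to the log: the X-ray is already recorded, and the discharge time falls outside the assumed case window. The third phase, expert-guided removal of redundant and irrelevant events, is not modelled here.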
Publisher
Association for Computing Machinery (ACM)