Affiliation:
1. School of Computer Science and Technology, Shandong University of Technology, Zibo 255049, China
Abstract
Process discovery aims to derive process models from event logs that describe actual business processes. Because the quality of an event log affects the quality of the discovered model, preprocessing methods can be used to improve log quality. Real business scenarios may contain chaotic activities, whose occurrence is independent of the other activities in the process and which can appear at any position in the event log with any frequency. Chaotic activities therefore seriously degrade the quality of discovered models, and filtering them out of the event log can effectively improve log quality and, in turn, model quality. Traditional chaotic activity filtering algorithms struggle to balance accuracy and time performance. This paper therefore proposes a direct method for filtering chaotic activities: it analyzes the relationships between activities and, using the characteristics of chaotic activities and the directly-follows relation between activities as the judgment condition, identifies chaotic activities and removes them from the event log. In addition, this paper proposes an indirect chaotic activity filtering method, which identifies and filters chaotic activities by analyzing how the presence of each activity influences the overall degree of chaos of the log. The proposed methods are compared with traditional chaotic activity filtering methods on several simulated and real data sets. Accuracy and running time are analyzed across multiple groups of event logs and the process models generated before and after chaotic activity filtering, further verifying the effectiveness and feasibility of the proposed methods.
The experimental results show that the accuracy of the proposed chaotic activity filtering methods is higher than that of the frequency-based filtering method and close to that of the entropy-based chaotic activity filtering methods. Moreover, compared with the other filtering methods used in the experiments, the proposed methods improve efficiency by 23.4% on average for simulated logs and by 84.25% on average for real event logs. In summary, the proposed methods are more accurate than frequency-based filtering, comparable in accuracy to entropy-based filtering, and faster than the other methods tested. They therefore balance accuracy and time performance and, to a certain extent, preserve the integrity of the filtered event log.
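To make the notion of a chaotic activity concrete, the following is a minimal Python sketch, not the algorithm proposed in the paper. It assumes an event log represented as a list of traces (each a list of activity names) and flags an activity as a chaotic candidate when the distribution of its direct successors is close to uniform, measured by normalized Shannon entropy. The function names, the successor-only entropy measure, and the 0.7 threshold are all illustrative assumptions.

```python
from collections import Counter, defaultdict
from math import log2


def directly_follows(log):
    """Count how often activity b directly follows activity a in the log."""
    dfg = defaultdict(Counter)
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            dfg[a][b] += 1
    return dfg


def successor_entropy(successors, n_activities):
    """Shannon entropy of a successor distribution, normalized to [0, 1]."""
    total = sum(successors.values())
    if total == 0 or n_activities < 2:
        return 0.0
    h = -sum((c / total) * log2(c / total) for c in successors.values())
    return h / log2(n_activities)


def chaotic_candidates(log, threshold=0.7):
    """Activities whose direct successors look nearly random.

    The threshold is a hypothetical value chosen for illustration only.
    """
    activities = {a for trace in log for a in trace}
    dfg = directly_follows(log)
    return {
        a for a in activities
        if successor_entropy(dfg[a], len(activities)) >= threshold
    }


def filter_log(log, to_remove):
    """Remove every event of the given activities, keeping trace order."""
    return [[a for a in trace if a not in to_remove] for trace in log]
```

On a toy log in which activity X is inserted at a different position in each trace of the sequence A, B, C, this sketch flags only X, and filtering restores the underlying A, B, C traces. A real log would need a more careful measure (for example, one that also considers predecessors) and a threshold tuned to the data.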
Funder
Shandong Provincial Undergraduate Teaching Reform Project
National College Students’ Innovation and Entrepreneurship Training Program
Shandong Provincial Natural Science Foundation of P.R. China
Shandong University of Technology Postgraduate Teaching Reform Project