Machine learning-based automated classification of worker-reported safety reports in construction

Published:2022-11-14 Issue: Volume:27 Page:926-950
ISSN:1874-4753
Container-title:Journal of Information Technology in Construction
language:en
Short-container-title:ITcon

Author:

Bugalia Nikhil, Tarani Vurukuti, Kedia Jai, Gadekar Hrishikesh

Abstract

Limited academic attention has been paid to the applicability of Machine Learning (ML) approaches for analyzing worker-reported near-miss safety reports, as opposed to injury reports, at construction sites, even though resource-efficient ML analysis of large volumes of such data can help guide practitioners in decision-making to prevent injuries. The current study addresses this research gap by evaluating, through quantitative and qualitative methods, the relevance of ML approaches for scaling efficient near-miss reporting programs at construction sites. The study uses an extensive experimentation strategy consisting of input data processing, n-gram modeling, and sensitivity analysis. It first tests the proposition that, despite data-quality challenges, different ML algorithms can achieve high performance in automatically classifying textual near-miss observations. The study relies on worker-reported near-miss data collected from a real construction site in Kuwait. The classification performance of various ML approaches is evaluated using F1 scores for three category labels that are academically novel but commonly used at sites: "Unsafe Act (UA)," "Unsafe Condition (UC)," and "Good Observation (GO)." In addition, practitioner input was used to assess the practical applicability of ML classifiers for construction sites. Conventional Logistic Regression (LR) classifiers achieve a comparatively high F1 score of 0.79. However, ML classifiers faced challenges in distinguishing between UA and UC. Further, the analysis reveals that the optimal ML classifiers may lose out on acceptability to human decision-makers. Overall, despite the promising performance of ML tools on near-miss data, sites with low maturity of reporting systems may find themselves unable to leverage ML to scale their reporting systems. A simplified experimentation strategy like that of the current study could help practitioners identify data-specific optimal ML approaches in future applications.
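
To make the evaluation terms in the abstract concrete, the following is a minimal sketch of word n-gram extraction and per-label F1 scoring for the three labels UA, UC, and GO. It is not the authors' pipeline: the function names, the example report text, and the toy labels are hypothetical, and the macro averaging shown here is an assumption, since the abstract does not state how the F1 scores were averaged.

```python
# Illustrative sketch only; names and data are hypothetical, not from the study.
LABELS = ("UA", "UC", "GO")  # Unsafe Act, Unsafe Condition, Good Observation


def word_ngrams(text, n_max=2):
    """Return lowercase word n-grams (n = 1..n_max) for one report."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def f1_per_label(y_true, y_pred, labels=LABELS):
    """Compute F1 = 2*TP / (2*TP + FP + FN) for each label."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-label F1 scores (assumed averaging)."""
    per_label = f1_per_label(y_true, y_pred, labels)
    return sum(per_label.values()) / len(labels)


if __name__ == "__main__":
    # Toy labels in which UA and UC are confused with each other.
    y_true = ["UA", "UA", "UC", "UC", "GO", "GO"]
    y_pred = ["UA", "UC", "UC", "UA", "GO", "GO"]
    print(word_ngrams("Worker without helmet"))
    print(f1_per_label(y_true, y_pred))
    print(round(macro_f1(y_true, y_pred), 3))
```

In this toy data the two UA/UC confusions lower the F1 of both labels while GO stays perfect, which mirrors the kind of UA versus UC difficulty the abstract reports.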

Publisher

International Council for Research and Innovation in Building and Construction

Subject

Computer Science Applications, Building and Construction, Civil and Structural Engineering

References: 37 articles.

1. Auffray C. and Fu X. (2015). Chinese MNEs and managerial knowledge transfer in Africa: the case of the construction sector in Ghana. Journal of Chinese Economic and Business Studies. Vol. 13, No. 4, 285–310. https://doi.org/10.1080/14765284.2015.1092415

2. Baek S., Jung W. and Han S.H. (2021). A critical review of text based research in construction: Data source, analysis method, and implications. Automation in Construction. Vol. 132, 103915. https://doi.org/10.1016/j.autcon.2021.103915

3. Baker H., Hallowell M.R. and Tixier A.J.-P. (2020). Automatically learning construction injury precursors from text. Automation in Construction. Vol. 118, 103145. https://doi.org/10.1016/j.autcon.2020.103145

4. Bird S., Klein E. and Loper E. (2009). Natural language processing with Python: analyzing text with the natural language toolkit. O'Reilly Media, Inc. Sebastopol, CA, USA.

5. Bouckaert R.R. and Frank E. (2004). Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms. In: Dai H., Srikant R. and Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, Vol. 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_3

Cited by 10 articles.

1. Evaluating external generalizability of machine learning models for recycled aggregate concrete property prediction;Journal of Cleaner Production;2024-09

2. Characterization of health and safety hazards of deconstruction activities;American Journal of Industrial Medicine;2024-08-27

3. Least Square Moment Balanced Machine: A New Approach To Estimating Cost To Completion For Construction Projects;Journal of Information Technology in Construction;2024-07-26

4. When grey model meets deep learning: A new hazard classification model;Information Sciences;2024-06

5. Influence of pre-processing methods on the automatic priority prediction of native-language end-users’ maintenance requests through machine learning methods;Journal of Information Technology in Construction;2024-03-15