Influence of pre-processing methods on the automatic priority prediction of native-language end-users’ maintenance requests through machine learning methods
Published: 2024-03-15
Volume: 29
ISSN: 1874-4753
Container-title: Journal of Information Technology in Construction
Language: en
Short-container-title: ITcon
Authors: D’Orazio Marco, Bernardini Gabriele, Di Giuseppe Elisa
Abstract
Occupants’ feedback and requests are valuable data sources for improving building management and maintenance. Indeed, most predictable faults can be identified directly by occupants and reported to facility managers through messages written in the end-users’ native language. Natural language processing methods can therefore support the identification and attribution of requests, provided they are robust enough to extract useful information from these unstructured textual sources. Machine learning (ML) can support the assessment and management of these data, especially when many communications arrive simultaneously. In this field, pre-processing and ML methods have been widely applied to databases written in English, while efforts in other native languages remain limited, which constrains real-world applicability. Moreover, few studies have compared, across languages, the performance of different combinations of pre-processing methods, ML algorithms and priority class definitions. To fill this gap, this work explores the performance of automatic priority assignment for end-users’ maintenance requests as a function of the combined influence of: (a) different natural language pre-processing methods, (b) several supervised ML algorithms, (c) two priority classification rules (2-class versus 4-class), and (d) the database language (i.e. the original database written in Italian, the end-users’ native language, and a version translated into English as a standard reference). Analyses are performed on a database of about 12,000 maintenance requests written in Italian concerning a stock of 23 buildings open to the public. A random sample of the sentences is labelled by 20 expert annotators, who attribute a priority score following the best-worst method. Labelled sentences are then pre-processed using four different approaches that progressively reduce the number of unique words (potential predictors).
Five established ML methods are applied, and comparisons involve accuracy, precision, recall and F1-score for each combination of pre-processing approach, ML method and number of priority classes. Results show that, within each ML algorithm, the choice of pre-processing method has limited impact on the final accuracy and average F1-score. In both the Italian and English conditions, the best performance is obtained with the NN, LR and SVM methods and with the 2-class priority classification scale, while NB generally fails. These results confirm that ML methods can effectively support facility managers in preliminary priority assessments in building maintenance processes, even when the request database is written in the end-users’ native language.
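
The idea of pre-processing steps that progressively reduce the number of unique words can be illustrated with a minimal Python sketch. The abstract does not name the four approaches used in the study, so the four levels below (whitespace split, lowercasing with punctuation removal, stop-word removal, naive suffix stripping) are assumptions chosen for illustration only. The sample requests, the stop-word list and all function names are likewise hypothetical, and the suffix stripper is a crude stand-in for a real stemmer.

```python
import re

# Hypothetical toy requests; the study's database is not reproduced here.
REQUESTS = [
    "The heating is not working in room 12!",
    "Heating not working, room 14.",
    "The lights are flickering in the corridor",
    "Light flickers in corridors and rooms",
]

# Assumed, illustrative stop-word list (not the one used in the study).
STOP_WORDS = {"the", "is", "are", "in", "and"}


def tokenize_raw(text):
    """Level 0: split on whitespace only, keeping case and punctuation."""
    return text.split()


def normalize(text):
    """Level 1: lowercase and keep runs of ASCII letters only."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(text):
    """Level 2: level 1 plus removal of the assumed stop words."""
    return [tok for tok in normalize(text) if tok not in STOP_WORDS]


def crude_stem(text):
    """Level 3: level 2 plus a naive suffix stripper (not a real stemmer)."""
    stems = []
    for tok in remove_stop_words(text):
        for suffix in ("ing", "s"):
            # Strip at most one suffix and keep at least three letters.
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stems.append(tok)
    return stems


PIPELINE = [tokenize_raw, normalize, remove_stop_words, crude_stem]


def vocabulary_sizes(texts):
    """Return the number of unique tokens produced by each level."""
    return [len({tok for text in texts for tok in level(text)})
            for level in PIPELINE]


if __name__ == "__main__":
    for level, size in zip(PIPELINE, vocabulary_sizes(REQUESTS)):
        print(f"{level.__name__}: {size} unique tokens")
```

Each unique token left after a given level would become one candidate predictor for a classifier, so comparing the sizes returned by `vocabulary_sizes` gives a rough view of how much each step shrinks the feature space. For Italian text, the stop-word list and the stemming rules would need to be language-specific.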
Publisher
International Council for Research and Innovation in Building and Construction