Affiliation:
1. Concordia Institute for Information Systems Engineering Concordia University Montreal Canada
Abstract
AbstractSingle‐stage activity recognition methods have been gaining popularity within the construction domain. However, their low per‐frame accuracy necessitates additional post‐processing to link the per‐frame detections. Therefore, limiting their real‐time monitoring capabilities is an indispensable component of the emerging construction of digital twins. This study proposes knowledge DIstillation of temporal Gradient data for construction Entity activity Recognition (DIGER), built upon the you only watch once (YOWO) method and improving its activity recognition and localization performance. Activity recognition is improved by designing an auxiliary backbone to exploit the complementary information in the temporal gradient data (transferred into YOWO using knowledge distillation), while localization is improved primarily through integration of complete intersection over union loss. DIGER achieved a per‐frame activity recognition accuracy of 93.6% and localization mean average precision at 50% of 79.8% on a large custom dataset, outperforming state‐of‐the‐art methods without requiring additional computation during inference, making it highly effective for real‐time monitoring of construction site activities.
Subject
Computational Theory and Mathematics,Computer Graphics and Computer-Aided Design,Computer Science Applications,Civil and Structural Engineering,Building and Construction