Affiliation:
1. Smart Transport Key Laboratory of Hunan Province, School of Transport and Transportation Engineering, Central South University, Changsha 410075, China
2. Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong, China
3. Department of Geography and Resource Management, The Chinese University of Hong Kong, Shatin, NT, Hong Kong
Abstract
Understanding the characteristics of the time and distance gaps between primary and secondary crashes is crucial for preventing secondary crashes and improving road safety. Although previous studies have analyzed how these gaps vary, there is limited evidence quantifying the relationships between the gaps and the factors that influence them. This study proposed a two-layer Stacking framework to analyze the time and distance gaps. Specifically, the framework took Random Forests, Gradient Boosting Decision Tree, and eXtreme Gradient Boosting as the base classifiers in the first layer and applied Logistic Regression as the combiner in the second layer. On this basis, the Local Interpretable Model-agnostic Explanations (LIME) technique was used to interpret the output of the Stacking model from both local and global perspectives. Through secondary crash identification and feature selection, 346 secondary crashes and 22 crash-related factors were collected from California interstate freeways. The results showed that the Stacking model outperformed the base models in terms of accuracy, precision, and recall. The LIME-based explanations suggest that collision type, distance, speed, and volume are the critical features affecting the time and distance gaps. Higher volume can lengthen the queue and increase the distance gap between the secondary and primary crashes. Collision type, peak periods, workdays, truck involvement, and tow-away are also likely to induce a long distance gap. Conversely, the distance gap is shorter when secondary roads run in the same direction as, and are close to, the primary roads. Lower speed is a significant factor resulting in a long time gap, while higher speed is correlated with a short time gap. These results are expected to provide insights into how contributory features affect the time and distance gaps and to help decision-makers make well-informed decisions to prevent secondary crashes.
Publisher
Oxford University Press (OUP)
Subject
Engineering (miscellaneous); Safety, Risk, Reliability and Quality; Control and Systems Engineering