Abstract
Predicting traffic accident duration is necessary for ensuring traffic safety. Several attempts have been made to achieve high prediction accuracy, but researchers have not considered traffic accident text data in much detail. The limited text in the first report of an incident describes the accident characteristics that are available initially. This paper combines text data fusion with ensemble learning algorithms to build a model that predicts an accident’s duration. First, a preprocessing scheme for accident duration text data is established. Next, the random forest (RF) algorithm is applied to select the text feature variables related to traffic incident duration. Last, a text feature vector is introduced into models such as decision tree, k-nearest neighbor, support vector regression, random forest, Gradient Boosting Decision Tree, and eXtreme Gradient Boosting. Our results show that the improved RF model achieves good prediction accuracy as measured by RMSE, MAPE, and R². From this, the textual factors important to determining accident duration are identified. Further, we found that a cumulative feature importance of 60% is sufficient for traffic accident duration prediction using text data. These results provide insights into minimizing accident-related traffic congestion and contribute to input optimization in text-based prediction.
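The cumulative-importance cutoff mentioned in the abstract can be illustrated with a minimal sketch. It assumes that per-feature importance scores are already available (for example, from a fitted random forest) and keeps the highest-ranked features until their normalized cumulative importance reaches the threshold. The function name, the token names, and the scores are hypothetical and are not taken from the paper.

```python
def select_by_cumulative_importance(importances, threshold=0.60):
    """Return feature names, most important first, until their
    normalized cumulative importance reaches `threshold`.

    `importances` maps feature name -> non-negative importance score.
    This is an illustrative helper, not the authors' implementation.
    """
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")

    # Rank features from most to least important (stable for ties).
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)

    selected, cumulative = [], 0.0
    for name, score in ranked:
        selected.append(name)
        cumulative += score / total  # normalize so the scores sum to 1
        if cumulative >= threshold:
            break
    return selected


# Hypothetical importance scores for text features (made-up values).
example_importances = {
    "rollover": 0.30,
    "truck": 0.25,
    "injury": 0.20,
    "lane": 0.15,
    "tow": 0.10,
}

selected = select_by_cumulative_importance(example_importances, threshold=0.60)
print(selected)
```

With these made-up scores, the first two features cover 55% of the total importance, so a third feature is needed to pass the 60% cutoff.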
Funder
National Natural Science Fund of China
the Opening Research Fund of the National Engineering Laboratory for Surface Transportation Weather Impact Prevention
Publisher
Springer Science and Business Media LLC