Analyzing Freeway Safety Influencing Factors Using the CatBoost Model and Interpretable Machine-Learning Framework, SHAP

Authors:

Li Jiaqi [1], Wang Xuesong [1], Yang Xiaohan [2], Zhang Qi [3], Pan Hanzhong [4]

Affiliations:

1. School of Transportation Engineering, The Key Laboratory of Road and Traffic Engineering, Ministry of Education, Tongji University, Shanghai, China

2. School of Mathematics Science, Tongji University, Shanghai, China

3. School of Transportation Engineering, Tongji University, Shanghai, China

4. Traffic Management Research Institute of the Ministry of Public Security, Wuxi City, Jiangsu Province, China

Abstract

Exploring and analyzing the factors that influence safety can guide targeted traffic safety management. Traditional traffic safety models address specific data problems by adjusting the model structure; they pay little attention to predictive ability and offer limited information for analyzing influencing factors. In recent years, machine-learning methods have opened new avenues in modeling: they achieve higher prediction accuracy, can identify complex nonlinear relationships, and can overcome over- and under-dispersion and correlation. Machine-learning methods, however, suffer from limited interpretability. The interpretable machine-learning framework SHAP can be an effective solution, because it not only reflects the influence of features in each sample but also generates a global interpretation. This study established gradient boosting models, including the CatBoost and XGBoost models, as traffic safety models and compared them with a traditional negative binomial (NB) regression model and a zero-inflated negative binomial regression model. SHAP was used to analyze several safety influencing factors, including geometric design features, traffic operation characteristics, time of day, and land use. Results confirmed that the CatBoost model has better prediction ability and is a more suitable traffic safety model than the traditional negative binomial regression model. Among the key findings were that ramp type is the most important factor in freeway crash frequency; curve presence has a strong positive impact, while truck proportion has a strong negative impact; and traffic volume is highly correlated with truck proportion. These findings can provide theoretical support for safe operation management and targeted improvement measures for freeways.
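The two levels of interpretation mentioned in the abstract (per-sample feature influence and a global summary) can be illustrated with a minimal sketch. It computes exact Shapley values by brute force for a toy model. This is not the authors' pipeline and not the SHAP library: the function names, the toy model, its coefficients, and the zero baseline are all hypothetical, and replacing "absent" features with a fixed baseline is a simplification of how SHAP handles missing features.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    """Hypothetical crash-frequency score; every coefficient is invented."""
    curve, truck_share, volume = x
    return 2.0 * curve - 3.0 * truck_share + 0.5 * volume + 1.0 * curve * volume


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature.

    Features outside a coalition are set to their baseline value
    (an assumption made here for simplicity).
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def global_importance(model, samples, baseline):
    """Global view: mean absolute Shapley value per feature over all samples."""
    per_sample = [shapley_values(model, x, baseline) for x in samples]
    n = len(baseline)
    return [sum(abs(p[i]) for p in per_sample) / len(per_sample) for i in range(n)]


if __name__ == "__main__":
    baseline = [0.0, 0.0, 0.0]
    samples = [[1, 0.2, 2.0], [0, 0.4, 1.0], [1, 0.1, 0.5]]
    for x in samples:
        print(x, shapley_values(toy_model, x, baseline))  # local explanation
    print(global_importance(toy_model, samples, baseline))  # global ranking
```

For each sample the attributions sum to the difference between the model output and the baseline output, which is the property that makes the per-sample values comparable. Brute-force enumeration grows exponentially with the number of features, so it is only practical for small illustrations like this one.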

Publisher

SAGE Publications

Subject

Mechanical Engineering, Civil and Structural Engineering

