Analyzing the Effectiveness of Imbalanced Data Handling Techniques in Predicting Driver Phone Use

Author:

Taamneh Madhar M. (1), Taamneh Salah (2), Alomari Ahmad H. (1), Abuaddous Musab (1)

Affiliation:

1. Department of Civil Engineering, Yarmouk University, P.O. Box 566, Irbid 21163, Jordan

2. Department of Computer Science and Applications, Faculty of Prince Al-Hussien Bin Abdullah for IT, The Hashemite University, P.O. Box 330127, Zarqa 13133, Jordan

Abstract

Distracted driving leads to a significant number of road crashes worldwide. Smartphone use is one of the most common causes of cognitive distraction among drivers. Available data on drivers’ phone use presents an invaluable opportunity to identify the main factors behind this behavior. Machine learning (ML) techniques are among the most effective tools for this purpose. However, their potential and usefulness are limited by the imbalance in the available data: the majority class of collected instances consists of drivers who do not use their phones, while the minority class consists of those who do. This paper evaluates two main approaches for handling imbalanced datasets on driver phone use: oversampling and undersampling. The effectiveness of each approach was evaluated using six ML techniques: Multilayer Perceptron (MLP), Support Vector Machine (SVM), Naive Bayes (NB), Bayesian Network (BayesNet), J48, and ID3. Both approaches were also evaluated on three Deep Learning (DL) models: Arch1 (5 hidden layers), Arch2 (10 hidden layers), and Arch3 (15 hidden layers). The data used in this study were collected through direct observation and cover a set of human, vehicle, and road surface characteristics. The results showed that all ML methods, as well as the DL methods, achieved balanced accuracy values for both classes. ID3, J48, and MLP outperformed the other ML methods in all scenarios, with ID3 achieving slightly better accuracy. The DL methods also performed well, especially on the undersampled data. Overall, the classification methods performed best on the undersampled data. It was concluded that road classification has the highest impact on cell phone use, followed by driver age group, driver gender, vehicle type, and, finally, driver seatbelt usage.
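To make the two rebalancing approaches concrete, the following is a minimal sketch of random oversampling and random undersampling in plain Python. The abstract does not say which specific oversampling or undersampling variants the authors used, so simple random resampling is an assumption here, and the function names `random_oversample` and `random_undersample` are hypothetical.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until
    every class is as large as the largest one (assumed variant)."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of larger classes so that every class
    is as small as the smallest one (assumed variant)."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement down to the target size.
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy data (not from the study): 90 non-users (0) and 10 phone users (1).
toy_rows = [[i] for i in range(100)]
toy_labels = [0] * 90 + [1] * 10
over_rows, over_labels = random_oversample(toy_rows, toy_labels)
under_rows, under_labels = random_undersample(toy_rows, toy_labels)
class_counts = {
    "original": Counter(toy_labels),
    "oversampled": Counter(over_labels),
    "undersampled": Counter(under_labels),
}
```

Oversampling keeps every observation but repeats minority rows, whereas undersampling discards majority rows to reach balance. In practice, resampling is normally applied to the training split only, so that evaluation data keep the original class distribution.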

Publisher

MDPI AG

Subject

Management, Monitoring, Policy and Law; Renewable Energy, Sustainability and the Environment; Geography, Planning and Development; Building and Construction

