Abstract
This work addresses crime status prediction using an ensemble methodology applied to a large dataset from catalog.data.gov covering Los Angeles crime incidents from 2020 onward. The methodology comprises data collection, preprocessing, exploratory data analysis, model selection, and model evaluation. Initial challenges included data inaccuracies and privacy-preserving measures applied to location data, both of which required thorough cleaning and transformation. Exploratory data analysis revealed the limited correlation of the 'Status' attribute with the other attributes, the distribution of crime codes, area-wise crime counts, and temporal patterns. To address class imbalance within 'Status', the synthetic minority oversampling technique (SMOTE) was applied to balance the dataset. Model evaluation showed that random forest models with 10 and 20 decision trees, along with k-nearest neighbors (KNN), performed best, achieving consistently high accuracy, balanced precision-recall trade-offs, and strong F1 scores in crime status prediction.
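To make the balancing step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. It is not the authors' implementation. The function names `smote` and `balance`, the toy two-feature data, the 0/1 labels, and the choice of `k` are all illustrative assumptions, and the sketch handles only the two-class case. In practice a library implementation such as `SMOTE` from imbalanced-learn would typically be used instead.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base` (itself excluded).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until it matches the majority (binary case)."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Hypothetical toy data: six majority samples (label 0), three minority (label 1).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.2, 0.8],
     [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = balance(X, y, minority_label=1, k=2)
print(y_bal.count(0), y_bal.count(1))
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to compute accuracy, precision, recall, and F1.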
Publisher
Research Square Platform LLC