Affiliation:
1. Kayseri University, Faculty of Engineering, Architecture and Design, Department of Computer Engineering
2. Kayseri University, Institute of Graduate Education
Abstract
Crime refers to an action legally defined as harmful to society, and understanding the type of crime is important for preventing such actions. However, crime can occur at any time and place, which makes it difficult to predict. Data on previously committed crimes helps to overcome this difficulty. This study proposes a novel model for classifying criminal activities that combines Doc2Vec, which can produce a numerical representation of texts regardless of their length, with a stacking ensemble that includes 8 different machine-learning models. Unlike previous studies, the model processes the features as text and converts them into vectors rather than treating them as categorical variables. In this way, it can use features that could not be used in earlier work. The proposed model is tested on San Francisco Crime Classification, a dataset distributed through an online competition that contains crimes committed over 12 years. An accuracy of 99.28% was obtained for the 15 crime categories with the most crime records, while precision, recall, and F-score were 99.18%, 99.38%, and 99.20%, respectively. With 10-fold cross-validation, a performance of 99.80% was achieved with a standard deviation of 0.001. These values are higher than those of all studies in the literature that use categorical feature structures. The results show that converting criminal activity reports, which contain text-based features, into vectors with natural language processing techniques such as Doc2Vec, instead of using them categorically in model training, can directly improve classification performance and yield a more efficient model with less preprocessing.
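The idea of treating record features as text rather than as categories can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the field names follow the public competition dataset, the sample values are invented, the helper `record_to_tokens` is hypothetical, and the abstract does not say which fields the authors actually used. The resulting token list is the kind of input a Doc2Vec implementation would turn into a fixed-length vector before a classifier is trained on it.

```python
import re

# Hypothetical record: field names follow the public competition dataset,
# but the values are invented for illustration.
record = {
    "DayOfWeek": "Wednesday",
    "PdDistrict": "NORTHERN",
    "Address": "OAK ST / LAGUNA ST",
    "Descript": "GRAND THEFT FROM LOCKED AUTO",
    "Resolution": "NONE",
}

# Assumed field selection; the abstract does not list the fields used.
FIELDS = ["DayOfWeek", "PdDistrict", "Address", "Descript", "Resolution"]


def record_to_tokens(rec, fields):
    """Join the chosen fields into one string and split it into lowercase tokens."""
    text = " ".join(str(rec[f]) for f in fields)
    # Keep alphanumeric runs only, so separators such as "/" are dropped.
    return re.findall(r"[a-z0-9]+", text.lower())


tokens = record_to_tokens(record, FIELDS)
print(tokens)
```

Because every field is reduced to words in a single document, free-text fields of varying length need no separate categorical encoding step.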
Publisher
Cukurova Universitesi Muhendislik-Mimarlik Fakultesi Dergisi