Comparison of decision tree and naïve Bayes algorithms in detecting trace residue of gasoline based on gas chromatography–mass spectrometry data

Author:

Md Ghazi Md Gezani Bin (1, 2), Chuen Lee Loong (1, 3), Samsudin Aznor S (4), Sino Hukil (1)

Affiliation:

1. Forensic Science Program, CODTIS, Faculty of Health Science, Universiti Kebangsaan Malaysia, Selangor, Malaysia

2. Fire Investigation Division, Fire and Rescue Department of Malaysia, Putrajaya, Malaysia

3. Institute of IR 4.0, Universiti Kebangsaan Malaysia, Selangor, Malaysia

4. Fire Investigation Laboratory, Fire Investigation Division, Fire and Rescue Department of Selangor, Selangor, Malaysia

Abstract

Fire debris analysis aims to detect and identify ignitable liquid residues in burnt residues collected at a fire scene. Typically, the burnt residues are analysed using gas chromatography–mass spectrometry (GC–MS) and the results are interpreted manually. The interpretation can be laborious because GC–MS data are complex and high-dimensional. This study therefore compares the potential of classification and regression tree (CART) and naïve Bayes (NB) algorithms for analysing pixel-level GC–MS data of fire debris. The data comprise 14 positive samples (i.e. fire debris with traces of gasoline) and 24 negative samples (i.e. fire debris without traces of gasoline). The differences between the positive and negative samples were first inspected using the mean chromatograms and the scores plots of principal component analysis (PCA). Then, the CART and NB algorithms were independently applied to the GC–MS data. Stratified random resampling was used to prepare three sets of 200 pairs of training and testing samples (i.e. split ratios of 7:3, 8:2, and 9:1) for estimating the prediction accuracies. Although the positive and negative samples could hardly be differentiated from the mean chromatograms and PCA scores plots, the NB and CART predictive models performed satisfactorily with the normalized GC–MS data, i.e. the majority achieved a prediction accuracy >70%. NB consistently outperformed CART in terms of the prediction accuracy on testing samples and the corresponding risk of overfitting, except when evaluated using only 10% of the samples. The accuracy of CART was found to be inversely proportional to the number of testing samples, whereas NB performed rather consistently across the three split ratios. In conclusion, NB appears to be much better than CART, given its robustness to the number of testing samples and its consistently lower risk of overfitting.
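
The stratified random resampling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `stratified_split` and `resample_pairs`, the fixed seed, and the rule of rounding each class's test count to the nearest integer are all assumptions made for illustration. Only the sample counts (14 positive, 24 negative), the three split ratios, and the 200 repetitions come from the abstract.

```python
import random


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), drawing the test set class by class."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        # Assumed rounding rule: nearest integer, at least one test sample per class.
        n_test = max(1, round(len(members) * test_fraction))
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


def resample_pairs(labels, test_fraction, n_pairs=200, seed=0):
    """Generate n_pairs independent stratified train/test index pairs."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    return [stratified_split(labels, test_fraction, rng) for _ in range(n_pairs)]


# 14 positive (1) and 24 negative (0) samples, as reported in the abstract.
labels = [1] * 14 + [0] * 24

pairs_by_ratio = {
    name: resample_pairs(labels, fraction)
    for name, fraction in (("7:3", 0.3), ("8:2", 0.2), ("9:1", 0.1))
}

for name, pairs in pairs_by_ratio.items():
    train_idx, test_idx = pairs[0]
    print(name, len(pairs), len(train_idx), len(test_idx))
```

Under these assumed rounding rules, the 9:1 split leaves only three testing samples per pair (one positive, two negative), so the accuracy of a single pair can only take the values 0, 1/3, 2/3, or 1. This is one plausible reason why estimates from the 10% test sets behave differently from the other two ratios.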

Funder

CRIM, Universiti Kebangsaan Malaysia

Publisher

Oxford University Press (OUP)

Subject

Psychiatry and Mental Health, Physical and Theoretical Chemistry, Anthropology, Biochemistry, Genetics and Molecular Biology (miscellaneous), Pathology and Forensic Medicine, Analytical Chemistry

Cited by 2 articles.

1. AI-Driven Approaches to Reshape Forensic Practices;Cases on Forensic and Criminological Science for Criminal Detection and Avoidance;2024-05-17

2. Predicting prognosis outcomes of primary central nervous system lymphoma with high-dose methotrexate-based chemotherapeutic treatment using lipidomics;Neuro-Oncology Advances;2024-01-01
