On the Black-Box Challenge for Fraud Detection Using Machine Learning (I): Linear Models and Informative Feature Selection

Author:

Chaquet-Ulldemolins Jacobo, Gimeno-Blanes Francisco-Javier, Moral-Rubio Santiago, Muñoz-Romero Sergio, Rojo-Álvarez José-Luis

Abstract

Artificial intelligence (AI) is rapidly shaping the global financial market and its services, owing to the strong capabilities it has shown for analysis and modeling in many disciplines. Especially remarkable is the potential that these techniques could offer to the challenging problem of credit fraud detection (CFD). However, it is not easy, even for financial institutions, to remain in strict compliance with non-discrimination and data protection regulations while extracting the full potential of these powerful new tools. This reality effectively restricts nearly all possible AI applications to simple and easy-to-trace neural networks, preventing more advanced and modern techniques from being applied. The aim of this work was to create a reliable, unbiased, and interpretable methodology to automatically evaluate CFD risk. We therefore propose a novel methodology for applying machine learning (ML) to the CFD problem, one that uses state-of-the-art algorithms capable of quantifying the information carried by the variables and their relationships. This approach offers a new form of interpretability to cope with this multifaceted situation. First, a recently published feature selection technique is applied, the informative variable identifier (IVI), which is capable of distinguishing among informative, redundant, and noisy variables. Second, a set of innovative recurrent filters defined in this work is applied to minimize the training-data bias, namely, the recurrent feature filter (RFF) and the maximally-informative feature filter (MIFF). Finally, the output is classified using well-established ML techniques, such as gradient boosting, support vector machine, linear discriminant analysis, and linear regression. These models were applied first to a synthetic database, for better descriptive modeling and fine tuning, and then to a real database.
Our results confirm that our proposal yields valuable interpretability by identifying the informative features' weights that link the original variables with the final objectives. The informative features were living beyond one's means, lack or absence of a transaction trail, and unexpected overdrafts, which is consistent with other published works. Furthermore, we obtained 76% accuracy in CFD, which represents an improvement of more than 4% in the real databases compared to other published works. We conclude that the presented methodology not only reduces dimensionality but also improves accuracy and traces the relationships among input and output features, bringing transparency to the ML reasoning process. The results obtained here were used as a starting point for the companion paper, which extends the interpretability to nonlinear ML architectures.
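
The distinction the abstract draws among informative, redundant, and noisy variables can be illustrated with a small sketch. This is not the IVI algorithm, whose details are not given here; it is a simple correlation-based stand-in. The feature names, the two thresholds (0.2 for relevance, 0.9 for redundancy), and the synthetic data generator are all assumptions made for the example.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def label_features(features, target, relevance=0.2, redundancy=0.9):
    """Label each feature 'informative', 'redundant', or 'noisy'.

    Stand-in heuristic (NOT the IVI method): a feature is noisy if it is
    weakly correlated with the target, and redundant if it is relevant but
    nearly collinear with a feature that was already kept.
    """
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    labels, kept = {}, []
    # Visit features from most to least correlated with the target.
    for name in sorted(scores, key=scores.get, reverse=True):
        if scores[name] < relevance:
            labels[name] = "noisy"
        elif any(abs(pearson(features[name], features[k])) > redundancy
                 for k in kept):
            labels[name] = "redundant"
        else:
            labels[name] = "informative"
            kept.append(name)
    return labels


def make_synthetic(n=2000, seed=0):
    """Hypothetical synthetic data: binary target plus five features."""
    rng = random.Random(seed)
    target = [rng.randint(0, 1) for _ in range(n)]
    base = [t + rng.gauss(0, 0.5) for t in target]
    features = {
        "base": base,                                       # tracks the target
        "copy": [b + rng.gauss(0, 0.2) for b in base],      # near-duplicate of base
        "weak": [t + rng.gauss(0, 1.0) for t in target],    # weaker, independent signal
        "noise_1": [rng.gauss(0, 1.0) for _ in range(n)],   # unrelated to the target
        "noise_2": [rng.gauss(0, 1.0) for _ in range(n)],
    }
    return features, target


features, target = make_synthetic()
labels = label_features(features, target)
for name, label in labels.items():
    print(name, label)
```

In this sketch only the features labeled informative would be passed on to a classifier. The filtering stages described in the abstract (RFF and MIFF) are not reproduced.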

Funder

Agencia Estatal de Investigación of the Science and Innovation Ministry

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Cited by 9 articles.

1. Explainable artificial intelligence (XAI) in finance: a systematic literature review;Artificial Intelligence Review;2024-07-26

2. Optimal Weight-Tuning for Unbalanced Data in Credit Card Fraud Detection;2024 International Conference on Advances in Data Engineering and Intelligent Computing Systems (ADICS);2024-04-18

3. Artificial neural network to predict post-operative hypocalcemia following total thyroidectomy;Indian Journal of Otolaryngology and Head & Neck Surgery;2024-03-25

4. Utilizing GANs for Credit Card Fraud Detection: A Comparison of Supervised Learning Algorithms;Engineering, Technology & Applied Science Research;2023-12-05

5. A Machine Learning Method with Hybrid Feature Selection for Improved Credit Card Fraud Detection;Applied Sciences;2023-06-18
