Author:
Cantú, Francisco; Saiegh, Sebastián M.
Abstract
In this paper, we introduce an innovative method to diagnose electoral fraud using vote counts. Specifically, we use synthetic data to develop and train a fraud detection prototype. We employ a naive Bayes classifier as our learning algorithm and rely on digital analysis to identify the features that are most informative about class distinctions. To evaluate the detection capability of the classifier, we use authentic data drawn from a novel data set of district-level vote counts in the province of Buenos Aires (Argentina) between 1931 and 1941, a period with a checkered history of fraud. Our results corroborate the validity of our approach: The elections considered to be irregular (legitimate) by most historical accounts are unambiguously classified as fraudulent (clean) by the learner. More generally, our findings demonstrate the feasibility of generating and using synthetic data for training and testing an electoral fraud detection system.
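The workflow described in the abstract (simulate labeled elections, extract digit-based features, train a naive Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. The data-generating process, the 40% manipulation rate, the choice of leading-digit frequencies as features, and all function names (`simulate_election`, `digit_features`, `fit_gaussian_nb`, `predict`) are assumptions made purely for illustration.

```python
import math
import random


def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


def digit_features(counts):
    """Relative frequency of leading digits 1..9 across a list of vote counts."""
    freq = [0] * 9
    for c in counts:
        freq[first_digit(c) - 1] += 1
    return [f / len(counts) for f in freq]


def simulate_election(rng, fraudulent, n_districts=100):
    """Synthetic district-level vote counts (hypothetical generating process)."""
    counts = []
    for _ in range(n_districts):
        # Assumption: clean counts are log-uniform, so leading digits are roughly Benford.
        c = int(10 ** rng.uniform(2, 4))
        # Assumption: fraud overwrites about 40% of districts with invented totals.
        if fraudulent and rng.random() < 0.4:
            c = rng.randint(500, 999)
        counts.append(c)
    return counts


def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior under the naive Bayes model."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, variances)
        )
        return log_prior + log_lik
    return max(model, key=log_posterior)


def make_dataset(rng, n_per_class):
    """Build labeled feature vectors from simulated clean and fraudulent elections."""
    X, y = [], []
    for label, fraudulent in (("clean", False), ("fraudulent", True)):
        for _ in range(n_per_class):
            X.append(digit_features(simulate_election(rng, fraudulent)))
            y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_dataset(rng, 200)
X_test, y_test = make_dataset(rng, 50)

model = fit_gaussian_nb(X_train, y_train)
hits = sum(predict(model, x) == t for x, t in zip(X_test, y_test))
accuracy = hits / len(y_test)
print(f"held-out accuracy on synthetic elections: {accuracy:.2f}")
```

In this sketch the classifier is both trained and evaluated on simulated elections. The paper instead evaluates on authentic district-level counts from Buenos Aires, so the accuracy printed here says nothing about real-world performance.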
Publisher
Cambridge University Press (CUP)
Subject
Political Science and International Relations, Sociology and Political Science
Cited by
62 articles.