Author:
Kapadia Kumash, Abdel-Jaber Hussein, Thabtah Fadi, Hadi Wael
Abstract
The Indian Premier League (IPL) is one of the more popular cricket tournaments in the world. Its financial value is increasing each season, its viewership has increased markedly, and the betting market for the IPL is growing significantly every year. Because cricket is a very dynamic game that changes ball by ball, bettors and bookies are incentivised to bet on match results. This paper investigates machine learning technology for the problem of predicting cricket match results based on historical IPL match data. Influential features of the dataset have been identified using filter-based methods including Correlation-based Feature Selection, Information Gain (IG), ReliefF and Wrapper. More importantly, machine learning techniques including Naïve Bayes, Random Forest, K-Nearest Neighbour (KNN) and Model Trees (classification via regression) have been adopted to generate predictive models from the distinct feature sets derived by the filter-based methods. Two feature subsets were formulated, one based on home-team advantage and the other based on the toss decision. The selected machine learning techniques were applied to both feature subsets to derive predictive models. Experimental tests show that tree-based models, particularly Random Forest, performed better in terms of accuracy, precision and recall when compared to probabilistic and statistical models. However, on the toss feature subset, none of the considered machine learning algorithms produced accurate predictive models.
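As an illustration of one of the feature-ranking methods named in the abstract, the following minimal sketch computes Information Gain for a categorical feature against a match outcome. It is not the authors' implementation: the function names, the feature names (`played_at_home`, `won_toss`) and the eight toy records are hypothetical and invented purely for demonstration, and they do not reproduce any data or results from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy records (NOT from the paper's dataset).
result = ["win", "win", "win", "loss", "loss", "loss", "win", "loss"]
played_at_home = ["yes", "yes", "yes", "no", "no", "no", "yes", "no"]
won_toss = ["yes", "no", "yes", "no", "yes", "no", "no", "yes"]

# Rank the two hypothetical features by Information Gain, highest first.
scores = {
    "played_at_home": information_gain(played_at_home, result),
    "won_toss": information_gain(won_toss, result),
}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

A feature with a higher score separates the outcome classes more cleanly, so a filter-based selector would keep it ahead of lower-scoring features. In this made-up sample the home feature ranks above the toss feature by construction.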
Subject
Computer Science Applications, Information Systems, Software
Cited by
35 articles.