Evolutionary dataset optimisation: learning algorithm quality through evolution

Published:2019-12-27 Issue:4 Volume:50 Page:1172-1191
ISSN:0924-669X
Container-title:Applied Intelligence
language:en
Short-container-title:Appl Intell

Author:

Wilde Henry, Knight Vincent, Gillard Jonathan

Abstract

In this paper we propose a novel method for learning how algorithms perform. Classically, algorithms are compared on a finite number of existing (or newly simulated) benchmark datasets based on some fixed metrics. The algorithm(s) with the smallest value of this metric are chosen to be the ‘best performing’. We offer a new approach to flip this paradigm. We instead aim to gain a richer picture of the performance of an algorithm by generating artificial data through genetic evolution, the purpose of which is to create populations of datasets for which a particular algorithm performs well on a given metric. These datasets can be studied so as to learn what attributes lead to a particular progression of a given algorithm. Following a detailed description of the algorithm as well as a brief description of an open source implementation, a case study in clustering is presented. This case study demonstrates the performance and nuances of the method which we call Evolutionary Dataset Optimisation. In this study, a number of known properties about preferable datasets for the clustering algorithms known as k-means and DBSCAN are realised in the generated datasets.
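The idea described in the abstract, evolving whole datasets so that a chosen algorithm scores well on a chosen metric, can be illustrated with a small sketch. This is not the authors' open source implementation and does not reproduce their operators. It assumes a toy setting: individuals are lists of 2D points, fitness is the final inertia of a basic k-means run (lower is better), and the function names, parameters, and the selection, crossover, and mutation rules are all hypothetical choices made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans_inertia(points, k=2, iters=10, seed=0):
    """Fitness (assumed): final inertia of a basic Lloyd's k-means run.

    The seed is fixed so that a given dataset always gets the same score.
    """
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centres[j]))
            clusters[nearest].append(p)
        # Recompute each centre as its cluster mean; keep it if the cluster is empty.
        centres = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centres[j]
            for j, c in enumerate(clusters)
        ]
    return sum(min(sq_dist(p, c) for c in centres) for p in points)


def evolve(pop_size=20, n_points=30, generations=25,
           best_frac=0.25, mut_rate=0.2, mut_sd=0.05, seed=0):
    """Evolve a population of datasets towards low k-means inertia."""
    rng = random.Random(seed)
    population = [
        [(rng.random(), rng.random()) for _ in range(n_points)]
        for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=kmeans_inertia)
        history.append(kmeans_inertia(population[0]))
        # Selection: the fittest fraction survives unchanged and acts as parents.
        parents = population[:max(2, int(best_frac * pop_size))]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: take each row from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: jitter some rows with Gaussian noise.
            child = [
                (x + rng.gauss(0, mut_sd), y + rng.gauss(0, mut_sd))
                if rng.random() < mut_rate else (x, y)
                for x, y in child
            ]
            children.append(child)
        population = parents + children
    population.sort(key=kmeans_inertia)
    history.append(kmeans_inertia(population[0]))
    return population[0], history


if __name__ == "__main__":
    best, history = evolve()
    print("best inertia per generation:", [round(h, 4) for h in history])
```

Because the parents are carried over unchanged and the fitness is deterministic, the best inertia in this sketch never increases from one generation to the next. Nothing here constrains the scale of the data, so the search is free to lower inertia simply by pulling points closer together; inspecting the surviving datasets is what reveals which attributes the algorithm favours.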

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence

Link

http://link.springer.com/content/pdf/10.1007/s10489-019-01592-4.pdf

Reference 38 articles.

1. Abualigah LM, Khader AT, Hanandeh ES (2018) A combination of objective functions and hybrid krill herd algorithm for text document clustering analysis. Eng Appl Artif Intel 73:111–125. https://doi.org/10.1016/j.engappai.2018.05.003

2. Abualigah LM, Khader AT, Hanandeh ES (2018) Hybrid clustering analysis using improved krill herd algorithm. Appl Intell 48(11):4047–4071. https://doi.org/10.1007/s10489-018-1190-6

3. Amirjanov A (2016) Modeling the dynamics of a changing range genetic algorithm. Procedia Comput Sci 102:570–577. https://doi.org/10.1016/j.procs.2016.09.444

4. Bäck T (1994) Selective pressure in evolutionary algorithms: a characterization of selection mechanisms. In: Proceedings of the first IEEE conference on evolutionary computation. IEEE World congress on computational intelligence, pp 57–62. https://doi.org/10.1109/ICEC.1994.350042

5. Campos G, Zimek A, Sander J, Campello R, Micenková B, Schubert E, Assent I, Houle M (2016) On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study. Data Min Knowl Disc 30(4):891–927. https://doi.org/10.1007/s10618-015-0444-8

Cited by 7 articles.

1. Risk-Aware and Explainable Framework for Ensuring Guaranteed Coverage in Evolving Hardware Trojan Detection;2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD);2023-10-28

2. A novel initialisation based on hospital-resident assignment for the k-modes algorithm;Soft Computing;2023-05-23

3. An Evolutionary Strategy Based Training Optimization of Supervised Machine Learning Algorithms (EStoTimeSMLAs);2022 5th Asia Conference on Machine Learning and Computing (ACMLC);2022-12

4. Large-scale group decision-making (LSGDM) for performance measurement of healthcare construction projects: Ordinal Priority Approach;Applied Intelligence;2022-09-07

5. An efficient salp swarm algorithm based on scale-free informed followers with self-adaption weight;Applied Intelligence;2022-05-02