Affiliation:
1. Universität Siegen, Department Maschinenbau, Institut für Mechanik und Regelungstechnik – Mechatronik, Paul-Bonatz-Str. 9-11, Siegen, Germany
Abstract
This paper discusses the task of data reduction and proposes a novel selection approach that allows control over the point distribution of the selected data subset. The approach is based on the estimation of probability density functions (pdfs). Due to its structure, the new method can select a subset either by approximating the pdf of the original dataset or by approximating an arbitrary, desired target pdf. The strategy evaluates the estimated pdfs solely at the selected data points, resulting in a simple and efficient algorithm with low computational and memory demands. The performance of the approach is investigated in two scenarios. For representative subset selection from a dataset, it is compared to a recently proposed, more complex method and shows comparable results. To demonstrate the capability of matching a target pdf, a uniform distribution is chosen as an example. Here the new method is compared to strategies for space-filling design of experiments and shows convincing results.
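The idea of pdf-driven subset selection can be illustrated with a minimal sketch. This is not the algorithm of the paper, which is not specified in the abstract. It is a simple greedy stand-in that assumes an isotropic Gaussian kernel density estimate with a fixed, user-chosen bandwidth and, unlike the method described above, evaluates the densities at all candidate points rather than only at the selected ones. The names `gaussian_kernel`, `select_subset`, `bandwidth` and `target_pdf` are hypothetical.

```python
import numpy as np


def gaussian_kernel(points, centers, bandwidth):
    """Normalised isotropic Gaussian kernel between two point sets.

    Returns an array of shape (len(points), len(centers)).
    """
    dim = points.shape[1]
    sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    norm = (2.0 * np.pi * bandwidth**2) ** (-dim / 2.0)
    return norm * np.exp(-0.5 * sq_dist / bandwidth**2)


def select_subset(data, n_select, bandwidth, target_pdf=None):
    """Greedily pick indices so the subset density approaches a target density.

    If target_pdf is None, the target is the kernel density estimate of the
    full dataset (representative selection). Otherwise target_pdf is a
    callable returning the desired density at each row of `data`.
    """
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]
    n = data.shape[0]
    if not 0 < n_select <= n:
        raise ValueError("n_select must be between 1 and the number of points")

    if target_pdf is None:
        # Assumption: the full n-by-n kernel matrix fits in memory.
        target = gaussian_kernel(data, data, bandwidth).mean(axis=1)
    else:
        target = np.asarray(target_pdf(data), dtype=float)

    kernel_sum = np.zeros(n)            # summed kernels of the selected points
    available = np.ones(n, dtype=bool)  # points not yet selected
    selected = []
    for _ in range(n_select):
        # Density estimate of the current subset at every candidate point.
        subset_pdf = kernel_sum / max(len(selected), 1)
        # Pick the candidate where the subset most underestimates the target.
        deficit = np.where(available, target - subset_pdf, -np.inf)
        idx = int(np.argmax(deficit))
        selected.append(idx)
        available[idx] = False
        kernel_sum += gaussian_kernel(data, data[idx:idx + 1], bandwidth)[:, 0]
    return np.array(selected)
```

Calling `select_subset(data, 50, bandwidth=0.1)` aims at a subset that mimics the distribution of `data`, while passing `target_pdf=lambda x: np.ones(len(x))` aims at a uniform, space-filling subset. The bandwidth strongly influences the outcome and would need to be tuned for real data.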
Subject
Electrical and Electronic Engineering, Computer Science Applications, Control and Systems Engineering