Abstract
Feature selection reduces the complexity of high-dimensional datasets and helps to gain insights into systematic variation in the data. These aspects are essential in domains that rely on model interpretability, such as life sciences. We propose a (U)ser-Guided (Bay)esian Framework for (F)eature (S)election, UBayFS, an ensemble feature selection technique embedded in a Bayesian statistical framework. Our generic approach considers two sources of information: data and domain knowledge. From data, we build an ensemble of feature selectors, described by a multinomial likelihood model. Using domain knowledge, the user guides UBayFS by weighting features and penalizing feature blocks or combinations, implemented via a Dirichlet-type prior distribution. Hence, the framework combines three main aspects: ensemble feature selection, expert knowledge, and side constraints. Our experiments demonstrate that UBayFS (a) allows for a balanced trade-off between user knowledge and data observations and (b) achieves accurate and robust results.
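The interplay of a multinomial likelihood over ensemble selections and a Dirichlet prior over user weights can be illustrated with the standard conjugate update, in which each posterior parameter is the prior weight plus the number of times the ensemble selected that feature. The following is a minimal sketch of that textbook update only, not the UBayFS implementation: the function name, feature names, counts, and prior weights are all hypothetical, and side constraints (penalties on feature blocks or combinations) are left out entirely.

```python
from collections import Counter


def posterior_feature_weights(selections, prior_weights):
    """Conjugate Dirichlet-multinomial update (illustrative assumption).

    selections: list of feature sets, one per elementary feature selector.
    prior_weights: dict mapping feature name -> positive prior weight (user knowledge).
    Returns the normalized posterior mean weight of each feature.
    """
    # Count how often each feature was chosen across the ensemble.
    counts = Counter(f for selected in selections for f in selected)
    # Posterior parameter = prior weight + observed selection count.
    posterior = {f: prior_weights[f] + counts.get(f, 0) for f in prior_weights}
    total = sum(posterior.values())
    return {f: a / total for f, a in posterior.items()}


# Hypothetical data: three elementary selectors, four candidate features.
selections = [{"f1", "f2"}, {"f1", "f3"}, {"f1", "f2"}]
# Hypothetical user knowledge: the expert strongly favors f4.
prior_weights = {"f1": 1.0, "f2": 1.0, "f3": 1.0, "f4": 3.5}

weights = posterior_feature_weights(selections, prior_weights)
top2 = sorted(weights, key=weights.get, reverse=True)[:2]
```

In this toy setting, `f4` is never selected by the ensemble but still ranks among the two highest-weighted features because of its large prior weight, which is one way to picture the trade-off between user knowledge and data observations.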
Funder
Kreftforeningen
Norwegian University of Life Sciences
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Software
Cited by
7 articles.