Adaptive covariate acquisition for minimizing total cost of classification

Published: 2021-04-18 Issue: 5 Volume: 110 Page: 1067-1104
ISSN: 0885-6125
Container-title: Machine Learning
Language: en
Short-container-title: Mach Learn

Author:

Andrade Daniel, Okajima Yuzuru

Abstract

In some applications, acquiring covariates comes at a non-negligible cost. For example, in the medical domain, measuring glucose tolerance in order to classify whether a patient has diabetes can be expensive. Assuming that the user can specify the cost of each covariate and the cost of misclassification, our goal is to minimize the (expected) total cost of classification, i.e. the cost of misclassification plus the cost of the acquired covariates. We formalize this optimization goal using the (conditional) Bayes risk and describe the optimal solution using a recursive procedure. Since the procedure is computationally infeasible, we introduce two assumptions: (1) the optimal classifier can be represented by a generalized additive model, and (2) the optimal sets of covariates are limited to a sequence of sets of increasing size. We show that under these two assumptions a computationally efficient solution exists. Furthermore, on several medical datasets, we show that the proposed method achieves the lowest total cost in most situations when compared to various previous methods. Finally, we weaken the requirement that the user specify all misclassification costs by allowing the user to specify instead the minimally acceptable recall (target recall). Our experiments confirm that the proposed method achieves the target recall while minimizing the false discovery rate and the covariate acquisition costs better than previous methods.
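The total cost described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it does not implement the recursive procedure or the generalized additive model, and it assumes that the class posterior given the acquired covariates is already available. The function names, the cost matrix layout (`cost_matrix[true_label][predicted_label]`), and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def bayes_decision(
    posterior: Sequence[float],
    cost_matrix: Sequence[Sequence[float]],
) -> Tuple[int, float]:
    """Pick the label with the smallest expected misclassification cost.

    posterior[y] is the (assumed given) probability of true label y.
    cost_matrix[y][d] is the cost of predicting d when the truth is y.
    Returns (best label, its expected misclassification cost).
    """
    n = len(posterior)
    # Expected cost of each possible decision d under the posterior.
    risks: List[float] = [
        sum(posterior[y] * cost_matrix[y][d] for y in range(n))
        for d in range(n)
    ]
    best = min(range(n), key=risks.__getitem__)
    return best, risks[best]


def expected_total_cost(
    posterior: Sequence[float],
    cost_matrix: Sequence[Sequence[float]],
    covariate_costs: Dict[str, float],
    acquired: Iterable[str],
) -> float:
    """Expected misclassification cost plus cost of the acquired covariates."""
    _, risk = bayes_decision(posterior, cost_matrix)
    return risk + sum(covariate_costs[name] for name in acquired)


# Hypothetical example: label 0 = no diabetes, label 1 = diabetes.
posterior = [0.7, 0.3]
cost_matrix = [[0.0, 1.0],    # truth 0: a false positive costs 1
               [10.0, 0.0]]   # truth 1: a false negative costs 10
covariate_costs = {"age": 0.0, "glucose_tolerance": 2.0}

label, risk = bayes_decision(posterior, cost_matrix)
total = expected_total_cost(
    posterior, cost_matrix, covariate_costs, ["age", "glucose_tolerance"]
)
```

In this sketch the expensive false negative pushes the decision toward label 1 even though label 0 is more probable, and the acquisition cost of each measured covariate is added on top of the expected misclassification cost.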

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Software

Link

https://link.springer.com/content/pdf/10.1007/s10994-021-05958-z.pdf

References: 51 articles.

1. Anderson, T. W. (2003). An introduction to multivariate statistical analysis (Vol. 2). Wiley.

2. Andrade, D., & Okajima, Y. (2019). Efficient Bayes risk estimation for cost-sensitive classification. In The 22nd international conference on artificial intelligence and statistics (pp. 3372–3381).

3. Bayer-Zubek, V. (2004). Learning diagnostic policies from examples by systematic search. In Proceedings of the 20th conference on uncertainty in artificial intelligence (pp. 27–34). AUAI Press.

4. Benbouzid, D., Busa-Fekete, R., & Kégl, B. (2012). Fast classification using sparse decision DAGs. In Proceedings of the 29th international conference on machine learning (pp. 747–754).

5. Berger, J. O. (2013). Statistical decision theory and Bayesian analysis. Springer.

Cited by 1 article.

1. Optimal selection of sample-size dependent common subsets of covariates for multi-task regression prediction. Electronic Journal of Statistics, 2021-01-01.