Multi-Label Quantification

Author:

Moreo Alejandro (1), Francisco Manuel (2), Sebastiani Fabrizio (1)

Affiliation:

1. Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Italy

2. Department of Computer Science and Artificial Intelligence, University of Granada, Spain

Abstract

Quantification, variously called supervised prevalence estimation or learning to quantify, is the supervised learning task of generating predictors of the relative frequencies (a.k.a. prevalence values) of the classes of interest in unlabelled data samples. While many quantification methods have been proposed in the past for binary problems and, to a lesser extent, single-label multiclass problems, the multi-label setting (i.e., the scenario in which the classes of interest are not mutually exclusive) remains by and large unexplored. A straightforward solution to the multi-label quantification problem could simply consist of recasting the problem as a set of independent binary quantification problems. Such a solution is simple but naïve, since the independence assumption upon which it rests is, in most cases, not satisfied. In these cases, knowing the relative frequency of one class could be of help in determining the prevalence of other related classes. We propose the first truly multi-label quantification methods, i.e., methods for inferring estimators of class prevalence values that strive to leverage the stochastic dependencies among the classes of interest in order to predict their relative frequencies more accurately. We show empirical evidence that natively multi-label solutions outperform the naïve approaches by a large margin. The code to reproduce all our experiments is available online.
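
The naïve recasting described in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible binary quantifier, commonly called classify and count, applied to each class independently; the function name `classify_and_count` and the toy predictions are hypothetical and are not taken from the paper, whose own multi-label methods are not reproduced here.

```python
from typing import List


def classify_and_count(predictions: List[List[int]]) -> List[float]:
    """Estimate one prevalence value per class, treating every class as an
    independent binary problem: the estimate for a class is simply the
    fraction of items predicted positive for that class."""
    if not predictions:
        raise ValueError("need at least one item")
    n_items = len(predictions)
    n_classes = len(predictions[0])
    return [
        sum(row[c] for row in predictions) / n_items
        for c in range(n_classes)
    ]


# Hypothetical binary predictions for 4 items and 3 classes
# (illustrative data only; rows are items, columns are classes).
predicted = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]

prevalences = classify_and_count(predicted)
print(prevalences)  # [0.75, 0.5, 0.5]
```

Because the classes are not mutually exclusive, the estimates need not sum to 1 (here they sum to 1.75). Each column is processed in isolation, so the sketch ignores the fact that the second class only ever occurs together with the first in this toy data; that is the kind of dependency the abstract argues a natively multi-label method should exploit.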

Funder

European Commission

Italian Ministry of University

FPI 2017 predoctoral programme, from the Spanish Ministry of Economy and Competitiveness

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science
