Multi-Instance Learning with One Side Label Noise

Published: 2024-03-26 Issue: 5 Volume: 18 Page: 1-24
ISSN: 1556-4681
Container-title: ACM Transactions on Knowledge Discovery from Data
Language: en
Short-container-title: ACM Trans. Knowl. Discov. Data

Author:

Luan Tianxiang¹, Gu Shilin¹, Tang Xijia¹, Zhuge Wenzhang¹, Hou Chenping¹

Affiliation:

1. College of Science, National University of Defense Technology, Changsha, China

Abstract

Multi-instance learning (MIL) is a popular learning paradigm arising from many real applications. It assigns a label to a set of instances, called a bag, and the bag's label is determined by the instances within it. A bag is positive if and only if it contains at least one positive instance. Since labeling bags is more complicated than labeling individual instances, mislabeling is a frequent problem in MIL. Furthermore, it is more common for a negative bag to be mislabeled as a positive one, since a single mislabeled instance changes the label of the whole bag. This is an important problem that originates from real applications, e.g., web mining and image classification, but to the best of our knowledge little research has concentrated on it. In this article, we focus on the MIL problem with one side label noise, in which negative bags are mislabeled as positive ones. To address this challenging problem, we propose what is, to the best of our knowledge, a novel multi-instance learning method with one side label noise. We design a new double weighting approach under the traditional framework to characterize the "faithfulness" of each instance and each bag in learning the classifier. Briefly, on the instance level, we employ a sparse weighting method to select the key instances, so that the MIL problem with one side label noise is converted to a supervised learning scenario with mislabeled data. On the bag level, the weights of the bags, together with the selected key instances, are used to identify the truly positive bags. In addition, we solve the proposed model by an alternating iteration method with proven convergence behavior. Empirical studies on various datasets validate the effectiveness of our method.
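The labeling rule and the noise setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it only shows the standard MIL assumption (a bag is positive if and only if at least one instance is positive) and a simple simulation of one side label noise. The function names `bag_label` and `add_one_side_noise`, the `flip_prob` parameter, and the toy bags are all hypothetical and chosen for illustration.

```python
import random


def bag_label(instance_labels):
    """Standard MIL assumption: a bag is positive (1) iff any instance is positive."""
    return int(any(instance_labels))


def add_one_side_noise(bag_labels, flip_prob, seed=0):
    """Simulate one side label noise (illustrative assumption, not the paper's model).

    Each negative bag (0) is relabeled as positive (1) with probability
    flip_prob. Positive bags are never changed, so the noise is one-sided.
    """
    rng = random.Random(seed)
    return [1 if y == 1 or rng.random() < flip_prob else 0 for y in bag_labels]


# Toy bags: each bag is a list of (normally unobserved) instance labels.
bags = [[0, 0, 1], [0, 0], [0, 0, 0, 0], [1, 1]]
true_labels = [bag_label(b) for b in bags]
noisy_labels = add_one_side_noise(true_labels, flip_prob=0.5)
```

Under this simulation every observed negative bag is truly negative, while an observed positive bag may be either truly positive or a mislabeled negative one. That asymmetry is what the abstract refers to when it speaks of identifying the real positive bags.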

Funder

National Key Research and Development Program

Key NSF of China

NSF of China

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3644076
