Theory and Algorithms for Shapelet-Based Multiple-Instance Learning

Published: 2020-08
Volume: 32, Issue: 8, Pages: 1580-1613
ISSN: 0899-7667
Journal: Neural Computation
Language: en

Author:

Suehiro Daiki¹, Hatano Kohei², Takimoto Eiji³, Yamamoto Shuji⁴, Bannai Kenichi⁴, Takeda Akiko⁵

Affiliation:

1. Department of Advanced Information Technology, Faculty of Information Science and Electrical Engineering, Kyushu University, and RIKEN Center for Advanced Intelligence Project, Nishi-ku, Fukuoka, 8190395, Japan

2. Faculty of Arts and Science, Kyushu University, and RIKEN Center for Advanced Intelligence Project, Nishi-ku, Fukuoka, 8190395, Japan

3. Department of Informatics, Faculty of Information Science and Electrical Engineering, Kyushu University, Nishi-ku, Fukuoka, 8190395, Japan

4. Department of Mathematics, Keio University, and RIKEN Center for Advanced Intelligence Project, Minatokita-ku, Yokohama, 2238522, Japan

5. Department of Creative Informatics, University of Tokyo, and RIKEN Center for Advanced Intelligence Project, Bunkyo-ku, Tokyo, 1138656, Japan

Abstract

We propose a new formulation of multiple-instance learning (MIL), in which a unit of data consists of a set of instances called a bag. The goal is to find a good classifier of bags based on the similarity with a “shapelet” (or pattern), where the similarity of a bag with a shapelet is the maximum similarity of instances in the bag. In previous work, some of the training instances have been chosen as shapelets with no theoretical justification. In our formulation, we use all possible, and thus infinitely many, shapelets, resulting in a richer class of classifiers. We show that the formulation is tractable, that is, it can be reduced through linear programming boosting (LPBoost) to difference of convex (DC) programs of finite (actually polynomial) size. Our theoretical result also gives justification to the heuristics of some previous work. The time complexity of the proposed algorithm depends heavily on the size of the set of all instances in the training sample. To handle data containing a large number of instances, we also propose a heuristic option of the algorithm that retains the theoretical guarantee. Our empirical study demonstrates that our algorithm works uniformly for shapelet learning tasks on time-series classification and various MIL tasks, with accuracy comparable to that of existing methods. Moreover, we show that the proposed heuristics allow us to achieve these results in reasonable computational time.
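
The bag-level similarity described in the abstract (the maximum similarity between a shapelet and any instance in the bag) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the Gaussian similarity, the `sigma` parameter, the function names, and the use of a small finite list of shapelets with fixed weights. The paper's formulation considers infinitely many shapelets and learns the classifier through LPBoost and DC programs, which this sketch does not attempt.

```python
import math


def gaussian_similarity(x, z, sigma=1.0):
    """Similarity between an instance x and a shapelet z.

    A Gaussian kernel is assumed here purely for illustration.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def bag_similarity(bag, shapelet, sigma=1.0):
    """Similarity of a bag with a shapelet: the maximum over its instances."""
    return max(gaussian_similarity(x, shapelet, sigma) for x in bag)


def classify_bag(bag, shapelets, weights, bias=0.0, sigma=1.0):
    """Hypothetical bag classifier: sign of a weighted sum of bag similarities.

    The shapelets and weights are given by hand here, not learned.
    """
    score = bias + sum(
        w * bag_similarity(bag, z, sigma) for w, z in zip(weights, shapelets)
    )
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    shapelets = [(1.0, 1.0)]
    weights = [1.0]
    positive_bag = [(0.0, 0.0), (1.0, 1.0)]  # contains an instance matching the shapelet
    negative_bag = [(5.0, 5.0), (6.0, 4.0)]  # all instances are far from the shapelet
    print(classify_bag(positive_bag, shapelets, weights, bias=-0.5))
    print(classify_bag(negative_bag, shapelets, weights, bias=-0.5))
```

Because the maximum is taken over instances, a single instance close to the shapelet is enough to give the whole bag a high similarity, which matches the usual MIL intuition that one instance can determine the bag label.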

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/neco_a_01297

References: 38 articles.

1. A Boosting Approach to Multiple Instance Learning

2. The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances

3. 10.1162/153244303321897690

Cited by 1 article.

1. Research on transformer fault diagnosis: Based on improved firefly algorithm optimized LPboost–classification and regression tree. IET Generation, Transmission & Distribution, 2021-06-22.