Construct a biased SVM classifier based on Chebyshev distance for PU learning

Published:2020-10-07 Issue:3 Volume:39 Page:3749-3767
ISSN:1064-1246
Container-title:Journal of Intelligent & Fuzzy Systems
Short-container-title:IFS

Author:

Ke Ting¹,Li Min¹,Zhang Lidong¹,Lv Hui¹,Ge Xuechun²

Affiliation:

1. Department of Mathematics, College of Science, Tianjin University of Science & Technology, Tianjin, China

2. China Academy of Railway Sciences Signal and Communication Research Institute (Beijing Huatie Information Technology Corporation), Beijing, China

Abstract

In some real applications, only a limited number of labeled positive examples and many unlabeled examples are available, with no negative examples. Learning in this setting is termed positive and unlabeled (PU) learning. PU learning algorithms have been studied extensively in recent years. However, the classical ones based on Support Vector Machines (SVMs) assume that the labeled positive data are independent and identically distributed (i.i.d.) and that the sample size is large enough. This leads to two obvious shortcomings. On the one hand, performance is unsatisfactory, especially when the number of labeled positive examples is small. On the other hand, classification results are poor when datasets are non-i.i.d. For this reason, this paper proposes a novel SVM classifier that uses the Chebyshev distance to measure the empirical risk, and designs an efficient iterative algorithm, named L∞-BSVM for short. L∞-BSVM has the following merits: (1) it allows all sample points to participate in learning, which improves classification performance, especially when the labeled data set is small; (2) it minimizes the distance of the sample points farthest from the hyperplane (the outliers in non-i.i.d. data), so that outliers are fully taken into account; (3) its iterative algorithm can solve large-scale optimization problems with low time complexity and guarantees convergence to the optimal solution. Finally, extensive experiments on three types of datasets support these claims: artificial non-i.i.d. datasets; fault diagnosis of railway turnouts with few labeled data (abnormal turnouts); and six real-world benchmark datasets. They demonstrate that the classifier is much better than state-of-the-art competitors such as B-SVM, LUHC, Pulce, B-LSSVM, and NB.
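The abstract does not state the optimization problem itself, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a linear classifier, hinge losses, and the usual biased-SVM convention of treating unlabeled points as weighted negatives. The names `hinge_losses`, `biased_risk`, `c_pos`, and `c_unl` are hypothetical. The sketch contrasts aggregating the per-sample losses by their sum (L1) with aggregating them by their maximum (the Chebyshev, or L∞, norm), where only the worst-placed point in each group drives the empirical risk.

```python
import numpy as np

def hinge_losses(w, b, X, y):
    """Per-sample hinge loss max(0, 1 - y * (w.x + b))."""
    return np.maximum(0.0, 1.0 - y * (X @ w + b))

def biased_risk(w, b, X_pos, X_unl, c_pos=10.0, c_unl=1.0, norm="linf"):
    """Regularized biased risk (assumed form, not taken from the paper).

    Unlabeled points are treated as negatives with a smaller weight c_unl.
    norm="linf" aggregates each group's losses by their maximum (Chebyshev),
    norm="l1" aggregates them by their sum.
    """
    loss_pos = hinge_losses(w, b, X_pos, np.ones(len(X_pos)))
    loss_unl = hinge_losses(w, b, X_unl, -np.ones(len(X_unl)))
    aggregate = np.max if norm == "linf" else np.sum
    return 0.5 * (w @ w) + c_pos * aggregate(loss_pos) + c_unl * aggregate(loss_unl)

# Toy data: two labeled positives and three unlabeled points on a line.
X_pos = np.array([[2.0, 0.0], [3.0, 0.0]])
X_unl = np.array([[-2.0, 0.0], [0.5, 0.0], [0.0, 0.0]])
w = np.array([1.0, 0.0])
b = 0.0

risk_linf = biased_risk(w, b, X_pos, X_unl, norm="linf")
risk_l1 = biased_risk(w, b, X_pos, X_unl, norm="l1")
print("L-infinity risk:", risk_linf)
print("L1 risk:", risk_l1)
```

With these toy values, two unlabeled points violate the margin. The L1 risk counts both violations, while the L∞ risk counts only the larger one, which is the sense in which the farthest point dominates the objective.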

Publisher

IOS Press

Subject

Artificial Intelligence,General Engineering,Statistics and Probability

References: 27 articles.

1. PUMAD: PU Metric learning for anomaly detection;Ju;Information Sciences,2020

2. Han B., Tomoya S. and Issei S., Convex Formulation of Multiple Instance Learning from Positive and Unlabeled Bags, 105 (2018), 132–141.

3. Positive-unlabeled learning of glycosylation sites in the human proteome;Li;BMC Bioinformatics,2019

4. Positive unlabeled learning for deriving protein interaction networks;Kilic;Network Modeling and Analysis in Health Informatics and Bioinformatics,2010

5. PAC Learning from Positive Statistical Queries;Denis;Lecture Notes in Computer Science,2015

Cited by 3 articles.

1. Absolute Value Inequality SVM for the PU Learning Problem;Mathematics;2024-05-08

2. An Improved Dempster–Shafer Evidence Theory Based on the Chebyshev Distance and Its Application in Rock Burst Prewarnings;ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering;2024-03

3. An improved multi-attribute group decision-making method for selecting the green supplier of community elderly healthcare service;Frontiers in Energy Research;2024-02-06