Affiliation:
1. College of Information Engineering, Zhejiang University of Technology, Hangzhou 323000, China
2. College of Engineering, Lishui University, Lishui 323000, China
Abstract
Multilabel data share important features, including label imbalance, which has a significant influence on the performance of classifiers. Because of this problem, a widely used multilabel classification algorithm, the multilabel k-nearest neighbor (ML-kNN) algorithm, performs poorly on imbalanced multilabel data. To address this problem, this study proposes an improved ML-kNN algorithm based on value and weight. In this improved algorithm, labels are divided into minority and majority labels, and a different strategy is adopted for each group. By considering the latent label information carried by the nearest neighbors, a value calculation method is proposed and used to directly classify majority labels. Additionally, to address the misclassification problem caused by a lack of nearest neighbor information for minority labels, a weight calculation method is proposed. The proposed weight calculation converts the distance information of nearest neighbors with and without a given label set into weights. The experimental results on multilabel datasets from different benchmarks demonstrate the performance of the algorithm, especially for datasets with high imbalance. Different evaluation metrics show that the results are improved by approximately 2–10%. The verified algorithm could be applied to multilabel classification in various fields involving label imbalance, such as drug molecule identification, building identification, and text categorization.
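The two ideas named in the abstract, splitting labels into minority and majority groups and turning neighbor distances into weights, can be illustrated with a minimal sketch. This is not the paper's value or weight formula, which the abstract does not give. The function names, the frequency threshold, and the inverse-distance weighting are all assumptions chosen for illustration.

```python
import math

def split_labels(Y, threshold=0.5):
    """Split label indices by how often each label is positive.

    Assumption: a label is 'minority' if its positive frequency is below
    `threshold`. The paper may use a different imbalance criterion.
    """
    n = len(Y)
    n_labels = len(Y[0])
    freq = [sum(row[j] for row in Y) / n for j in range(n_labels)]
    minority = [j for j, f in enumerate(freq) if f < threshold]
    majority = [j for j, f in enumerate(freq) if f >= threshold]
    return minority, majority

def weighted_knn_scores(X_train, Y_train, x, k=3, eps=1e-9):
    """Score each label by the share of neighbor weight that carries it.

    Assumption: weights are plain inverse distances, a common generic
    choice and not the weight calculation proposed in the paper.
    """
    dists = [math.dist(row, x) for row in X_train]
    # Indices of the k training points closest to x.
    nn = sorted(range(len(X_train)), key=dists.__getitem__)[:k]
    weights = [1.0 / (dists[i] + eps) for i in nn]
    total = sum(weights)
    n_labels = len(Y_train[0])
    return [
        sum(w * Y_train[i][j] for w, i in zip(weights, nn)) / total
        for j in range(n_labels)
    ]
```

A score near 1 means that close neighbors mostly carry the label, and a score near 0 means they mostly do not. A method in the spirit of the abstract would then apply one decision rule to majority labels and another to minority labels.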
Funder
Science and Technology Key Research Planning Project of Zhejiang Province, China
Lishui Major Research and Development Program, China
Postdoctoral Research Program of Zhejiang University of Technology
Public Welfare Technology Application Research Program Project of Lishui, China
Subject
Applied Mathematics, Modeling and Simulation, General Computer Science, Theoretical Computer Science
Cited by
6 articles.