Affiliation:
1. Institute of Operations Research and Systems Engineering, College of Science, Tianjin University of Technology, No. 391 Binshui Xi Road, Tianjin 300384, P. R. China
2. Glorious Sun School of Business & Management, Donghua University, Shanghai 200051, P. R. China
3. School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, P. R. China
Abstract
Clustering is one of the most important problems in data mining, machine learning, biological population division, and related fields. Robust variants of the [Formula: see text]-means problem, which include [Formula: see text]-means with penalties and [Formula: see text]-means with outliers, are also an active research branch. Most of these problems are NP-hard, including the most classical one, the [Formula: see text]-means problem. For NP-hard problems, heuristic algorithms are a powerful method. When the quality of the output can be guaranteed, a heuristic is called an approximation algorithm. In this paper, combining the two types of robust settings, we consider the [Formula: see text]-means problem with penalties and outliers ([Formula: see text]-MPO). In the [Formula: see text]-MPO, we are given an [Formula: see text]-point set [Formula: see text], a penalty cost [Formula: see text] for each [Formula: see text], an integer [Formula: see text], and an integer [Formula: see text]. The goal is to find a center subset [Formula: see text] with [Formula: see text], a penalty subset [Formula: see text], and an outlier subset [Formula: see text] with [Formula: see text], such that the total cost, comprising the connection cost and the penalty cost, is minimized. We offer an approximation algorithm using a heuristic local search scheme. Based on a single-swap operation, we obtain a [Formula: see text]-approximation algorithm.
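The objective and the single-swap idea described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm analyzed in the paper and carries no approximation guarantee. It rests on several assumptions that are not stated in the abstract: the connection cost is the squared Euclidean distance, candidate centers are restricted to the input points, and a swap is accepted whenever it lowers the cost by more than a small threshold `eps`. The names `objective`, `single_swap_local_search`, `num_centers`, `num_outliers`, and `eps` are hypothetical.

```python
from itertools import product


def sq_dist(a, b):
    """Squared Euclidean distance (assumed connection cost)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objective(points, penalties, centers, num_outliers):
    """Cost of a fixed center set.

    Each point pays the cheaper of its connection cost and its penalty;
    the num_outliers most expensive points are then discarded for free.
    """
    per_point = sorted(
        min(min(sq_dist(p, c) for c in centers), pen)
        for p, pen in zip(points, penalties)
    )
    keep = max(len(per_point) - num_outliers, 0)
    return sum(per_point[:keep])


def single_swap_local_search(points, penalties, num_centers, num_outliers,
                             eps=1e-9):
    """Hypothetical single-swap local search over input points as centers."""
    centers = list(points[:num_centers])  # arbitrary initial solution
    best = objective(points, penalties, centers, num_outliers)
    improved = True
    while improved:
        improved = False
        # Try replacing one current center with one non-center point.
        for i, cand in product(range(num_centers), points):
            if cand in centers:
                continue
            trial = centers[:i] + [cand] + centers[i + 1:]
            cost = objective(points, penalties, trial, num_outliers)
            if cost < best - eps:  # accept only a strict improvement
                centers, best = trial, cost
                improved = True
                break
    return centers, best
```

For a fixed center set, taking the minimum of connection cost and penalty for each point and then dropping the most expensive points is an optimal choice of penalty and outlier subsets, so the search only needs to explore center sets.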
Funder
National Natural Science Foundation of China
Publisher
World Scientific Pub Co Pte Ltd
Subject
Management Science and Operations Research, General Medicine