Clustering lines in high-dimensional space

Authors:

Gao Jie (1), Langberg Michael (2), Schulman Leonard J. (3)

Affiliation:

1. Stony Brook University, NY

2. The Open University of Israel, Israel

3. California Institute of Technology, CA

Abstract

A set of k balls B_1, …, B_k in a Euclidean space is said to cover a collection of lines if every line intersects some ball. We consider the k-center problem for lines in high-dimensional space: Given a set of n lines l = {l_1, …, l_n} in R^d, find k balls of minimum radius which cover l. We present a 2-approximation algorithm for the cases k = 2, 3 of this problem, having running time quasi-linear in the number of lines and the dimension of the ambient space. Our result for 3-clustering is strongly based on a new result in discrete geometry that may be of independent interest: a Helly-type theorem for collections of axis-parallel “crosses” in the plane. The family of crosses does not have finite Helly number in the usual sense. Our Helly theorem is of a new type: it depends on ε-contracting the sets. In statistical practice, data is often incompletely specified; we consider lines as the most elementary case of incompletely specified data points. Clustering of data is a key primitive in nonparametric statistics. Our results provide a way of performing this primitive on incomplete data, as well as imputing the missing values.
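The covering condition in the abstract can be made concrete with a small sketch. The code below is not the paper's 2-approximation algorithm; it only evaluates the objective for a given choice of centers, assuming each line is represented as a point `p` and a nonzero direction vector `u` (a representation chosen here for illustration). The function names `dist_point_to_line` and `cover_radius` are hypothetical.

```python
import math

def dist_point_to_line(c, p, u):
    """Euclidean distance from point c to the line {p + t*u : t real} in R^d."""
    uu = sum(x * x for x in u)
    if uu == 0:
        raise ValueError("direction vector must be nonzero")
    # Vector from the line's base point to c.
    w = [ci - pi for ci, pi in zip(c, p)]
    # Parameter of the orthogonal projection of c onto the line.
    t = sum(wi * ui for wi, ui in zip(w, u)) / uu
    # Length of the component of w orthogonal to u.
    return math.sqrt(sum((wi - t * ui) ** 2 for wi, ui in zip(w, u)))

def cover_radius(centers, lines):
    """Smallest common radius r such that balls of radius r around
    `centers` cover every line (each line meets at least one ball)."""
    return max(
        min(dist_point_to_line(c, p, u) for c in centers)
        for p, u in lines
    )

# Three lines in R^3, each given as (base point, direction).
lines = [
    ((0, 0, 0), (1, 0, 0)),
    ((0, 2, 0), (1, 0, 0)),
    ((0, 0, 10), (0, 1, 0)),
]
centers = [(0, 1, 0), (0, 0, 10)]
print(cover_radius(centers, lines))
```

A ball intersects a line exactly when the distance from its center to the line is at most its radius, so the k-center problem for lines asks for the k centers that minimize `cover_radius`. Each distance costs O(d) time, so evaluating one candidate solution takes O(nkd).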

Funder

National Science Foundation

NSA

Division of Computing and Communication Foundations

Publisher

Association for Computing Machinery (ACM)

Subject

Mathematics (miscellaneous)


Cited by 5 articles.

1. Linear-time approximation scheme for k-means clustering of axis-parallel affine subspaces;Computational Geometry;2023-06

2. The complexity of binary matrix completion under diameter constraints;Journal of Computer and System Sciences;2023-03

3. Improved Separated Red-Blue Center Clustering;Lecture Notes in Computer Science;2022

4. Hyperspectral Inverse Skinning;Computer Graphics Forum;2020-02-26

5. Helly’s theorem: New variations and applications;Algebraic and Geometric Methods in Discrete Mathematics;2017
