
SQLEM

Published:2000-06 Issue:2 Volume:29 Page:559-570
ISSN:0163-5808
Container-title:ACM SIGMOD Record
language:en
Short-container-title:SIGMOD Rec.

Author:

Ordonez Carlos¹, Cereghini Paul²

Affiliation:

1. College of Computing, Georgia Institute of Technology

2. Retail Solutions Group, NCR Corporation

Abstract

Clustering is one of the most important tasks performed in Data Mining applications. This paper presents an efficient SQL implementation of the EM algorithm to perform clustering in very large databases. Our version can effectively handle high-dimensional data, a high number of clusters and, more importantly, a very large number of data records. We present three strategies to implement EM in SQL: horizontal, vertical and a hybrid one. We expect this work to be useful for data mining programmers and users who want to cluster large data sets inside a relational DBMS.

Publisher

Association for Computing Machinery (ACM)

Subject

Information Systems, Software

Link

https://dl.acm.org/doi/pdf/10.1145/335191.335468

Reference: 17 articles.

1. Fast algorithms for projected clustering

2. Automatic subspace clustering of high dimensional data for data mining applications

3. Chad Carson, Serge Belongie, H. Greenspan, and J. Malik. Region-based image querying. In IEEE Workshop on Content-Based Access of Image and Video Libraries, 1997.

4. NonStop SQL/MX primitives for knowledge discovery

Cited by 16 articles.

1. Implementing Efficient and Scalable In-Database Linear Regression in SQL;2021 IEEE International Conference on Big Data (Big Data);2021-12-15

2. Towards Expectation-Maximization by SQL in RDBMS;Database Systems for Advanced Applications;2021

3. Approximate Decision Tree Induction over Approximately Engineered Data Features;Rough Sets;2020

4. Big Data Analytics Using SQL: Quo Vadis?;Lecture Notes in Business Information Processing;2018

5. A new approximate query engine based on intelligent capture and fast transformations of granulated data summaries;Journal of Intelligent Information Systems;2017-07-05