A Novel Density-based Technique for Outlier Detection of High Dimensional Data Utilizing Full Feature Space

Published:2021-03-25 Issue:1 Volume:50 Page:138-152
ISSN:2335-884X
Container-title:Information Technology and Control
Short-container-title:ITC

Author:

Ur Rehman Mujeeb, Muhammad Khan Dost

Abstract

Anomaly detection has attracted growing attention from data mining researchers as its use has spread steadily across practical domains such as product marketing, fraud detection, medical diagnosis, and fault detection. Outlier detection in high dimensional data poses particular challenges because of the curse of dimensionality, under which distant and neighboring points come to resemble one another. Traditional outlier detection algorithms and techniques were tested on the full feature space. These conventional methods concentrate largely on low dimensional data and are therefore ineffective at discovering anomalies in data sets with a large number of dimensions. Finding anomalies in a high dimensional data set becomes difficult and tedious when all subsets of projections need to be explored. Data points in high dimensional data behave like similar observations because of an intrinsic property: the contrast between the distances from a point to its nearest and farthest neighbors approaches zero as the number of dimensions tends to infinity. This research proposes a novel technique that explores the deviation among all data points and embeds its findings in well-established density-based techniques. It is a state-of-the-art technique in that it opens a new avenue of research towards resolving the inherent problems of high dimensional data in which outliers reside within clusters of different densities. A high dimensional dataset from the UCI Machine Learning Repository is used to test the proposed technique, and its results are compared with those of density-based techniques to evaluate its efficiency.
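The distance-concentration effect that the abstract attributes to high dimensional data can be illustrated with a minimal sketch. It is not the authors' proposed technique: it only measures, for uniformly random points, how the gap between the nearest and farthest distance shrinks relative to the nearest distance as the dimension grows. The function name `relative_contrast`, the uniform sampling, and the sample sizes are assumptions chosen for illustration.

```python
import numpy as np


def relative_contrast(n_points: int, n_dims: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points uniform random points in [0, 1]^n_dims.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    data = rng.random((n_points, n_dims))   # sample points
    query = rng.random(n_dims)              # single query point
    dists = np.linalg.norm(data - query, axis=1)
    return float((dists.max() - dists.min()) / dists.min())


if __name__ == "__main__":
    # The contrast is expected to fall as the dimension increases.
    for d in (2, 10, 100, 1000):
        print(d, round(relative_contrast(1000, d), 3))
```

A small contrast means that a point's nearest and farthest neighbors are almost equally far away, which is why purely distance-based outlier scores lose discriminative power in such data.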

Publisher

Kaunas University of Technology (KTU)

Subject

Electrical and Electronic Engineering, Computer Science Applications, Control and Systems Engineering

Cited by 5 articles.

1. Clustering Techniques in Data Mining: A Survey of Methods, Challenges, and Applications; Computer Science; 2024-03-05

2. Adaboost-based SVDD for anomaly detection with dictionary learning; Expert Systems with Applications; 2024-03

3. Semi-supervised deep density clustering; Applied Soft Computing; 2023-11

4. Detecting Outliers in Data Streams Based on Minimum Rare Pattern Mining and Pattern Matching; Information Technology and Control; 2022-06-23

5. Graph Convolutional Networks and Attention-Based Outlier Detection; IEEE Access; 2022