Author:
Mutalib Sharifah Sakinah Syed Abd, Satari Siti Zanariah, Yusoff Wan Nur Syahidah Wan
Abstract
Detecting outliers in multivariate data is difficult and cannot be done reliably by visual inspection. Mahalanobis distance (MD) is a classical method for detecting outliers in multivariate data. However, the classical mean and covariance matrix used in MD suffer from masking and swamping effects. Masking occurs when outliers are not identified, and swamping occurs when inliers are identified as outliers. Hence, robust estimators have been proposed to overcome these problems. In this study, the performance of a new robust estimator named Test on Covariance (TOC) is tested and compared with four other robust estimators: Fast Minimum Covariance Determinant (FMCD), Minimum Vector Variance (MVV), Covariance Matrix Equality (CME) and Index Set Equality (ISE). The five robust estimators are tested on five real multivariate datasets: Brain and weight, Hawkins-Bradu-Kass, Stackloss, Bushfire and Milk. These datasets were chosen because they are well known in outlier detection studies. The results show that TOC is able to detect outliers, does not suffer from a masking effect, and performs the same as the other robust estimators on all datasets.
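The classical MD rule that the robust estimators are meant to improve on can be sketched as follows. This is a minimal illustration, not the TOC procedure or any of the robust estimators named in the abstract. The function names, the bivariate restriction, the 0.975 chi-square cutoff (a common convention, not stated in the abstract) and the toy data are all assumptions made for the example.

```python
import math


def squared_mahalanobis_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean.

    Uses the classical (non-robust) mean and covariance, i.e. the estimates
    that are vulnerable to masking and swamping.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample covariance entries with n - 1 in the denominator.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # Quadratic form d^T S^{-1} d via the closed-form 2x2 inverse.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_cutoff_2df(level=0.975):
    """Chi-square quantile for 2 degrees of freedom: -2 * ln(1 - level)."""
    return -2.0 * math.log(1.0 - level)


def flag_outliers(points, level=0.975):
    """Indices of points whose squared MD exceeds the chi-square cutoff."""
    cutoff = chi2_cutoff_2df(level)
    return [i for i, d in enumerate(squared_mahalanobis_2d(points)) if d > cutoff]


# Hypothetical toy data: a 5 x 4 grid of inliers plus one distant point.
toy = [(float(x), float(y)) for x in range(5) for y in range(4)] + [(30.0, -20.0)]
flagged = flag_outliers(toy)
print(flagged)
```

With a single distant point the classical rule works on this toy set. Because the mean and covariance are computed from all points, a cluster of outliers can inflate the covariance and pull the mean toward itself, which is the masking effect that motivates replacing these estimates with robust ones.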
Subject
General Physics and Astronomy
Cited by
4 articles.