Abstract
Outlier detection in univariate circular data can be carried out with a clustering algorithm. Clustering requires a similarity measure to assign observations to groups. For circular data, the similarity measure can be obtained from the circular distance between each pair of angular observations. In this paper, a clustering-based procedure for outlier detection in univariate circular biological data is developed with different similarity distance measures, and its performance is investigated. Three circular similarity distance measures are used in the procedure with the single-linkage clustering algorithm. However, two of these measures, the Satari distance and the Di distance, are found to have similar formulas for univariate circular data. The aim of this study is to develop the proposed clustering-based procedure and demonstrate its effectiveness in detecting outliers under different similarity distance measures. Therefore, the SL-Satari/Di measure is compared with another similarity measure, SL-Chang, at a certain cutting rule. The results show that the clustering-based procedure using the single-linkage algorithm with different similarity distances is an applicable and promising approach to outlier detection in univariate circular data, particularly biological data. The results also indicate that, under certain data conditions, the SL-Satari/Di distance appears to outperform the SL-Chang distance.
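The general idea of clustering angular observations by circular distance and flagging isolated points can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the paper: the distance used here is the shorter arc length between two angles (one common circular distance, not necessarily the Satari, Di, or Chang formula), and the names `cut_height` and `max_outlier_cluster_size` are hypothetical stand-ins for whatever cutting rule the study applies. The sketch relies on the fact that cutting a single-linkage tree at height h yields the connected components of the graph that links points whose distance is at most h.

```python
import math


def circular_distance(a, b):
    """Shorter arc length between two angles in radians, in [0, pi].

    Assumption: an illustrative distance, not a formula taken from the paper.
    """
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def single_linkage_clusters(angles, cut_height):
    """Cluster labels obtained by cutting a single-linkage tree at cut_height.

    Implemented with union-find: points closer than cut_height are merged,
    which gives the same groups as the cut single-linkage dendrogram.
    """
    n = len(angles)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if circular_distance(angles[i], angles[j]) <= cut_height:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


def flag_outliers(angles, cut_height, max_outlier_cluster_size=1):
    """Indices of points that fall in very small clusters.

    Assumption: treating small clusters as outliers is a hypothetical rule
    chosen for this sketch.
    """
    labels = single_linkage_clusters(angles, cut_height)
    sizes = {}
    for label in labels:
        sizes[label] = sizes.get(label, 0) + 1
    return [i for i, label in enumerate(labels)
            if sizes[label] <= max_outlier_cluster_size]


# Five angles clustered around 0 degrees (wrapping past 360) and one at 180.
angles = [math.radians(x) for x in (350, 355, 2, 8, 12, 180)]
outliers = flag_outliers(angles, cut_height=math.radians(30))
print(outliers)
```

The example data wrap around 0 degrees on purpose: a linear distance would separate 350 from 2, whereas the circular distance keeps them in the same group and isolates only the observation at 180 degrees.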
Funder
Ministry of Higher Education, Malaysia
Universiti Malaysia Pahang
Publisher
Pakistan Journal of Statistics and Operation Research
Subject
Management Science and Operations Research; Statistics, Probability and Uncertainty; Modeling and Simulation; Statistics and Probability