Kendall transformation brings a robust categorical representation of ordinal data

Published: 2022-05-18 Issue: 1 Volume: 12
ISSN: 2045-2322
Container-title: Scientific Reports
Language: en
Short-container-title: Sci Rep

Author:

Kursa Miron Bartosz

Abstract

Kendall transformation is a conversion of an ordered feature into a vector of pairwise order relations between individual values. In this way, it preserves the ranking of observations and represents it in a categorical form. Such a transformation allows for the generalisation of methods requiring strictly categorical input, especially in the limit of a small number of observations, when quantisation becomes problematic. In particular, many approaches of information theory can be directly applied to Kendall-transformed continuous data without relying on differential entropy or any additional parameters. Moreover, by filtering information down to that contained in the ranking, Kendall transformation leads to better robustness at the reasonable cost of dropping sophisticated interactions, which are in any case unlikely to be correctly estimated. In bivariate analysis, Kendall transformation can be related to popular non-parametric methods, showing the soundness of the approach. The paper also demonstrates its efficiency in multivariate problems and provides an example analysis of real-world data.
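
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `kendall_transform` and `mutual_information`, the choice to enumerate ordered pairs (i, j) with i != j, and the labels "<", ">" and "=" are assumptions made here for illustration. The second function is an ordinary plug-in mutual information estimate for categorical sequences, included only to show that such an estimator can be applied to the transformed data without binning.

```python
from collections import Counter
from itertools import permutations
from math import log


def kendall_transform(values):
    """Return the order relation of every ordered pair (i, j), i != j.

    The pair enumeration and the labels are illustrative assumptions.
    """
    relations = []
    for i, j in permutations(range(len(values)), 2):
        if values[i] < values[j]:
            relations.append("<")
        elif values[i] > values[j]:
            relations.append(">")
        else:
            relations.append("=")
    return relations


def mutual_information(a, b):
    """Plug-in mutual information (in nats) of two equal-length categorical sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        c / n * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


x = [1.0, 3.0, 2.0]
y = [v ** 3 for v in x]  # a monotone function of x, so the ranking is identical

kx = kendall_transform(x)
ky = kendall_transform(y)
print(kx)
print(mutual_information(kx, ky))
```

Because `y` is a strictly increasing function of `x`, both transformed vectors are identical, so the estimate equals the entropy of the transformed vector; only the ranking, not the magnitude of the values, enters the calculation.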

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary

Link

https://www.nature.com/articles/s41598-022-12224-2.pdf

References: 28 articles.

1. Shannon, C. E. A mathematical theory of communication. Bell Syst. Techn. J. 27, 379–423 (1948).

2. Smith, R. A mutual information approach to calculating nonlinearity. Stat 4, 291–303 (2015).

3. Brown, G., Pocock, A., Zhao, M.-J. & Lujan, M. Conditional likelihood maximisation: A unifying framework for information theoretic feature selection. J. Mach. Learn. Res. 13, 27–66 (2012).

4. Margolin, A. A. et al. ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7, 1–15 (2006).

5. Brown, P. F., De Souza, P. V., Mercer, R. L., Pietra, V. J. D. & Lai, J. C. Class-based n-gram models of natural language. Comput. Linguist. 18, 467–479 (1992).

Cited by 5 articles.

1. Chasing parts in quadrillion: applications of dynamical downscaling in atmospheric pollutant transport modelling during field campaigns;Progress in Earth and Planetary Science;2024-07-02

2. Topic prediction for tobacco control based on COP9 tweets using machine learning techniques;PLOS ONE;2024-02-15

3. Kendall transfer entropy: a novel measure for estimating information transfer in complex systems;Journal of Neural Engineering;2023-07-20

4. Continuous ordinal patterns: Creating a bridge between ordinal analysis and deep learning;Chaos: An Interdisciplinary Journal of Nonlinear Science;2023-03-01

5. praznik: Tools for Information-Based Feature Selection and Scoring;CRAN: Contributed Packages;2017-11-20