Affiliation:
1. GSK Department, Niels Brock, Copenhagen Business College, Nørre Voldgade 34, 1358 Copenhagen K, Denmark
Abstract
Rate distortion theory was developed for optimizing lossy compression of data, but it also has applications in statistics. In this paper, we illustrate how rate distortion theory can be used to analyze various datasets. The analysis involves testing, identification of outliers, choice of compression rate, calculation of optimal reconstruction points, and assigning “descriptive confidence regions” to the reconstruction points. We study four models or datasets of increasing complexity: clustering, Gaussian models, linear regression, and a dataset describing orientations of early Islamic mosques. These examples illustrate how rate distortion analysis may serve as a common framework for handling different statistical problems.
Subject
General Physics and Astronomy
Cited by
2 articles.