Abstract
Data mining and gene expression analysis are at the forefront of modern data analysis. This paper introduces the spherical-Dirichlet distribution, a new probability distribution applicable to both fields. It is designed to fit vectors located in the positive orthant of the hypersphere, as is often the case for data in these fields, and so avoids assigning probability mass to regions of the hypersphere where such data cannot occur. Basic properties of the distribution, including normalizing constants and moments, are derived. Relationships with other distributions are also explored. Classical estimators, namely method-of-moments and maximum likelihood estimators, are obtained. Two applications are presented: the first uses simulated data, and the second uses a real text mining example. Both data sets are fitted with the spherical-Dirichlet distribution, and the results are discussed.
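To make "the positive orthant of the hypersphere" concrete, here is a minimal Python sketch of one way such vectors can be generated. It assumes, which the abstract does not state, that a draw is obtained by taking the element-wise square root of a Dirichlet vector. The function name `sample_positive_orthant` and the parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_positive_orthant(alpha, rng):
    """Draw one unit vector with non-negative coordinates.

    Assumption (not taken from the abstract): the vector is the
    element-wise square root of a Dirichlet(alpha) draw, so its
    squared coordinates sum to 1.
    """
    # A Dirichlet draw is a vector of independent Gamma(alpha_i, 1)
    # variates divided by their sum.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [math.sqrt(g / total) for g in gammas]


rng = random.Random(0)  # fixed seed so the run is repeatable
x = sample_positive_orthant([2.0, 3.0, 5.0], rng)

# Euclidean norm of the draw; equal to 1 up to floating-point error.
norm = math.sqrt(sum(v * v for v in x))
```

Every coordinate of `x` is non-negative and `norm` equals 1 up to rounding, so the draw lies on the unit hypersphere within the positive orthant. Term-frequency vectors scaled to unit length in text mining have the same shape.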
Publisher
Springer Science and Business Media LLC
Subject
Statistics, Probability and Uncertainty; Computer Science Applications; Statistics and Probability
Cited by
3 articles.