Abstract
This paper considers methods for computing exact approximations of the probability distributions of statistics. As exact approximations, we take ∆-exact distributions: distributions that differ from the exact distributions by no more than a predefined, arbitrarily small value ∆, which sets the accuracy of the approximation. We also consider the first and second multiplicity methods, which use the statistical characteristics of samples. The first multiplicity method is based on the properties of the components of the first multiplicity vector, which are nonnegative integer solutions of a linear equation relating the frequencies of the alphabet signs to the sample size. The second multiplicity method is based on solving a system of linear equations that relate the sample size and the alphabet cardinality to the numbers of alphabet signs that have equal frequency in the sample. For each of the considered methods, we give expressions for estimating the computational complexity of calculating exact approximations of distributions for arbitrary sample parameters. For an approximation accuracy of 10⁻⁵ and a computing resource with a performance of 10¹⁸ operations per second, we calculated the sample parameters for which exact approximations of distributions can be computed by the considered methods. We formed the regions of sample parameters for which each method can calculate the exact approximations. We compared these regions with one another and with the so-called region of uncertainty, which is bounded above by a sample size of at most five times the alphabet cardinality.
Based on the comparison of the regions of sample parameters suitable for calculating exact approximations of distributions, we compared the calculation methods themselves. The second multiplicity method is shown to allow calculations for all alphabet cardinalities from 2 to 256, whereas the first multiplicity method does not allow calculations for alphabet cardinalities above 73. The parameter region of samples suitable for calculating the approximations of distributions by the second multiplicity method contains the entire parameter region of samples suitable for calculation by the first multiplicity method and is more than 52 times larger. The comparison establishes that, with the same computing resource, the second multiplicity method can calculate exact approximations for a greater number of samples, and for larger sample parameters, than the first multiplicity method. Hence, the second multiplicity method is chosen for calculating exact approximations of the probability distributions of statistics. The practical significance of the research is that it makes it possible to calculate the maximal values of the sample parameters for which the current technological level of computer systems allows exact approximations of distributions to be calculated; using exact approximations for these values minimizes the loss of criteria efficiency compared with using limit approximations. The scientific novelty of the research is the comparative analysis of methods of exact approximations of distributions for sample parameters at which exact distributions cannot be calculated because of their high computational complexity.
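The gap between the two methods can be illustrated by counting the vectors each one has to work with. The sketch below is not the paper's complexity expression. It assumes that a first multiplicity vector is a tuple of sign frequencies (h_1, ..., h_N) with h_1 + ... + h_N = n, and that a second multiplicity vector records how many signs share each frequency, so that it corresponds to a partition of the sample size n into at most N parts. The function names and the symbols n (sample size) and N (alphabet cardinality) are chosen here for illustration.

```python
from math import comb


def count_first_multiplicity_vectors(n: int, N: int) -> int:
    """Count nonnegative integer solutions of h_1 + ... + h_N = n.

    Assumed form of the first multiplicity equation: h_i is the
    frequency of the i-th alphabet sign in a sample of size n.
    """
    return comb(n + N - 1, N - 1)


def count_second_multiplicity_vectors(n: int, N: int) -> int:
    """Count partitions of n into at most N parts.

    Assumed correspondence: a second multiplicity vector lists how many
    signs occur with each frequency, which ignores the order of signs.
    Partitions into at most N parts are counted through the equivalent
    problem of partitions whose parts do not exceed N.
    """
    ways = [1] + [0] * n  # ways[m] = partitions of m with the parts allowed so far
    for part in range(1, N + 1):
        for m in range(part, n + 1):
            ways[m] += ways[m - part]
    return ways[n]


if __name__ == "__main__":
    n, N = 5, 3
    print(count_first_multiplicity_vectors(n, N))   # 21
    print(count_second_multiplicity_vectors(n, N))  # 5
```

Even for a sample of size 5 over a 3-sign alphabet, the second count is about four times smaller than the first, and under these assumptions the ratio grows quickly with n and N. This is consistent with the abstract's conclusion that the second multiplicity method covers a larger parameter region for the same computing resource.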
Publisher
Izdatel'skii dom Spektr, LLC