Affiliation:
1. Emperor Alexander I St. Petersburg State Transport University
2. Institute for Problems in Mechanical Engineering of the Russian Academy of Sciences
Abstract
Aim. The paper studies methods for estimating dependability indicators from Big Data collected while monitoring the operation of technical systems and their components. It compares the efficiency of robust statistical methods for estimating the dependability indicators of complex technical systems from operation data.
Methods. The paper uses methods of mathematical statistics, specifically robust methods for estimating the location parameter of a noisy sample, together with numerical methods of statistical modeling. The authors consider five estimators of the location parameter: the sample mean, a nonrobust method included for comparison; the sample median, the simplest robust estimator; a two-stage procedure that discards outliers according to the three-sigma rule; a two-stage procedure that discards outliers using Tukey’s box-and-whisker plot; and Huber’s robust method. The estimators were compared by statistical modeling in the R statistical computing environment. Five distribution laws were used to generate samples of an element’s time to failure and recovery time: exponential, Weibull, log-normal, gamma and uniform.
Results. Statistical analysis of Big Data on the operation of technical systems is complicated by the heterogeneity and noisiness of such data, as well as by errors and outliers of varied nature. This is primarily due to the differing loads and operating conditions of each object. The paper examines this problem in the context of estimating the dependability indicators of a structurally complex monotonic system with independent element recovery.
The paper examines methods for rejecting anomalous data and for robust estimation of the sample location parameter, and compares the efficiency of these methods under various distribution laws. Robust estimation methods are shown to be significantly more accurate than the standard sample mean. The two-stage procedure that discards outliers using Tukey’s box-and-whisker plot proved to be the most efficient.
Conclusions. The paper’s findings make it possible to improve the accuracy with which dependability indicators are estimated from the operation data of complex technical systems. They can be used in Big Data processing and in complex system dependability theory.
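As an illustration of the five estimators listed under Methods, the following is a minimal sketch in Python rather than the authors' R code. Several details are assumptions not stated in the abstract: the second stage of both two-stage procedures is taken to be the mean of the retained points, the box-plot fences use the conventional multiplier 1.5, Huber's estimator uses the common tuning constant 1.345 with a fixed MAD-based scale, and the sample itself is toy data invented for the example.

```python
import statistics


def three_sigma_mean(xs):
    """Two-stage estimate: drop points beyond 3 sample standard deviations, then average."""
    m = statistics.fmean(xs)
    s = statistics.stdev(xs)
    kept = [x for x in xs if abs(x - m) <= 3 * s]
    return statistics.fmean(kept)


def tukey_mean(xs, k=1.5):
    """Two-stage estimate: drop points outside the box-plot fences, then average."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return statistics.fmean([x for x in xs if lo <= x <= hi])


def huber_location(xs, c=1.345, tol=1e-9, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted averaging."""
    mu = statistics.median(xs)
    # Fixed robust scale: median absolute deviation, rescaled for normal data.
    scale = 1.4826 * statistics.median([abs(x - mu) for x in xs])
    if scale == 0:
        return mu
    for _ in range(max_iter):
        # Weight 1 inside the threshold, shrinking as c*scale/|residual| outside it.
        weights = [
            1.0 if abs(x - mu) <= c * scale else c * scale / abs(x - mu)
            for x in xs
        ]
        new_mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Toy recovery times (assumed data): 40 clean readings centred on 100
# plus one gross recording error.
sample = [98, 99, 100, 101, 102] * 8 + [5000]

estimates = {
    "mean": statistics.fmean(sample),
    "median": statistics.median(sample),
    "three_sigma": three_sigma_mean(sample),
    "tukey": tukey_mean(sample),
    "huber": huber_location(sample),
}
for name, value in estimates.items():
    print(f"{name:12s} {value:8.2f}")
```

On this toy sample the single gross error pulls the sample mean far above 100, while the four robust estimates stay close to it. One caveat about the three-sigma rule: when the standard deviation is computed from the contaminated sample itself, a single outlier among n points can have a z-score of at most (n − 1)/√n, so for n ≤ 10 it can never exceed three sigma and will not be rejected.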