Abstract
Accurate specification of the error statistics required for data assimilation remains an ongoing challenge, partly because their estimation is an
underdetermined problem that requires statistical assumptions. Even with the common assumption that background and observation errors are
uncorrelated, the problem remains underdetermined. A natural question follows: can the increasing number of overlapping observations
or other datasets help to reduce the total number of statistical assumptions, or do they introduce more statistical unknowns? To answer
this question, this paper provides a conceptual view of the statistical error estimation problem for multiple collocated datasets, including a
generalized mathematical formulation, an illustrative demonstration with synthetic data, and guidelines for setting up and solving the
problem. It is demonstrated that the required number of statistical assumptions increases linearly with the number of datasets. However, the number
of error statistics that can be estimated increases quadratically, so that, for more than three datasets, an increasing number of error
cross-statistics between datasets can be estimated. The presented generalized estimation of full error covariance and cross-covariance matrices
between datasets does not necessarily accumulate the uncertainties of the assumptions across the error estimates of multiple datasets.
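
The linear and quadratic growth stated above can be illustrated with a simple counting sketch. It is not the paper's formulation but a minimal bookkeeping exercise under stated assumptions: each of the I datasets has one unknown error covariance, each pair of datasets has one unknown error cross-covariance, and the only available information is the set of covariances and cross-covariances of the I - 1 independent pairwise differences between datasets. The function name `count_error_statistics` and the dictionary keys are hypothetical and chosen for illustration only.

```python
def count_error_statistics(n_datasets: int) -> dict:
    """Count error statistics for n collocated datasets (illustrative sketch).

    Assumption: statistics are counted per dataset or per dataset pair,
    so a cross-covariance and its transpose count as one statistic.
    """
    if n_datasets < 2:
        raise ValueError("at least two datasets are needed to form a difference")

    # One error covariance per dataset.
    n_error_cov = n_datasets
    # One error cross-covariance per pair of datasets.
    n_cross_cov = n_datasets * (n_datasets - 1) // 2
    unknowns = n_error_cov + n_cross_cov

    # n - 1 independent differences give (n - 1) covariances plus
    # (n - 1)(n - 2) / 2 cross-covariances, i.e. n(n - 1) / 2 in total.
    estimable = n_datasets * (n_datasets - 1) // 2

    # Whatever cannot be estimated has to be fixed by assumption.
    assumptions = unknowns - estimable

    # If every error covariance is estimated first, the remainder is
    # available for error cross-statistics between datasets.
    cross_estimable = max(estimable - n_error_cov, 0)

    return {
        "unknowns": unknowns,
        "estimable": estimable,
        "assumptions": assumptions,
        "cross_estimable": cross_estimable,
    }


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, count_error_statistics(n))
```

Under these counting assumptions, the number of required assumptions equals the number of datasets, while the number of estimable statistics grows as I(I - 1)/2. For three datasets the estimable statistics are exhausted by the three error covariances, and only from four datasets onward does room remain for error cross-statistics.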