Abstract
Creating a network of autonomous stations for bioindication of the state of water bodies requires methods for analyzing large data arrays. A combination of machine learning and traditional statistical methods was used to identify implicit patterns in a dataset on the effect of heavy metals on natural phytoplankton. The experimental data consist of 465 fluorescence induction curves measured on phytoplankton samples from 9 water bodies in the Pskov region; the curves reflect the dynamics of electron transfer in the photosynthetic apparatus. Each curve is characterized by 14 JIP-test parameters, some of which directly describe the shape of the curve, while the others relate the shape of the curve to the energy fluxes that arise in the photosynthetic apparatus under illumination. Cluster analysis based on the set of JIP-test parameters was used to distinguish levels of photosynthetic activity, first among control phytoplankton samples and then among samples under long-term exposure to cadmium and chromium salts. Among the control samples, two groups differing in photosynthetic activity were identified. It is assumed that the lower photosynthetic activity of some phytoplankton samples is associated with anthropogenic pressure on the water bodies. Samples with initially low photosynthetic activity responded to the toxic effect of heavy metals later in the incubation than the more active samples did. The proposed approach can be easily scaled to large arrays of experimental data, which makes it a promising tool for the early detection of toxic pollution of natural waters.