Can machine learning extract the mechanisms controlling phytoplankton growth from large-scale observations? – A proof-of-concept study
Published: 2021-03-19
Issue: 6
Volume: 18
Page: 1941-1970
ISSN: 1726-4189
Container-title: Biogeosciences
Language: en
Short-container-title: Biogeosciences
Author:
Holder Christopher, Gnanadesikan Anand
Abstract
A key challenge for biological oceanography is relating the physiological
mechanisms controlling phytoplankton growth to the spatial distribution of
those phytoplankton. Physiological mechanisms are often isolated by varying
one driver of growth, such as nutrient or light, in a controlled laboratory
setting, producing what we call “intrinsic relationships”. We contrast
these with the “apparent relationships” which emerge in the environment in
climatological data. Although previous studies have shown that machine learning
(ML) can find apparent relationships, there has yet to be a systematic study
examining when and why these apparent relationships diverge from the
underlying intrinsic relationships found in the lab, and how and why this
divergence may depend on the method applied. Here we conduct a proof-of-concept study
with three scenarios in which biomass is by construction a function of
time-averaged phytoplankton growth rate. In the first scenario, the inputs
and outputs of the intrinsic and apparent relationships vary over the
same monthly timescales. In the second, the intrinsic relationships relate
averages of drivers that vary on hourly timescales to biomass, but the
apparent relationships are sought between monthly averages of these inputs
and monthly-averaged output. In the third scenario we apply ML to the output
of an actual Earth system model (ESM). Our results demonstrated that when
intrinsic and apparent relationships operated on the same spatial and
temporal scales, neural network ensembles (NNEs) were able to extract the
intrinsic relationships when only provided information about the apparent
relationships, while colimitation and the inability of random forests (RFs)
to extrapolate caused them to diverge from the true response. When
intrinsic and apparent relationships operated on different timescales (as
little separation as hourly versus daily), NNEs fed with apparent
relationships in time-averaged data produced responses with the right shape
but underestimated the biomass. This was because when the intrinsic
relationship was nonlinear, the response to a time-averaged input differed
systematically from the time-averaged response. Although the limitations
found by NNEs were overestimated, they were able to produce more realistic
shapes of the actual relationships compared to multiple linear regression.
Additionally, NNEs were able to model the interactions between predictors
and their effects on biomass, allowing for a qualitative assessment of the
colimitation patterns and the nutrient causing the most limitation. Future
research may be able to use this type of analysis for observational datasets
and other ESMs to identify apparent relationships between biogeochemical
variables (rather than spatiotemporal distributions only) and identify
interactions and colimitations without having to perform growth experiments
in a lab, or at least while performing fewer of them. From our study, it appears
that ML can extract useful information from ESM output and could likely do
so for observational datasets as well.
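
The time-averaging effect described above (for a nonlinear intrinsic relationship, the response to a time-averaged input differs from the time-averaged response) can be illustrated with a small sketch. Everything specific here is an assumption chosen for illustration and is not taken from the study: the Michaelis-Menten-style function `saturating_response`, its `half_saturation` value, and the half-sine light cycle produced by `hourly_light` are all hypothetical.

```python
import math


def saturating_response(light, half_saturation=50.0):
    """Hypothetical intrinsic relationship: a saturating (concave) curve in [0, 1)."""
    return light / (half_saturation + light)


def hourly_light(peak=400.0, hours=24):
    """Hypothetical diurnal cycle: half-sine between hour 6 and hour 18, zero at night."""
    return [max(0.0, peak * math.sin(math.pi * (h - 6) / 12)) for h in range(hours)]


light = hourly_light()

# Time-averaged input, as it would appear in a daily or monthly mean product.
mean_light = sum(light) / len(light)

# Time-averaged response: apply the intrinsic curve every hour, then average.
mean_of_response = sum(saturating_response(x) for x in light) / len(light)

# Response to the time-averaged input: apply the intrinsic curve to the mean.
response_of_mean = saturating_response(mean_light)

print(f"mean light:             {mean_light:.1f}")
print(f"mean of hourly response: {mean_of_response:.3f}")
print(f"response to mean light:  {response_of_mean:.3f}")
```

Because the assumed curve is concave, the response to the mean input is larger than the mean of the hourly responses. A model trained only on time-averaged inputs and outputs would therefore learn a curve lying below the intrinsic one, which is consistent with the underestimate reported in the abstract.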
Funder
Division of Ocean Sciences, Division of Graduate Education
Publisher
Copernicus GmbH
Subject
Earth-Surface Processes; Ecology, Evolution, Behavior and Systematics
Cited by
7 articles.