Artificial intelligence-based parametrization of Michaelis–Menten maximal velocity: Toward in silico New Approach Methodologies (NAMs)

Published: 2024-04-25

Author:

Karakoltzidis Achilleas¹, Karakitsios Spyros P.¹, Sarigiannis Dimosthenis A.¹

Affiliation:

1. Aristotle University of Thessaloniki, Department of Chemical Engineering, Environmental Engineering Laboratory, University Campus, Thessaloniki 54124

Abstract

The development of mechanistic systems biology models requires numerous kinetic parameters once the enzymatic mode of action has been identified. Wet-lab experimentation, however, is costly and time-consuming, and it does not adhere to the principle of reducing the number of animal tests. As an alternative, an artificial intelligence-based method is proposed that uses enzyme amino acid structures as input data. This method combines natural language processing (NLP) techniques with molecular fingerprints of the catalyzed reaction to predict Michaelis–Menten maximal velocities (Vmax). The molecular fingerprints employed include RCDK standard fingerprints (1024 bits), MACCS keys (166 bits), PubChem fingerprints (881 bits), and E-State fingerprints (79 bits). These were integrated to produce reaction fingerprints. The data were sourced from SABIO-RK, providing a concrete framework to support training procedures. After the data preprocessing stage, the dataset was randomly split into a training set (70%), a validation set (10%), and a test set (20%), ensuring unique amino acid sequences for each subset. Data points with structures similar to those used to train the model, as well as uncommon reactions, were used to test the model further. The developed models were optimized during training to predict Vmax values efficiently and reliably. By using a fully connected neural network, these models can be applied to all organisms. The amino acid proportions of enzymes were also tested as input, which revealed that amino acid content was an unreliable predictor of Vmax. During testing, the model performed better on known structures than on unseen data. In the given use case, the model trained solely on enzyme representations achieved an R-squared of 0.45 on unseen data and 0.70 on known structures. When enzyme representations were integrated with RCDK fingerprints, the model achieved an R-squared of 0.46 on unseen data and 0.62 on known structures.
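
Three of the steps named in the abstract can be illustrated with a minimal Python sketch: computing amino acid proportions, splitting roughly 70/10/20 so that no amino acid sequence appears in more than one subset, and scoring predictions with R-squared. This is not the authors' pipeline. The function names, the `sequence` record key, and the choice to apply the fractions to unique sequences rather than to individual data points are all assumptions made for illustration.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_proportions(sequence):
    """Return the fraction of each standard residue (hypothetical helper)."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def split_by_sequence(records, fractions=(0.7, 0.1), seed=0):
    """Split records into train/validation/test with disjoint sequences.

    Assumption: the fractions are applied to the unique sequences, so the
    split over individual records is only approximately 70/10/20.
    """
    sequences = sorted({r["sequence"] for r in records})
    random.Random(seed).shuffle(sequences)
    n_train = round(fractions[0] * len(sequences))
    n_val = round(fractions[1] * len(sequences))
    train_ids = set(sequences[:n_train])
    val_ids = set(sequences[n_train:n_train + n_val])
    subsets = {"train": [], "validation": [], "test": []}
    for record in records:
        if record["sequence"] in train_ids:
            subsets["train"].append(record)
        elif record["sequence"] in val_ids:
            subsets["validation"].append(record)
        else:
            # Every remaining sequence falls into the test subset.
            subsets["test"].append(record)
    return subsets


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Grouping by sequence before splitting is what keeps the "unseen data" score meaningful: if the same enzyme appeared in both training and test subsets, the test R-squared would partly measure memorization rather than generalization.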

Publisher

Research Square Platform LLC
