MOLECULES GENERATION TO INHIBIT COVID-19 DISEASE USING ENCODER-DECODER LSTM ARCHITECTURE AND PCA PROPERTIES

Published:2022-09-13 Issue:01 Volume:23
ISSN:0219-5194
Container-title:Journal of Mechanics in Medicine and Biology
language:en
Short-container-title:J. Mech. Med. Biol.

Author:

AMRANE MERIEM¹, OUKID SALYHA¹, ENSARI TOLGA²

Affiliation:

1. LRDSI Laboratory, Computer Science Department, Universite Saad Dahlab Blida, Blida, Algeria

2. Department of Computer and Information Sciences, Arkansas Tech University, Russellville, AR, USA

Abstract

COVID-19 has become the world’s worst pandemic and had claimed over six million lives as of March 2022. The virus now ranks alongside cancer as one of the most common causes of death. There is still no definitive or unique treatment for COVID-19 beyond a select few drugs approved by the Food and Drug Administration (FDA). While Artificial Intelligence (AI) can be used to generate molecules that target Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the virus responsible for COVID-19, such molecules are novel and are not yet available on the market. With the emergence and availability of several drug datasets related to COVID-19 (tests, images, graphs, and ChEMBL data), recent works have employed Deep Learning (DL) techniques to generate molecules and to check the effectiveness of existing molecules against COVID-19. In our study, we investigated the benefits of an Encoder–Decoder (ED) architecture based on Long Short-Term Memory (LSTM) cells. The molecules were converted into a vector during the encoding phase, and this vector was then decoded back into SMILES strings. We propose an approach that incorporates four Principal Component Analysis (PCA) features into the Encoder–Decoder Long Short-Term Memory (ED-LSTM) model for regularization, which means that, instead of avoiding linear mapping, we assumed that the data could be linearly separable. We concluded that ED-LSTM with a unit norm constraint has the best reconstruction accuracy in the context of generating molecules. The resulting dataset was used, with the aid of virtual screening and convolutional neural networks, to identify the drugs that have the best binding affinity with SARS-CoV-2. We achieved an accuracy of 87.35% on the test set.
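
The encode-then-decode round trip and the unit norm constraint mentioned in the abstract can be illustrated at the data-representation level. The following is a minimal sketch only, not the authors' model: the LSTM encoder and decoder are omitted, the three SMILES strings are toy inputs, and the character-level tokenization, the padding symbol, and all function names are assumptions made for illustration. Character-level splitting would also break multi-character atoms such as Cl or Br, which a real pipeline would need to handle.

```python
import math

# Hypothetical toy inputs; the paper's dataset is not reproduced here.
SMILES = ["CCO", "c1ccccc1", "CC(=O)O"]
PAD = " "  # assumed padding symbol, not taken from the source


def build_vocab(smiles_list):
    """Map every character seen in the inputs (plus padding) to an index."""
    chars = sorted({ch for s in smiles_list for ch in s} | {PAD})
    return {ch: i for i, ch in enumerate(chars)}


def one_hot_encode(smiles, vocab, max_len):
    """Return a max_len x len(vocab) one-hot matrix, right-padded with PAD."""
    padded = smiles.ljust(max_len, PAD)
    rows = []
    for ch in padded:
        row = [0.0] * len(vocab)
        row[vocab[ch]] = 1.0
        rows.append(row)
    return rows


def one_hot_decode(matrix, vocab):
    """Take the argmax of each row and strip trailing padding."""
    inverse = {i: ch for ch, i in vocab.items()}
    chars = [inverse[max(range(len(row)), key=row.__getitem__)] for row in matrix]
    return "".join(chars).rstrip(PAD)


def unit_norm(vector, eps=1e-12):
    """Rescale a vector to Euclidean length 1, as a unit norm constraint would."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


vocab = build_vocab(SMILES)
max_len = max(len(s) for s in SMILES)

# Encoding produces the sequence representation an encoder would consume;
# decoding recovers the SMILES string from a (here, perfect) reconstruction.
encoded = [one_hot_encode(s, vocab, max_len) for s in SMILES]
decoded = [one_hot_decode(m, vocab) for m in encoded]

# A stand-in latent vector projected onto the unit sphere.
latent = unit_norm([3.0, 4.0])
```

In a trained model the decoder output would be a probability distribution per position rather than an exact one-hot row; the argmax step in `one_hot_decode` applies unchanged in that case, and reconstruction accuracy would be measured by comparing `decoded` against the inputs.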

Publisher

World Scientific Pub Co Pte Ltd

Subject

Biomedical Engineering

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0219519422500725
