Affiliation:
1. Department of Life Science Informatics, Bonn-Aachen International Center for Information Technology, Rheinische Friedrich-Wilhelms-Universität Bonn, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
2. Lamarr Institute for Machine Learning and Artificial Intelligence, Rheinische Friedrich-Wilhelms-Universität Bonn, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
Abstract
In drug discovery, chemical language models (CLMs) originating from natural language processing offer new opportunities for molecular design. CLMs have been developed using recurrent neural network (RNN) or transformer architectures. For the predictive performance of RNN‐based encoder‐decoder frameworks and transformers, attention mechanisms play a central role. Emerging application areas for CLMs include, among others, constrained generative modeling and the prediction of chemical reactions or drug‐target interactions. Since CLMs are applicable to any compound or target data that can be presented in a sequential format and tokenized, mappings between different types of sequences can be learned. For example, active compounds can be predicted from protein sequence motifs. Novel off‐the‐beaten‐path applications can also be considered. For example, analogue series from medicinal chemistry can be perceived and represented as chemical sequences and extended with new compounds using CLMs. Herein, methodological features of CLMs and different applications are discussed.
Subject
Organic Chemistry, Computer Science Applications, Drug Discovery, Molecular Medicine, Structural Biology
Cited by
4 articles.