Abstract
Biological language modeling has significantly advanced the prediction of membrane penetration for small molecule drugs and natural peptides. However, accurately predicting membrane diffusion for peptides with pharmacologically relevant modifications remains a substantial challenge. Here, we introduce PeptideCLM, a peptide-focused chemical language model capable of encoding peptides with chemical modifications, unnatural or non-canonical amino acids, and cyclizations. We assess this model by predicting membrane diffusion of cyclic peptides, demonstrating greater predictive power than existing chemical language models. Our model is versatile and can be extended beyond membrane diffusion prediction to other target values. Its advantages include the ability to model macromolecules using chemical string notation, a largely unexplored domain, and a simple, flexible architecture that allows for adaptation to any peptide or other macromolecule dataset.
Publisher
Cold Spring Harbor Laboratory