Abstract
A molecule's structure is closely linked to its properties. Graph representations of molecules have become popular because compound structures are non-Euclidean, but they may not carry semantic information as rich as that of molecular sequence representations. This can lead to conflicts between the semantic features of different representations within a neural network. To address this issue, we propose a contrastive learning framework that combines molecular graphs with molecular fingerprints. First, we apply clustering algorithms to group molecules and obtain cluster centers. We then use these cluster centers for contrastive learning, allowing the model to learn molecular structural information from unlabeled data. In addition, we introduce a self-attention mechanism into the graph pooling process to selectively extract graph features. Experimental results show that our model improves ROC-AUC by an average of 2.04% over previous state-of-the-art models on molecular property classification tasks, validating the effectiveness of our computational framework.
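The three ingredients named in the abstract (clustering, contrastive learning against cluster centers, and self-attention pooling) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not confirmed by the abstract: plain k-means as the clustering algorithm, an InfoNCE-style loss against per-cluster mean embeddings, a single scoring vector `w` for the attention readout, and all function names.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns (centers, assignments)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Squared Euclidean distance from every point to every center.
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers, assign

def cluster_prototypes(emb, assign, k):
    """Mean graph embedding per cluster (hypothetical way to form centers)."""
    protos = np.zeros((k, emb.shape[1]))
    for j in range(k):
        members = emb[assign == j]
        if len(members) > 0:
            protos[j] = members.mean(axis=0)
    return protos

def prototype_contrastive_loss(emb, protos, assign, temperature=0.1):
    """InfoNCE-style loss: pull each embedding toward its own cluster
    prototype and push it away from the other prototypes."""
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    protos = protos / (np.linalg.norm(protos, axis=1, keepdims=True) + 1e-12)
    logits = emb @ protos.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(emb)), assign].mean()

def attention_pool(node_feats, w):
    """Attention readout: softmax-weighted sum of node features,
    with scores given by a (normally learned) vector w."""
    scores = node_feats @ w
    scores = scores - scores.max()
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ node_feats
```

In this sketch, cluster assignments would come from `kmeans` run on fingerprint vectors, while `emb` holds graph-level embeddings produced by pooling each molecule's node features with `attention_pool`; in a real model the loss would be minimized with an autodiff framework rather than evaluated in NumPy.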
Publisher
Research Square Platform LLC