Affiliation:
1. College of Computer Science, Chongqing University, Chongqing, China
2. Central China Branch of State Grid Corporation of China, Wuhan, China
Abstract
Molecular graph representation learning is widely applied in domains such as drug design. It uses deep learning techniques to transform molecular graphs into numerical vectors, and the Graph Transformer architecture is commonly used for this purpose. Nevertheless, existing Graph Transformer-based methods fail to fully exploit the topological structural information of molecular graphs, which leads to information loss in the molecular representation. To solve this problem, we propose a novel molecular graph representation learning method called MTS-Net (Molecular Topological Structure-Network), which combines the global and local topological structure of a molecule. For the global topological representation, the molecular graph is first transformed into a tree structure and then encoded with a tree hashing algorithm. For the local topological representation, paths between atom pairs are transcoded and incorporated into the calculation of the Transformer attention coefficients. Moreover, MTS-Net offers intuitive interpretability for identifying key structures within molecules. Experiments on eight molecular property prediction datasets show that MTS-Net achieves the best results on three of the five classification tasks, with an average accuracy of 0.85, and on all three regression tasks.
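The global step described in the abstract (graph to tree, then a hash over the tree) can be illustrated with a minimal sketch. This is not the MTS-Net algorithm, which the abstract does not specify; it assumes a breadth-first spanning tree (which simply drops ring-closing bonds) and a generic Merkle-style hash over sorted subtree hashes. The helper names `bfs_tree` and `tree_hash` and the toy molecules are hypothetical.

```python
import hashlib
from collections import deque


def bfs_tree(adjacency, root):
    """Return {node: [children]} for a BFS spanning tree rooted at `root`.

    Assumption: ring-closing edges are dropped, which is one simple
    (lossy) way to turn a molecular graph into a tree.
    """
    children = {root: []}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adjacency[node]):
            if neighbor not in children:
                children[neighbor] = []
                children[node].append(neighbor)
                queue.append(neighbor)
    return children


def tree_hash(children, labels, node):
    """Merkle-style hash: node label combined with sorted subtree hashes.

    Sorting the child hashes makes the result independent of child order.
    """
    child_hashes = sorted(tree_hash(children, labels, c) for c in children[node])
    payload = labels[node] + "(" + ",".join(child_hashes) + ")"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Toy example, heavy atoms only: ethanol as C0 - C1 - O2, rooted at C0.
labels = {0: "C", 1: "C", 2: "O"}
adjacency = {0: [1], 1: [0, 2], 2: [1]}
ethanol = tree_hash(bfs_tree(adjacency, 0), labels, 0)

# Same molecule with different atom indices (O0 - C1 - C2), rooted at the
# terminal carbon, which now has index 2.
labels_b = {0: "O", 1: "C", 2: "C"}
adjacency_b = {0: [1], 1: [0, 2], 2: [1]}
ethanol_relabelled = tree_hash(bfs_tree(adjacency_b, 2), labels_b, 2)

# Dimethyl ether (C0 - O1 - C2) has the same atoms but a different topology.
labels_c = {0: "C", 1: "O", 2: "C"}
adjacency_c = {0: [1], 1: [0, 2], 2: [1]}
ether = tree_hash(bfs_tree(adjacency_c, 0), labels_c, 0)
```

In this sketch the hash depends on the chosen root atom, so a practical encoder would need a canonical root or would combine hashes over several roots; how MTS-Net handles this is not stated in the abstract.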