Abstract
Recent advancements in denoising diffusion models have revolutionized image, text, and video generation. Inspired by these achievements, researchers have extended denoising diffusion models to the field of molecule generation. However, existing molecular generation diffusion models are not fully optimized according to the distinct features of molecules, leading to suboptimal performance and challenges in conditional molecular optimization. In this paper, we introduce the MG-DIFF model, a novel approach tailored for molecular generation and optimization. Compared to previous methods, MG-DIFF incorporates three key improvements. Firstly, we propose a mask and replace discrete diffusion strategy, specifically designed to accommodate the complex patterns of molecular structures, thereby enhancing the quality of molecular generation. Secondly, we introduce a graph transformer model with random node initialization, which can overcome the expressiveness limitations of regular graph neural networks defined by the first-order Weisfeiler-Lehman test. Lastly, we present a graph padding strategy that enables our method to not only do conditional generation but also optimize molecules by adding certain atomic groups. In several molecular generation benchmarks, the proposed MG-DIFF model achieves state-of-the-art performance and demonstrates great potential molecular optimization.