Abstract
Retention time (RT) alignment is a crucial step in liquid chromatography-mass spectrometry (LC-MS)-based proteomic and metabolomic experiments, especially in large cohort studies. It can be achieved computationally, most commonly with the warping function method or the direct matching method. However, existing tools can hardly handle monotonic and non-monotonic RT shifts simultaneously. To overcome this, we developed DeepRTAlign, a deep learning-based RT alignment tool for large cohort LC-MS data analysis. It first performs a coarse alignment by calculating the average time shift between any two samples, and then uses RT and intensity as the main features to train its deep learning-based model. Benchmarking against current state-of-the-art approaches on several proteomic and metabolomic datasets, we demonstrate that DeepRTAlign achieves improved performance, especially on complex samples. We also show that DeepRTAlign can improve the identification sensitivity of MS data without compromising quantitative precision, compared to MaxQuant, FragPipe and DIA-NN with match between runs. In a single-cell data-independent acquisition MS dataset, DeepRTAlign aligns on average 298 (42.7%) more peptides per cell than the popular existing tool DIA-NN.
Publisher
Cold Spring Harbor Laboratory