Abstract
Non-target analysis combined with liquid chromatography high-resolution mass spectrometry is considered one of the most comprehensive strategies for the detection and identification of known and unknown chemicals in complex samples. However, many compounds remain unidentified due to data complexity and the limited number of structures in chemical databases. In this work, we have developed and validated a novel machine learning algorithm to predict the retention index ($$r_i$$) values for structurally (un)known chemicals based on their measured fragmentation pattern. The developed model, for the first time, enabled the prediction of $$r_i$$ values without the need for the exact structure of the chemicals, with an $$R^2$$ of 0.91 and 0.77 and a root mean squared error (RMSE) of 47 and 67 $$r_i$$ units for the NORMAN ($$n=3131$$) and amide ($$n=604$$) test sets, respectively. This fragment-based model showed comparable accuracy in $$r_i$$ prediction to conventional descriptor-based models that rely on known chemical structure, which obtained an $$R^2$$ of 0.85 with an RMSE of 67.
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences, Computer Graphics and Computer-Aided Design, Physical and Theoretical Chemistry, Computer Science Applications
Cited by
5 articles.