Abstract
Nontargeted LC-MS metabolomics datasets contain a wealth of information but present many challenges during analysis and processing. Often, more than two independently processed datasets must be aligned, but no software natively allows for this. To align two or more processed nontargeted datasets, we have created an open-source Python package called Eclipse. Eclipse uses a novel subalignment approach to model the whole alignment and has built-in graph aggregation options for reporting tabular data. Each subalignment independently transforms and scales feature descriptors (retention time, mass-to-charge ratio, average feature intensity) and scores feature matches in a data-driven manner. Because subalignments run independently, they can be run in parallel or incrementally over time to construct large networks. Eclipse is fast (two datasets in 7 seconds, nine datasets in 39 seconds), workflow-agnostic, and customizable, even for use outside of LC-MS datasets should the need arise. Eclipse is open source and available as part of our broader set of processing tools, BMXP (https://github.com/broadinstitute/bmxp). Eclipse can be installed via the pip command “pip install bmxp”.
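To make the idea of a single pairwise subalignment concrete, the following is a minimal sketch using only NumPy. It is not Eclipse's implementation and does not use the bmxp API: the function name `subalign`, the window parameters `rt_win` and `ppm_win`, and the median/MAD scaling are all assumptions chosen for illustration. It computes descriptor differences between two feature tables, centers and scales them using the candidate matches themselves, and keeps mutual best matches.

```python
import numpy as np


def subalign(a, b, rt_win=0.5, ppm_win=20.0):
    """Hypothetical pairwise matcher; NOT the Eclipse/bmxp API.

    a, b: float arrays of shape (n, 3) with columns
    retention time, m/z, average intensity.
    Returns a list of (index_in_a, index_in_b) mutual best matches.
    """
    # Pairwise descriptor differences, each of shape (len(a), len(b)).
    d_rt = b[None, :, 0] - a[:, None, 0]
    d_ppm = (b[None, :, 1] - a[:, None, 1]) / a[:, None, 1] * 1e6
    d_int = np.log10(b[None, :, 2]) - np.log10(a[:, None, 2])

    # Coarse candidate window on retention time and m/z (assumed cutoffs).
    cand = (np.abs(d_rt) <= rt_win) & (np.abs(d_ppm) <= ppm_win)
    if not cand.any():
        return []

    # Data-driven centering and scaling: median and MAD of the
    # candidate differences, one descriptor at a time.
    score = np.zeros_like(d_rt)
    for d in (d_rt, d_ppm, d_int):
        vals = d[cand]
        center = np.median(vals)
        scale = 1.4826 * np.median(np.abs(vals - center))
        if scale <= 1e-12:  # degenerate spread: fall back to unit scale
            scale = 1.0
        score += ((d - center) / scale) ** 2
    score[~cand] = np.inf  # non-candidates can never match

    # Keep only pairs that are each other's best-scoring partner.
    best_b = score.argmin(axis=1)
    best_a = score.argmin(axis=0)
    return [
        (i, int(j))
        for i, j in enumerate(best_b)
        if np.isfinite(score[i, j]) and best_a[j] == i
    ]
```

Because each call depends only on its own two tables, calls for different dataset pairs could be dispatched in parallel, and the resulting match lists could then be treated as edges of a graph for aggregation. That downstream step is omitted here.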
Publisher
Cold Spring Harbor Laboratory