Author:
D’Addario Marianna, Kopczynski Dominik, Baumbach Jörg Ingo, Rahmann Sven
Abstract
Background
An ion mobility (IM) spectrometer coupled with a multi-capillary column (MCC) measures volatile organic compounds (VOCs) in the air or in exhaled breath. This technique is utilized in several biotechnological and medical applications. Each peak in an MCC/IM measurement represents a certain compound, which may be known or unknown. For clustering and classification of measurements, the raw data matrix must be reduced to a set of peaks. Each peak is described by its coordinates (retention time in the MCC and reduced inverse ion mobility) and shape (signal intensity, further shape parameters). This fundamental step is referred to as peak extraction. It is the basis for identifying discriminating peaks, and hence putative biomarkers, between two classes of measurements, such as a healthy control group and a group of patients with a confirmed disease. Current state-of-the-art peak extraction methods require human interaction, such as hand-picking approximate peak locations, assisted by a visualization of the data matrix. In a high-throughput context, however, it is preferable to have robust methods for fully automated peak extraction.
Results
We introduce PEAX, a modular framework for automated peak extraction. The framework consists of several steps in a pipeline architecture. Each step performs a specific sub-task and can be instantiated by different methods implemented as modules. We provide open-source software for the framework and several modules for each step. Additionally, an interface that allows easy extension by a new module is provided. Combining the modules in all reasonable ways leads to a large number of peak extraction methods. We evaluate all combinations using intrinsic error measures and by comparing the resulting peak sets with an expert-picked one.
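The pipeline idea described above can be illustrated with a minimal Python sketch. It is not the PEAX implementation: the step names (`preprocess`, `detect`), the module functions, the `Peak` fields, and the threshold are all hypothetical and chosen only to show how interchangeable modules per step multiply into many candidate extraction methods.

```python
from dataclasses import dataclass
from itertools import product
from typing import List

# A measurement is modelled as a plain 2D list of intensities:
# rows index retention time, columns index reduced inverse mobility.
Matrix = List[List[float]]


@dataclass(frozen=True)
class Peak:
    """Hypothetical peak record: matrix coordinates plus signal intensity."""
    retention_index: int
    mobility_index: int
    intensity: float


def identity(matrix: Matrix) -> Matrix:
    """Hypothetical preprocessing module: return an unchanged copy."""
    return [row[:] for row in matrix]


def subtract_minimum(matrix: Matrix) -> Matrix:
    """Hypothetical preprocessing module: shift values so the minimum is 0."""
    lowest = min(min(row) for row in matrix)
    return [[value - lowest for value in row] for row in matrix]


def local_maxima(matrix: Matrix, threshold: float = 1.0) -> List[Peak]:
    """Hypothetical detection module: cells at or above the threshold that
    are strictly greater than their four direct neighbours."""
    rows, cols = len(matrix), len(matrix[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = matrix[r][c]
            if value < threshold:
                continue
            neighbours = [
                matrix[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            if all(value > n for n in neighbours):
                peaks.append(Peak(r, c, value))
    return peaks


def global_maximum(matrix: Matrix) -> List[Peak]:
    """Hypothetical detection module: report only the single largest cell."""
    value, row, col = max(
        (v, r, c) for r, line in enumerate(matrix) for c, v in enumerate(line)
    )
    return [Peak(row, col, value)]


# Ordered pipeline steps; each step maps module names to implementations.
PIPELINE_STEPS = [
    ("preprocess", {"identity": identity, "subtract_minimum": subtract_minimum}),
    ("detect", {"local_maxima": local_maxima, "global_maximum": global_maximum}),
]


def all_combinations():
    """Enumerate every way of picking one module per step."""
    return list(product(*(modules.keys() for _, modules in PIPELINE_STEPS)))


def run_pipeline(matrix: Matrix, combination) -> List[Peak]:
    """Apply the chosen module of each step in order; the last step yields peaks."""
    data = matrix
    for (_, modules), module_name in zip(PIPELINE_STEPS, combination):
        data = modules[module_name](data)
    return data
```

Calling `run_pipeline(matrix, combination)` for each entry of `all_combinations()` yields one peak list per method, which could then be scored against a reference peak set. Adding a module to a step in this sketch only requires a new dictionary entry, which mirrors the extension interface the abstract mentions in spirit only.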
Conclusions
Our software PEAX automatically extracts peaks from MCC/IM measurements within a few seconds. The automatically obtained results are comparable to those of current state-of-the-art peak extraction methods, which require human interaction. This makes high-throughput analysis feasible for MCC/IM applications. Our software is available at http://www.rahmannlab.de/research/ims.
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology
Cited by
15 articles.