Explainable machine learning for the identification of proteome states via the data processing kitchen sink

Published: 2023-08-31

Author:

Scott Aaron M., Hartman Erik, Malmström Johan, Malmström Lars

Abstract

The application of machine learning algorithms to facilitate the understanding of changes in proteome states has emerged as a promising methodology in proteomics research. Unfortunately, these methods can prove difficult to interpret, as it may not be immediately obvious how models reach their predictions. We present the data processing kitchen sink (DPKS), which provides reproducible access to classic statistical methods and advanced explainable machine learning algorithms to build highly accurate and fully interpretable predictive models. In DPKS, explainable machine learning methods are used to calculate the importance of each protein towards the prediction of a model for a particular proteome state. The calculated importance of each protein can enable the identification of proteins that drive phenotypic change in a data-driven manner, whereas classic techniques rely on arbitrary cutoffs that may exclude important features from consideration. DPKS is a free and open source Python package available at https://github.com/InfectionMedicineProteomics/DPKS.
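
The idea of scoring each protein by its contribution to a model's prediction can be illustrated with a small sketch. The code below does not use DPKS or reproduce its API. It swaps in a generic technique, permutation importance, applied to a toy nearest-centroid classifier on synthetic data. All function names, the data generator, and the choice of classifier are hypothetical and exist only for illustration; only the Python standard library is assumed.

```python
import random


def make_data(n_samples=200, n_proteins=5, seed=0):
    """Synthetic abundances (assumption): only protein 0 differs between states."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # two hypothetical proteome states, 0 and 1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_proteins)]
        row[0] += 3.0 * label  # protein 0 drives the state change
        X.append(row)
        y.append(label)
    return X, y


def fit_centroids(X, y):
    """Toy nearest-centroid classifier: one mean abundance vector per state."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, row):
    """Return the state whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)


def permutation_importance(centroids, X, y, n_repeats=10, seed=1):
    """Mean drop in accuracy when one protein's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between protein j and the state
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(centroids, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# For brevity the model is scored on its own training data;
# a real analysis would use held-out samples.
X, y = make_data()
model = fit_centroids(X, y)
importances = permutation_importance(model, X, y)

# Rank proteins from most to least important, with no significance cutoff.
ranking = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)
for j in ranking:
    print(f"protein_{j}: importance = {importances[j]:.3f}")
```

In this sketch the protein that was constructed to separate the two states should receive by far the largest importance, while the noise proteins score near zero. The ranking is continuous, so no fixed threshold decides which proteins are kept, which is the contrast with cutoff-based techniques that the abstract draws.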

Publisher

Cold Spring Harbor Laboratory

Cited by 2 articles.

1. Peptide clustering enhances large-scale analyses and reveals proteolytic signatures in mass spectrometry data;Nature Communications;2024-08-20

2. Inferring the composition of the blood plasma proteome by a human proteome distribution atlas;2024-05-10