A proteomics sample metadata representation for multiomics integration and big data analysis
-
Published:2021-10-06
Issue:1
Volume:12
Page:
-
ISSN:2041-1723
-
Container-title:Nature Communications
-
language:en
-
Short-container-title:Nat Commun
Author:
Dai Chengxin, Füllgrabe Anja, Pfeuffer Julianus, Solovyeva Elizaveta M., Deng Jingwen, Moreno Pablo, Kamatchinathan Selvakumar, Kundu Deepti Jaiswal, George Nancy, Fexova Silvie, Grüning Björn, Föll Melanie Christine, Griss Johannes, Vaudel Marc, Audain Enrique, Locard-Paulet Marie, Turewicz Michael, Eisenacher Martin, Uszkoreit Julian, Van Den Bossche Tim, Schwämmle Veit, Webel Henry, Schulze Stefan, Bouyssié David, Jayaram Savita, Duggineni Vinay Kumar, Samaras Patroklos, Wilhelm Mathias, Choi Meena, Wang Mingxun, Kohlbacher Oliver, Brazma Alvis, Papatheodorou Irene, Bandeira Nuno, Deutsch Eric W., Vizcaíno Juan Antonio, Bai Mingze, Sachsenberg Timo, Levitsky Lev I., Perez-Riverol Yasset
Abstract
The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics sample metadata. We implement MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets. We also describe tools and libraries to validate and submit sample metadata-related information to the PRIDE repository. We expect that these developments will improve the reproducibility and facilitate the reanalysis and integration of public proteomics datasets.
Publisher
Springer Science and Business Media LLC
Subject
General Physics and Astronomy; General Biochemistry, Genetics and Molecular Biology; General Chemistry
Reference
36 articles.
1. Deutsch, E. W. et al. The ProteomeXchange consortium in 2020: enabling ‘big data’ approaches in proteomics. Nucleic Acids Res. 48, D1145–D1152 (2020). ProteomeXchange consortium manuscript including the ecosystem to discuss data sharing policies and formats in proteomics.
2. Perez-Riverol, Y. et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 47, D442–D450 (2019). PRIDE database manuscript, which has led the development and integration of MAGE-TAB-Proteomics with other EMBL-EBI resources such as BioSamples and Expression Atlas.
3. Deutsch, E. W. The PeptideAtlas project. Methods Mol. Biol. 604, 285–296 (2010).
4. Choi, M. et al. MassIVE.quant: a community resource of quantitative mass spectrometry-based proteomics datasets. Nat. Methods 17, 981–984 (2020).
5. Watanabe, Y., Yoshizawa, A. C., Ishihama, Y. & Okuda, S. The jPOST repository as a public data repository for shotgun proteomics. Methods Mol. Biol. 2259, 309–322 (2021).
Cited by
54 articles.