Author:
Kumar Nishith, Hoque Md. Aminul, Sugimoto Masahiro
Abstract
Mass spectrometry is a modern, sophisticated high-throughput analytical technique that enables large-scale metabolomic analyses. It yields a large, high-dimensional data matrix (samples × metabolites) of quantified values that often contains missing cells as well as outliers, which arise from several sources, both technical and biological. Although several missing data imputation techniques are described in the literature, the conventional techniques address only the missing values and do not handle outliers, so outliers in the dataset decrease the accuracy of the imputation. We developed a new kernel weight function-based missing data imputation technique that addresses both missing values and outliers. We evaluated the proposed method against conventional and recently developed imputation techniques using both artificially generated data and experimentally measured data, in both the absence and presence of outliers at different rates. Results on both the artificial data and the real metabolomics data indicate that the proposed kernel weight-based imputation technique outperforms the existing alternatives. For user convenience, an R package implementing the proposed technique is available at https://github.com/NishithPaul/tWLSA.
Publisher
Springer Science and Business Media LLC
Cited by
8 articles.